Building RAG Update: Hybrid Search, Reranking & Production Hardening
What changed after running a production RAG system for four months: hybrid search, reranking, and hardening lessons for Java and Kubernetes stacks.…
Why Kubernetes for LLM workloads: GPU scheduling, autoscaling, and serving models like Gemma in a production-grade Java RAG system. Part 3 of the series.…
Architecture deep dive of a production RAG system in Java 25 and Spring Boot WebFlux: service boundaries, retriever design, and tradeoffs explained.…
The real production challenges of RAG systems: latency, reliability, cost, quality, and observability. Part 1 of building production-grade RAG in Java.…
Hands-on intro to DJL (Deep Java Library): building a speech recognition app with an engine-agnostic deep learning framework for Java developers.…
How Pixie brings instant, eBPF-powered observability to Kubernetes: debug services, spot bottlenecks, and profile apps without changing code.…
What CRaC (Coordinated Restore at Checkpoint) means for Java: instant JVM startup, Azul Zulu support, and Spring 6.1 integration explained simply.…
A software engineer's home office setup in Stockholm: desk, monitor, keyboard, audio, and lighting choices explained after months of remote work.…