Laytoun' thoughts!

Building Production-Grade RAG Systems: Kubernetes, Autoscaling & LLMs

We finally got the first drop of snow this week in Stockholm. The weather is getting colder and the days shorter. That only motivates me to continue writing my third and final post in the RAG series. In part one, we explored the production challenges of RAG systems. In part two, we

Building Production-Grade RAG Systems: Architecture Deep Dive

In the first part, we explored the production challenges of RAG systems: latency, reliability, cost, quality, and observability. Now let's get our hands dirty with the actual architecture and implementation. The codebase uses Java 25, Spring Boot 3.5.7, reactive programming with WebFlux, and follows production patterns you'd see

Building Production-Grade RAG Systems: Understanding the Problem Space

I've been quiet on this blog for a while now. Truth is, I lost my appetite for writing these past months. Between traveling to conferences, delivering talks, and shipping some cool features at work, the keyboard just didn't feel the same. There was also this nagging voice in my head:

A look into Deep Java Library!

When you think about building machine learning apps, Java is not the first language that comes to mind, probably not even in the top 3 or 5! But Java has proved time and again that it is capable of modernising itself, and even if it's not the first choice for

What the CRaC ?!

If you've been following the news lately in the Java ecosystem (aside from Java's 28th anniversary), you should've heard of CRaC. Two big announcements were made this week: * Azul announced earlier this week the general availability of and commercial support for Azul Zulu Builds of OpenJDK for Java 17 including

Laytoun' thoughts! © 2026