Pedro Bertoluchi

What keeps a RAG alive after six months

A field note on golden sets, observability, and the parts of retrieval that are easy to overlook until they break the team's trust.


Most RAG proofs of concept work. The interesting question is what happens at month six, when the source content has shifted, the model has changed, and the team has moved on.

Three things keep it alive: a versioned golden set of fifty to one hundred questions with reference answers, run on every deploy; observability that correlates each question with its retrieved context and final answer; and a short path to rebuild embeddings when the frontier model changes.
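
A minimal sketch of that golden-set gate, assuming a JSON Lines file of `{"question", "reference"}` cases and a hypothetical `answer()` callable wrapping the pipeline; the file name, the token-overlap metric and the 0.8 threshold are all illustrative, not the author's setup:

```python
import json

GOLDEN_PATH = "golden_set_v3.jsonl"  # versioned alongside the code
THRESHOLD = 0.8  # minimum token-overlap F1; tune per domain

def token_f1(predicted: str, reference: str) -> float:
    """Crude token-overlap F1 between predicted and reference answers."""
    pred, ref = predicted.lower().split(), reference.lower().split()
    if not pred or not ref:
        return 0.0
    common = len(set(pred) & set(ref))
    if common == 0:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

def run_golden_set(answer, path=GOLDEN_PATH):
    """Run every golden case; return (question, score) pairs below THRESHOLD."""
    failures = []
    with open(path) as f:
        for line in f:
            case = json.loads(line)
            score = token_f1(answer(case["question"]), case["reference"])
            if score < THRESHOLD:
                failures.append((case["question"], score))
    return failures
```

In CI the deploy blocks when `run_golden_set` returns a non-empty list; in practice an LLM-as-judge score often replaces the token overlap, but the gate shape is the same.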

Treat the indexer, ranker and generator as separate units with their own metrics. When quality drops, the diagnostic checklist is short, because weak retrieval and brittle prompting fail differently: a retrieval failure leaves the relevant passage out of the context, while a prompting failure produces a wrong answer even when the right context is present.
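
The split can be sketched as two stage-level rates over logged traces, assuming each trace records whether the gold passage was retrieved and whether the final answer was judged correct; the field names here are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Trace:
    question: str
    gold_passage_retrieved: bool  # from the retriever's own eval
    answer_correct: bool          # from the golden-set judge

def diagnose(traces):
    """Split failures by stage: retrieval misses vs generation misses."""
    retrieval_misses = [t for t in traces if not t.gold_passage_retrieved]
    generation_misses = [t for t in traces
                         if t.gold_passage_retrieved and not t.answer_correct]
    return {
        "retrieval_miss_rate": len(retrieval_misses) / len(traces),
        "generation_miss_rate": len(generation_misses) / len(traces),
    }
```

A rising `retrieval_miss_rate` points at the indexer or ranker; a rising `generation_miss_rate` with stable retrieval points at the prompt or the model.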

Cost is a hard constraint: per-user tracking, daily anomaly alerts, per-team dashboards. Without them, RAG becomes a quiet hole in the monthly bill.
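
A sketch of per-user cost tracking with a simple daily anomaly check; the token prices and the 3x-median threshold are placeholder assumptions, not real Azure OpenAI rates:

```python
from collections import defaultdict
from statistics import median

PRICE_PER_1K_INPUT = 0.005   # assumed rate, USD per 1k input tokens
PRICE_PER_1K_OUTPUT = 0.015  # assumed rate, USD per 1k output tokens

class CostLedger:
    def __init__(self):
        self.daily = defaultdict(float)  # (user, day) -> USD spent

    def record(self, user: str, day: str, in_tokens: int, out_tokens: int):
        """Accumulate the cost of one call into the user's daily bucket."""
        cost = (in_tokens / 1000) * PRICE_PER_1K_INPUT \
             + (out_tokens / 1000) * PRICE_PER_1K_OUTPUT
        self.daily[(user, day)] += cost

    def anomalies(self, day: str, factor: float = 3.0):
        """Users whose spend on `day` exceeds factor x the median spend."""
        today = {u: c for (u, d), c in self.daily.items() if d == day}
        if not today:
            return []
        m = median(today.values())
        return [u for u, c in today.items() if m > 0 and c > factor * m]
```

The same ledger, grouped by team instead of user, feeds the per-team dashboards; the alert job runs `anomalies()` once a day and pages on a non-empty result.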

Tags

  • #rag
  • #applied-ai
  • #azure-openai
