Chunk Relevance measures whether a given chunk of text contains information that could help answer the user’s query in a RAG pipeline.
A chunk is considered Relevant if any of the following is true:
- The chunk contains at least some information that is useful for answering the query, even partially
- The chunk provides a key piece of information (such as the name of an entity or a specific fact) that can be used to find the answer in another chunk
- Any part of the chunk helps answer the query, either partially or completely
A chunk is considered Not Relevant if:
- The chunk contains no useful information for answering the query
- The chunk is completely off-topic, or only contains background/tangential information with no bearing on the query
- The content is topically related but does not answer the question even partially
We do not require the chunk to fully answer the query; partial relevance is sufficient. We do not penalize incompleteness: as long as something in the chunk is relevant, we label the chunk Relevant.
Calculation method
Chunk Relevance is computed through a multi-step process:
Model Request
For each retrieved chunk, an additional evaluation request is sent to an LLM to analyze that chunk's relevance to the user query.
Independent Evaluation
Each chunk is evaluated independently against the query to determine its relevance, without using outside knowledge or making assumptions.
Binary Classification
Each chunk receives a binary classification: Relevant (true) if it provides any useful information, or Not Relevant (false) if it contains no useful information for the query.
This metric is computed by prompting an LLM, so it requires additional LLM calls, which may impact usage and billing.
Understanding chunk relevance
Example Scenario
User query: “What is the population of the capital city of France?”
- Chunk 1: “The capital city of France is Paris.”
- Chunk 2: “Paris has a population of approximately 2.1 million people.”
- Chunk 3: “France is a country located in Western Europe.”
Relevance analysis: Chunks 1 and 2 are Relevant because they provide information directly related to answering the query. Chunk 1 provides crucial context (the capital is Paris) that enables the answer to be found, and Chunk 2 provides the actual answer. Chunk 3 is Not Relevant because it only provides general background information about France’s location, which doesn’t help answer the population question.
Optimizing your RAG pipeline
Addressing Low Relevance Scores
Improve retrieval quality: Refine embedding models, similarity search algorithms, or retrieval parameters to better match queries with relevant content.
Optimize chunking strategy: Ensure chunks are semantically coherent and contain complete information units rather than arbitrary text splits.
Adjust retrieval parameters: Experiment with different Top K values, similarity thresholds, or reranking strategies to improve relevance (see the configuration sketch below).
Analyze patterns: Identify common characteristics of non-relevant chunks to understand why the retrieval system is selecting them.
Best practices
Combine with Other Metrics
Chunk Relevance works alongside Context Precision and Precision @ K for a comprehensive view of retrieval effectiveness.
Optimize Retrieval Strategy
Relevance scores help refine embedding models, similarity search algorithms, and retrieval parameters.
Monitor Across Queries
Tracking relevance rates across different query types helps identify patterns and improve retrieval system performance.
Chunk Relevance is designed to be lenient: we mark a chunk as Relevant if it provides any useful information, even if incomplete. This ensures that chunks with partial answers or bridging information are not incorrectly marked as irrelevant.