Chunk Relevance
Understand how to measure and optimize the relevance of retrieved chunks to user queries in your RAG pipeline
Chunk Relevance measures the proportion of text in each retrieved chunk that contains information useful for addressing the user’s query in a RAG pipeline.
Chunk Relevance is a continuous metric ranging from 0 to 1:
- 0 (Low Relevance): None of the chunk's content is relevant to the query
- 0.5 (Mid Relevance): Only half of the chunk's content is relevant to the query
- 1 (High Relevance): The entire chunk is useful for answering the query
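To make the scale concrete, here is a minimal sketch of how such a proportion could be computed from token-level relevance labels. Galileo's evaluation model produces these judgments for you, so the function and labels below are purely illustrative assumptions.

```python
# Illustrative sketch only: Chunk Relevance is the fraction of a chunk's
# text that is useful for the query. The labels below are hypothetical;
# in practice Galileo's evaluation model produces these judgments.

def chunk_relevance(token_is_relevant: list[bool]) -> float:
    """Return the proportion of tokens in a chunk judged relevant (0 to 1)."""
    if not token_is_relevant:
        return 0.0
    return sum(token_is_relevant) / len(token_is_relevant)

# A 10-token chunk where only the first 5 tokens address the query -> 0.5
print(chunk_relevance([True] * 5 + [False] * 5))   # 0.5  (Mid Relevance)
print(chunk_relevance([False] * 10))               # 0.0  (Low Relevance)
print(chunk_relevance([True] * 10))                # 1.0  (High Relevance)
```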
A chunk with low relevance contains “unnecessary” text that is not pertinent to the user’s query, indicating potential inefficiencies in your retrieval strategy.
Calculation Method
Chunk Relevance is computed using a fine-tuned in-house Galileo evaluation model:
Model Type
A transformer-based encoder model trained to identify relevant information in the provided query, context, and response.
Unified Processing
The same model computes multiple metrics (Chunk Adherence, Chunk Completeness, Chunk Attribution, and Chunk Utilization) in a single inference call, optimizing performance and resource usage.
Token Analysis
The model processes each token in the context and outputs a relevance probability: the likelihood that the token is useful for answering the query.
Training Data
The model is trained on carefully curated RAG datasets and optimized to align closely with established RAG Plus metrics for accurate evaluation.
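As an illustration of the token analysis step, the sketch below rolls per-token relevance probabilities up into a chunk-level score. The probabilities and the simple averaging shown here are assumptions for illustration, not the model's exact aggregation.

```python
# Hypothetical sketch: roll per-token relevance probabilities up into a
# chunk-level score. Galileo's model handles this internally; the averaging
# and example probabilities below are assumptions for illustration.

def aggregate_token_probabilities(token_probs: list[float]) -> float:
    """Average per-token relevance probabilities into a chunk-level score."""
    return sum(token_probs) / len(token_probs) if token_probs else 0.0

# Per-token relevance probabilities for a 6-token chunk (hypothetical output):
# the first half of the chunk addresses the query, the second half does not.
token_probs = [0.95, 0.91, 0.88, 0.12, 0.08, 0.05]
print(round(aggregate_token_probabilities(token_probs), 2))  # 0.5
```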
Explainability
The model identifies which parts of the chunks were relevant to the query:
- These sections can be highlighted by clicking the icon next to the Chunk Relevance metric value in your Retriever nodes
- This visualization helps you understand exactly which portions of your chunks are relevant to the query and which are extraneous
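As a rough illustration of how token-level output can drive this kind of highlighting, the sketch below groups consecutive relevant tokens into spans. The tokenization, probabilities, and 0.5 cutoff are all assumptions, not the console's actual mechanism.

```python
# Hypothetical sketch of turning token-level relevance into highlightable
# spans. Token offsets and the 0.5 cutoff are assumptions for illustration.

def relevant_spans(tokens: list[str], probs: list[float], cutoff: float = 0.5):
    """Group consecutive relevant tokens into (start, end) index spans."""
    spans, start = [], None
    for i, p in enumerate(probs):
        if p >= cutoff and start is None:
            start = i                      # open a new relevant span
        elif p < cutoff and start is not None:
            spans.append((start, i))       # close the span before this token
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans

tokens = ["The", "refund", "window", "is", "30", "days", ".",
          "Our", "office", "dog", "is", "Rex", "."]
probs  = [0.9, 0.95, 0.9, 0.85, 0.9, 0.9, 0.8,
          0.1, 0.05, 0.02, 0.1, 0.02, 0.1]
print(relevant_spans(tokens, probs))  # [(0, 7)] -> only the refund sentence is relevant
```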
Optimizing Your RAG Pipeline
Addressing Low Relevance Scores
Low Chunk Relevance scores indicate that your chunks are probably longer than they need to be. To improve your system:
- Tune your chunking strategy: Experiment with different chunking methods to create more focused chunks.
- Reduce chunk size: Consider using smaller chunks that contain more concentrated relevant information.
- Improve chunk boundaries: Ensure chunks are divided at logical content boundaries rather than arbitrary character counts.
Benefits: Improved system efficiency, lower cost, and reduced latency.
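As one example of boundary-aware chunking, the sketch below packs whole sentences into chunks of a target size rather than cutting at arbitrary character counts. The splitting rule and size limit are illustrative assumptions, not a prescribed configuration.

```python
# A minimal sketch of boundary-aware chunking: split on sentence boundaries
# and pack whole sentences into chunks up to a target size, instead of
# cutting at arbitrary character counts. Details here are illustrative.
import re

def sentence_chunks(text: str, max_chars: int = 300) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

# Smaller, sentence-aligned chunks tend to score higher on Chunk Relevance
# because each one carries less off-topic text for any given query.
doc = "Refunds are issued within 30 days. Contact support to start a claim. " * 10
print(len(sentence_chunks(doc, max_chars=200)))  # 4 sentence-aligned chunks
```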
Best Practices
Refine Chunking Strategy
Experiment with different chunking methods (sentence-based, paragraph-based, semantic-based) to find the approach that maximizes relevance.
Optimize Retrieval Parameters
Adjust your retrieval parameters (k value, similarity thresholds) to prioritize chunks with higher relevance scores; see the sketch after these practices.
Combine with Other Metrics
Use Chunk Relevance alongside Chunk Utilization and Chunk Attribution for a complete picture of retrieval effectiveness.
Monitor Query-Chunk Alignment
Regularly analyze queries with low relevance scores to identify patterns and improve your document preprocessing or embedding strategy.
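Relating to the "Optimize Retrieval Parameters" practice above, here is a hypothetical sketch of filtering retrieved candidates by a similarity threshold and a top-k limit. The helper function, scores, and threshold are assumptions for illustration, not part of any specific retriever API.

```python
# Hypothetical sketch of tuning retrieval parameters: keep only the top-k
# candidates that clear a similarity threshold, so low-relevance chunks are
# dropped before generation. Scores and thresholds here are assumptions.

def select_chunks(scored_chunks: list[tuple[str, float]],
                  k: int = 3,
                  min_similarity: float = 0.75) -> list[str]:
    """Return up to k chunk texts whose similarity score clears the threshold."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [text for text, score in ranked if score >= min_similarity][:k]

candidates = [
    ("Refund policy: 30-day window...", 0.91),
    ("Shipping rates by region...", 0.62),
    ("How to file a refund claim...", 0.84),
    ("Company history and founders...", 0.41),
]
print(select_chunks(candidates, k=3, min_similarity=0.75))
# ['Refund policy: 30-day window...', 'How to file a refund claim...']
```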
When optimizing for Chunk Relevance, remember that the goal is to retrieve chunks containing the information needed to answer the query, not to maximize the score itself. Extremely high relevance can come at the cost of missing important context if chunks are too small or too narrowly focused.