Chunk attribution
Chunk Attribution measures whether each chunk retrieved in a RAG pipeline had an effect on the model’s response. A chunk is considered Attributed if any of the following apply:
- The model incorporated information from the chunk into its response
- The chunk influenced the model’s reasoning or conclusions
- The chunk provided context that shaped the response in some way
Chunk Attribution is closely related to Chunk Utilization: Attribution measures whether a chunk affected the response, while Utilization measures how much of the chunk’s text was involved in that effect. Only chunks that were Attributed can have Utilization scores greater than zero.
Understanding attribution
Example Scenario
User query: “What are the health benefits of green tea?”
- Chunk 1: “Green tea contains antioxidants that may reduce the risk of heart disease.”
- Chunk 2: “Black tea is produced by oxidizing tea leaves after they are harvested.”
- Chunk 3: “Studies suggest green tea may help with weight loss and metabolism.”
Model response: “Green tea offers several health benefits, including antioxidants that may reduce heart disease risk and potential effects on weight loss and metabolism.”
Attribution analysis: Chunks 1 and 3 would be Attributed because information from them appears in the response. Chunk 2 would be Not Attributed because it contains information about black
tea, which wasn’t included in the response.
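In production, attribution is determined by an evaluation model, but the intuition can be illustrated with a naive sketch. The heuristic below (purely illustrative, not Galileo's actual method) flags a chunk as attributed when enough of its content words appear in the response; the stopword list and threshold are assumptions for the example:

```python
# Toy illustration of chunk attribution. A chunk is flagged as
# "attributed" when a large enough fraction of its content words
# also appears in the response. Threshold and stopwords are arbitrary.

def is_attributed(chunk: str, response: str, threshold: float = 0.3) -> bool:
    """Naive heuristic: fraction of the chunk's content words that
    also appear in the response must meet `threshold`."""
    stopwords = {"the", "a", "an", "is", "are", "that", "by", "with",
                 "of", "to", "and", "may", "in", "after", "they"}
    chunk_words = {w.strip(".,").lower() for w in chunk.split()} - stopwords
    response_words = {w.strip(".,").lower() for w in response.split()}
    if not chunk_words:
        return False
    overlap = len(chunk_words & response_words) / len(chunk_words)
    return overlap >= threshold

chunks = [
    "Green tea contains antioxidants that may reduce the risk of heart disease.",
    "Black tea is produced by oxidizing tea leaves after they are harvested.",
    "Studies suggest green tea may help with weight loss and metabolism.",
]
response = ("Green tea offers several health benefits, including antioxidants "
            "that may reduce heart disease risk and potential effects on "
            "weight loss and metabolism.")

print([is_attributed(c, response) for c in chunks])  # [True, False, True]
```

On the scenario above, the heuristic agrees with the analysis: Chunks 1 and 3 come out attributed, Chunk 2 does not.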
Optimizing your RAG pipeline
Recommended Strategies
Tune retrieved chunk count: If many chunks are Not Attributed, reduce the number of chunks retrieved to improve efficiency without impacting quality.
Debug problematic responses: When responses are unsatisfactory, examine which chunks were attributed to identify the source of issues.
Improve retrieval quality: Use attribution data to refine your retrieval algorithms and embedding models.
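The first strategy above can be sketched as a simple rule. This illustrative helper (names and the margin policy are hypothetical, not part of any Galileo API) looks at per-query attribution flags and suggests a smaller retrieval depth:

```python
# Illustrative sketch: estimate how many retrieved chunks are actually
# attributed per query, and use that to pick a smaller top-k.

def suggest_top_k(attribution_flags_per_query: list[list[bool]],
                  margin: int = 1) -> int:
    """Suggest a retrieval depth: the maximum number of attributed
    chunks observed in any query, plus a safety margin."""
    max_attributed = max(sum(flags) for flags in attribution_flags_per_query)
    return max_attributed + margin

# Each inner list holds attribution flags for one query's retrieved chunks.
history = [
    [True, True, False, False, False],
    [True, False, False, False, False],
    [True, True, True, False, False],
]
print(suggest_top_k(history))  # at most 3 attributed, +1 margin -> 4
```

The safety margin echoes the caution at the end of this page: cutting the chunk count to the bare minimum can starve the model of useful context in edge cases.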
Chunk utilization
Chunk Utilization measures the fraction of text in each retrieved chunk that had an impact on the model’s response in a RAG pipeline.
Chunk Utilization is closely related to Chunk Attribution: Attribution measures whether a chunk affected the response, while Utilization measures how much of the chunk’s text was involved in that effect. Only chunks that were Attributed can have Utilization scores greater than zero.
Calculation method
1. Model Architecture: We use a fine-tuned in-house Galileo evaluation model based on a transformer encoder architecture.
2. Multi-metric Computation: The same model computes Chunk Adherence, Chunk Completeness, Chunk Attribution, and Chunk Utilization in a single inference call.
3. Token-level Analysis: For each token in the provided context, the model outputs a utilization probability indicating whether that token affected the response.
4. Score Calculation: Chunk Utilization is computed as the fraction of tokens with high utilization probability out of all tokens in the chunk.
5. Model Training: The model is trained on carefully curated RAG datasets and optimized to closely align with the RAG Plus metrics.
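The score-calculation step reduces to a small formula. A minimal sketch, assuming per-token utilization probabilities are already available and using an illustrative cutoff of 0.5 (the actual threshold is internal to the model):

```python
# Sketch of the utilization score: the fraction of a chunk's tokens
# whose utilization probability exceeds a threshold. The 0.5 cutoff
# is an assumption for illustration only.

def chunk_utilization(token_probs: list[float], threshold: float = 0.5) -> float:
    """Fraction of tokens judged to have affected the response."""
    if not token_probs:
        return 0.0
    used = sum(1 for p in token_probs if p > threshold)
    return used / len(token_probs)

probs = [0.9, 0.8, 0.2, 0.1, 0.7, 0.05, 0.6, 0.3]
print(chunk_utilization(probs))  # 4 of 8 tokens above 0.5 -> 0.5
```

A chunk that was not attributed would have no high-probability tokens, which is why unattributed chunks always score zero.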
Optimizing your RAG pipeline
Addressing Low Utilization Scores
Oversized Chunks: Your chunks are longer than they need to be
- Check if Chunk Relevance is also low, which confirms this scenario
- Solution: Tune your retriever to return shorter, more focused chunks
- Benefits: Improved system efficiency, lower cost, and reduced latency
Ineffective LLM Utilization: The LLM generator model is failing to incorporate all relevant information
- Check if Chunk Relevance is high, which confirms this scenario
- Solution: Explore a different LLM that may leverage the relevant information more effectively
- Benefits: Better response quality and more efficient use of retrieved information
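The two diagnoses above hinge on cross-checking Chunk Relevance against utilization. A hedged sketch of that decision logic, with illustrative thresholds (the exact cutoffs are a judgment call for your pipeline):

```python
# Sketch of the diagnostic logic for low utilization scores.
# Thresholds (0.3 low, 0.7 high) are illustrative assumptions.

def diagnose(relevance: float, utilization: float,
             low: float = 0.3, high: float = 0.7) -> str:
    """Map a chunk's relevance/utilization pair to a likely remedy."""
    if utilization >= low:
        return "utilization OK"
    if relevance < low:
        return "oversized chunks: tune retriever for shorter, focused chunks"
    if relevance >= high:
        return "LLM under-uses relevant context: try a different generator model"
    return "inconclusive: inspect the chunk manually"

# High relevance but low utilization points at the generator, not retrieval.
print(diagnose(relevance=0.9, utilization=0.1))
```

In practice you would run this over many chunks and act on the dominant pattern rather than any single data point.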
Best practices
Monitor Attribution Rates
Track the percentage of chunks that are attributed over time to identify trends and potential issues in your retrieval system.
Balance with Other Metrics
Use Chunk Attribution and Chunk Utilization alongside Chunk Relevance for a complete picture of retrieval effectiveness.
Optimize Chunk Size
Experiment with different chunk sizes to find the optimal balance between attribution rates and information density.
Improve Retrieval Quality
Use attribution data to refine your retrieval algorithms and embedding models.
Monitor Across Models
Compare Chunk Utilization scores across different LLMs to identify which models most efficiently use retrieved information.
Analyze Patterns
Look for patterns in low-utilization chunks to identify specific content types or formats that your system processes inefficiently.
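The "Monitor Attribution Rates" practice above can be as simple as tracking one percentage per evaluation run. An illustrative sketch with toy data (no real API involved):

```python
# Illustrative monitoring sketch: compute the attribution rate per
# evaluation run so retrieval regressions show up as a dropping trend.

def attribution_rate(flags: list[bool]) -> float:
    """Percentage of retrieved chunks that were attributed."""
    return 100.0 * sum(flags) / len(flags) if flags else 0.0

# Toy history: one list of attribution flags per evaluation run.
runs = {
    "run-01": [True, True, False, True, False],
    "run-02": [True, False, False, False, False],
}
for name, flags in runs.items():
    print(name, f"{attribution_rate(flags):.0f}%")
```

A sustained drop in this rate usually signals a retrieval regression before response-quality metrics catch it.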
When optimizing for Chunk Attribution and Utilization, be careful not to reduce the number of retrieved chunks too aggressively, as this may cut off the model’s access to potentially useful information in edge cases.