Chunk Utilization measures the fraction of text in each retrieved chunk that had an impact on the model’s response in a RAG pipeline.

Chunk Utilization is a continuous metric ranging from 0 to 1:

  • 0 (Low Utilization): none of the chunk's content was utilized
  • 0.5 (Mid Utilization): only half of the chunk's content influenced the response
  • 1 (High Utilization): the entire chunk's content influenced the response

A chunk with low utilization contains “extraneous” text that did not affect the final response, indicating potential inefficiencies in your chunking strategy.

Calculation Method

1. Model Architecture

We use a fine-tuned in-house Galileo evaluation model based on a transformer encoder architecture.

2. Multi-metric Computation

The same model computes Chunk Adherence, Chunk Completeness, Chunk Attribution, and Chunk Utilization in a single inference call.

3. Token-level Analysis

For each token in the provided context, the model outputs a utilization probability indicating whether that token influenced the response.

4. Score Calculation

Chunk Utilization is computed as the fraction of the chunk's tokens that have a high utilization probability.

5. Model Training

The model is trained on carefully curated RAG datasets and optimized to closely align with the RAG Plus metrics.
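The token-level scoring in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not Galileo's actual implementation: the per-token probabilities would come from the evaluation model, and the 0.5 cutoff is an assumed threshold for "high" probability.

```python
def chunk_utilization(token_probs, threshold=0.5):
    """Fraction of tokens whose utilization probability exceeds the
    threshold. Both the input probabilities and the 0.5 threshold
    are illustrative assumptions for this sketch."""
    if not token_probs:
        return 0.0
    used = sum(1 for p in token_probs if p >= threshold)
    return used / len(token_probs)

# Two of four tokens judged to have influenced the response:
print(chunk_utilization([0.9, 0.8, 0.1, 0.2]))  # → 0.5
```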

Optimizing Your RAG Pipeline

Addressing Low Utilization Scores

Low Chunk Utilization scores could indicate one of two scenarios:

Oversized Chunks: Your chunks are longer than they need to be

  • Check if Chunk Relevance is also low, which confirms this scenario
  • Solution: Tune your retriever to return shorter, more focused chunks
  • Benefits: Improved system efficiency, lower cost, and reduced latency

Ineffective LLM Utilization: The LLM generator model is failing to incorporate all relevant information

  • Check if Chunk Relevance is high, which confirms this scenario
  • Solution: Explore a different LLM that may leverage the relevant information more effectively
  • Benefits: Better response quality and more efficient use of retrieved information
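The two scenarios above amount to a simple decision rule over the utilization and relevance scores. The helper below is a hypothetical sketch, and the 0.5 cutoffs are assumed values you would tune for your own pipeline:

```python
def diagnose_low_utilization(utilization, relevance, cutoff=0.5):
    """Map a (utilization, relevance) score pair to a suggested action.
    The 0.5 cutoff is an illustrative assumption, not a recommended value."""
    if utilization >= cutoff:
        return "utilization looks healthy"
    if relevance < cutoff:
        # Scenario 1: oversized chunks
        return "oversized chunks: tune the retriever to return shorter chunks"
    # Scenario 2: relevance is high but the generator is not using it
    return "ineffective LLM utilization: try a different generator model"
```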

Best Practices

Optimize Chunk Size

Experiment with different chunking strategies to find the optimal chunk size that maximizes utilization without sacrificing relevance.
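One simple way to run such an experiment is to sweep chunk sizes with a fixed-window chunker and compare the resulting Chunk Utilization scores per configuration. The word-window splitter below is a baseline strategy assumed for illustration; production systems often use sentence- or semantics-aware splitters instead.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into fixed-size word windows with optional overlap.
    A deliberately simple baseline chunking strategy for experiments."""
    words = text.split()
    step = max(chunk_size - overlap, 1)
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), step)]

# Sweep candidate chunk sizes; score each configuration's chunks
# with your utilization metric and keep the best trade-off.
doc = " ".join(f"word{i}" for i in range(20))
for size in (4, 8):
    print(size, len(chunk_text(doc, size, overlap=2)))
```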

Monitor Across Models

Compare Chunk Utilization scores across different LLMs to identify which models most efficiently use retrieved information.

Combine with Other Metrics

Use Chunk Utilization alongside Chunk Relevance and Chunk Attribution for a complete picture of retrieval effectiveness.

Analyze Patterns

Look for patterns in low-utilization chunks to identify specific content types or formats that your system processes inefficiently.

When optimizing for Chunk Utilization, balance efficiency with comprehensiveness. Extremely high utilization might indicate chunks that are too small and lack sufficient context for the model.