Chunk Relevance measures the proportion of text in each retrieved chunk that contains useful information to address the user’s query in a RAG pipeline.

Chunk Relevance is a continuous metric ranging from 0 to 1:

  • 0 (Low Relevance): none of the chunk's content is relevant to the query.
  • 0.5 (Mid Relevance): only half of the chunk's content is relevant to the query.
  • 1 (High Relevance): the entire chunk is useful for answering the query.

A chunk with low relevance contains “unnecessary” text that is not pertinent to the user’s query, indicating potential inefficiencies in your retrieval strategy.

Calculation Method

Chunk Relevance is computed using a fine-tuned in-house Galileo evaluation model:

1. Model Type: A transformer-based encoder model trained to identify relevant information in the provided query, context, and response.

2. Unified Processing: The same model computes multiple metrics (Chunk Adherence, Chunk Completeness, Chunk Attribution, and Chunk Utilization) simultaneously in a single inference call, optimizing performance and resource usage.

3. Token Analysis: The model processes each token in the context and outputs a relevance probability, the likelihood that the token is useful for answering the query.

4. Training Data: The model is trained on carefully curated RAG datasets and optimized to align closely with established RAG Plus metrics for accurate evaluation.
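
Put together, these steps make the score easy to reason about: Chunk Relevance is effectively the fraction of a chunk's tokens that the model judges useful for the query. The sketch below is a minimal illustration of that aggregation, assuming per-token relevance probabilities are already available; the `token_relevance` values and the 0.5 threshold are made-up examples for illustration, not the evaluation model's actual internals.

```python
from typing import List

def chunk_relevance(token_probs: List[float], threshold: float = 0.5) -> float:
    """Aggregate per-token relevance probabilities into one chunk score.

    Illustrative approximation: each token judged relevant (probability
    >= threshold) counts toward the score, so the result is the fraction
    of the chunk that is useful for answering the query.
    """
    if not token_probs:
        return 0.0
    relevant = sum(1 for p in token_probs if p >= threshold)
    return relevant / len(token_probs)

# Hypothetical per-token probabilities for a 10-token chunk: the first
# half clearly addresses the query, the rest is boilerplate.
token_relevance = [0.95, 0.91, 0.88, 0.93, 0.90, 0.12, 0.08, 0.15, 0.05, 0.10]
print(chunk_relevance(token_relevance))  # -> 0.5 (half the chunk is relevant)
```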

Explainability

The model identifies which parts of the chunks were relevant to the query:

  • These sections can be highlighted by clicking the icon next to the Chunk Utilization metric value in your Retriever nodes
  • This visualization helps you understand exactly which portions of your chunks are being utilized effectively

Optimizing Your RAG Pipeline

Addressing Low Relevance Scores

Low Chunk Relevance scores indicate that your chunks are probably longer than they need to be. To improve your system:

Tune your chunking strategy: Experiment with different chunking methods to create more focused chunks (see the sketch after this list).

Reduce chunk size: Consider using smaller chunks that contain more concentrated relevant information.

Improve chunk boundaries: Ensure chunks are divided at logical content boundaries rather than arbitrary character counts.

Benefits: Improved system efficiency, lower cost, and reduced latency.
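
As a starting point for chunking experiments, here is a minimal sketch contrasting a naive fixed-size splitter with a sentence-boundary splitter. Both functions are illustrative stand-ins for whatever chunking utility your pipeline actually uses, and the 500-character limit is an arbitrary assumption.

```python
import re
from typing import List

def fixed_size_chunks(text: str, max_chars: int = 500) -> List[str]:
    """Naive splitter: cuts every max_chars characters, often mid-sentence."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def sentence_chunks(text: str, max_chars: int = 500) -> List[str]:
    """Split on sentence boundaries, packing sentences until the limit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Sentence-aligned chunks tend to stay on a single topic, which reduces the "unnecessary" text that drags Chunk Relevance down.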

Best Practices

Refine Chunking Strategy

Experiment with different chunking methods (sentence-based, paragraph-based, semantic-based) to find the approach that maximizes relevance.

Optimize Retrieval Parameters

Adjust your retrieval parameters (k value, similarity thresholds) to prioritize chunks with higher relevance scores.
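
For example, a hedged sketch of post-retrieval filtering: the retriever interface and the score scale below (chunk text paired with a cosine-similarity score) are assumptions for illustration, not a specific library's API.

```python
from typing import List, Tuple

def filter_retrieved(
    scored_chunks: List[Tuple[str, float]],
    k: int = 5,
    min_similarity: float = 0.75,
) -> List[str]:
    """Keep at most k chunks whose similarity score clears the threshold.

    Tightening min_similarity or lowering k trims marginal chunks and
    usually raises average Chunk Relevance, at the risk of losing context.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, score in ranked[:k] if score >= min_similarity]

# scored_chunks would come from your vector store, e.g. (text, similarity).
results = filter_retrieved(
    [("pricing table for plan A", 0.91), ("company history", 0.62)],
    k=3,
    min_similarity=0.7,
)
print(results)  # -> ['pricing table for plan A']
```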

Combine with Other Metrics

Use Chunk Relevance alongside Chunk Utilization and Chunk Attribution for a complete picture of retrieval effectiveness.

Monitor Query-Chunk Alignment

Regularly analyze queries with low relevance scores to identify patterns and improve your document preprocessing or embedding strategy.
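
One lightweight way to do this is to export evaluation results and look for queries that repeatedly score low. The pandas sketch below assumes a hypothetical logging schema with `query` and `chunk_relevance` columns; adapt the names and threshold to whatever your own export contains.

```python
import pandas as pd

# Hypothetical export of per-chunk evaluation results; the column names
# and sample values are assumptions, not a real schema.
results = pd.DataFrame(
    {
        "query": ["reset password", "pricing tiers", "reset password"],
        "chunk_relevance": [0.21, 0.88, 0.34],
    }
)

LOW_RELEVANCE = 0.5

# Surface queries that consistently retrieve padded or off-topic chunks,
# so you can inspect their documents, chunking, or embeddings.
low = (
    results[results["chunk_relevance"] < LOW_RELEVANCE]
    .groupby("query")["chunk_relevance"]
    .agg(["count", "mean"])
    .sort_values("count", ascending=False)
)
print(low)
```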

When optimizing for Chunk Relevance, remember that the goal is to retrieve chunks that contain information relevant to the query. Extremely high relevance might come at the cost of missing important context if chunks are too small or too narrowly focused.