
Context Precision measures the percentage of retrieved chunks that are relevant to the user’s query, helping identify how much noise or unwanted information was retrieved.
Context Precision is a continuous metric ranging from 0 to 1:
  • Low Precision: Few or no retrieved chunks are relevant to the query
  • High Precision: Most or all retrieved chunks are relevant to the query
This metric helps evaluate the quality of retrieval systems by measuring how many of the retrieved chunks actually contain useful information for answering the query. Higher Context Precision indicates that the retrieval system is successfully identifying and returning relevant content with minimal noise.

Calculation method

Context Precision is computed using Chunk Relevance scores:
1. Chunk Relevance Calculation: Chunk Relevance is computed for each retrieved chunk, producing a binary classification (Relevant or Not Relevant) for each chunk.

2. Numerator Calculation: The numerator is the sum, over all retrieved chunks, of (1.0 if the chunk is relevant, else 0.0) divided by the chunk's position index N, where positions are counted from 1 (position 1, 2, 3, etc.).

3. Denominator Calculation: The denominator is the sum of 1/N over all retrieved chunks, using the same position index N.

4. Final Score: Context Precision is the ratio of the numerator to the denominator, which keeps the metric in the range 0 to 1.
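To make these steps concrete, below is a minimal sketch of the calculation in Python. It assumes the per-chunk Relevant / Not Relevant labels from step 1 are already available as booleans; the function name is illustrative rather than part of any particular library.

```python
def context_precision(chunk_is_relevant: list[bool]) -> float:
    """Position-weighted Context Precision over a ranked list of retrieved chunks.

    chunk_is_relevant[i] is the binary Chunk Relevance label for the chunk
    at position i + 1 (positions are counted from 1).
    """
    if not chunk_is_relevant:
        return 0.0
    # Numerator: sum of (1.0 if relevant else 0.0) / position for each chunk.
    numerator = sum(
        (1.0 if relevant else 0.0) / position
        for position, relevant in enumerate(chunk_is_relevant, start=1)
    )
    # Denominator: sum of 1 / position for each chunk, so the score stays in [0, 1].
    denominator = sum(
        1.0 / position for position in range(1, len(chunk_is_relevant) + 1)
    )
    return numerator / denominator
```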

Understanding context precision

Example Scenario

This example illustrates Context Precision:
User query: “What are the health benefits of green tea?”
  • Position 1: “Green tea contains antioxidants…” (Relevant)
  • Position 2: “Green tea may help with weight loss…” (Relevant)
  • Position 3: “Black tea is produced by oxidizing…” (Not Relevant)
  • Position 4: “Studies suggest green tea…” (Relevant)
  • Positions 5-10: Various other chunks (mix of Relevant and Not Relevant)
Analysis: If, say, 3 of the 10 retrieved chunks are relevant, Context Precision scores the retrieval based on which chunks are relevant and where they sit in the ranking, quantifying how much noise or unwanted information was retrieved.
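As a simplified worked example, suppose only the four chunks listed above had been retrieved (positions 5-10 are ignored here purely for illustration). Using the context_precision sketch from the calculation section:

```python
# Positions 1, 2 and 4 are relevant; position 3 (black tea) is not.
labels = [True, True, False, True]
score = context_precision(labels)
# numerator   = 1/1 + 1/2 + 0/3 + 1/4 = 1.75
# denominator = 1/1 + 1/2 + 1/3 + 1/4 ≈ 2.083
# score ≈ 0.84
print(round(score, 2))  # 0.84
```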
Context Precision is differentiated from Precision @ K: Context Precision considers all retrieved chunks to measure noise in retrieval, while Precision @ K evaluates precision at a specific rank K and helps assess ranking quality.
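For comparison, here is a similarly minimal sketch of Precision @ K over the same kind of binary relevance labels (again, the function name is illustrative):

```python
def precision_at_k(chunk_is_relevant: list[bool], k: int) -> float:
    """Fraction of the top-k retrieved chunks that are relevant."""
    top_k = chunk_is_relevant[:k]
    if not top_k:
        return 0.0
    return sum(top_k) / len(top_k)

# With the green-tea example above, precision_at_k([True, True, False, True], 3) = 2/3 ≈ 0.67.
```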

Optimizing your RAG pipeline

Addressing Low Context Precision Scores

When Context Precision scores are low, it indicates that many retrieved chunks are not relevant to the query, meaning there is significant noise in the retrieved results. To improve the system:
Improve retrieval quality: Refine embedding models, similarity search algorithms, or retrieval parameters to better match queries with relevant content.
Implement reranking: Use a reranking model to improve the order of retrieved chunks, ensuring the most relevant ones appear first (a sketch follows this list).
Adjust Top K: If precision is consistently low, consider reducing the number of chunks retrieved (Top K) to focus on higher-quality results.
Analyze retrieval patterns: Examine which types of queries or content lead to low precision scores to identify systematic issues.
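As one way to act on the reranking suggestion above, the sketch below assumes a cross-encoder reranker via the sentence-transformers CrossEncoder class; the model name is illustrative, and your retrieval stack may expose reranking differently.

```python
from sentence_transformers import CrossEncoder

def rerank(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Re-order retrieved chunks by cross-encoder relevance and keep the top_k.

    The model name below is illustrative; any query-passage relevance model works.
    """
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(query, chunk) for chunk in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

Keeping top_k smaller than the raw candidate count also covers the "Adjust Top K" suggestion: the reranker scores the full candidate set, but only the highest-scoring chunks are passed on to generation.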

Comparing Context Precision and Precision @ K

Understanding Metric Combinations

Different combinations of Context Precision and Precision @ K scores reveal different aspects of retrieval system performance:
High Context Precision, High Precision @ K: The retrieval system is performing well overall. Most retrieved chunks are relevant, and the ranking is effective, with relevant chunks appearing in the top positions. This indicates minimal noise and good ranking quality.
High Context Precision, Low Precision @ K: While the overall retrieval contains mostly relevant chunks (low noise), the ranking is poor. Relevant chunks are distributed throughout the retrieved set rather than concentrated in the top K positions. This suggests the retrieval system finds relevant content but needs better ranking or reranking.
Low Context Precision, High Precision @ K: The top K positions contain mostly relevant chunks (good ranking), but the overall retrieved set has significant noise. This indicates that while the ranking algorithm prioritizes relevant content effectively, the retrieval system is bringing back too many irrelevant chunks beyond the top K. Consider reducing Top K or improving retrieval quality.
Low Context Precision, Low Precision @ K: Both metrics indicate problems. The retrieval system has high noise (many irrelevant chunks) and poor ranking (relevant chunks are not in top positions). This suggests fundamental issues with both retrieval quality and ranking that need to be addressed.

Best practices

Use for Retrieval Assessment

Context Precision helps evaluate the overall quality of retrieval results and how closely the retrieved chunks adhere to the query.

Combine with Precision @ K

Context Precision works alongside Precision @ K to understand both overall precision and precision at specific ranks.

Monitor Retrieval Quality

Tracking Context Precision helps identify patterns in retrieval noise and understand how much unwanted information is being retrieved.

Optimize Retrieval Parameters

Context Precision scores guide adjustments to Top K values, similarity thresholds, and reranking strategies.
When optimizing retrieval, tracking Context Precision shows how much noise or unwanted information remains in the retrieved chunks, guiding further improvements to retrieval quality.