Response quality metrics help you measure how well your AI system answers user questions, follows instructions, and provides useful information. These metrics are key for building reliable, helpful, and user-friendly AI applications.

Use these metrics when you want to:

  • Ensure your AI’s responses are factually correct and complete.
  • Check that the model follows instructions and uses retrieved information effectively.
  • Evaluate how well your system grounds answers in context or source material.

Below is a quick reference table of all response quality metrics:

| Name | Description | When to Use | Example Use Case |
| --- | --- | --- | --- |
| Chunk Attribution | Assesses whether the response properly attributes information to source documents. | When you are implementing a RAG system and want to ensure proper attribution. | A legal research assistant that must cite specific cases and statutes when providing legal information. |
| Chunk Utilization | Measures how effectively the model uses the retrieved chunks in its response. | When you are optimizing RAG performance to ensure retrieved information is used efficiently. | A technical support chatbot that needs to incorporate relevant product documentation in troubleshooting responses. |
| Completeness | Measures whether the response addresses all aspects of the user's query. | When you are evaluating whether responses fully address the user's intent. | A healthcare chatbot that must address all symptoms mentioned by a patient when suggesting next steps. |
| Context Adherence | Measures how well the response aligns with the provided context. | When you want to ensure the model grounds its responses in the provided context. | A financial advisor bot that must base investment recommendations on the client's specific financial situation and goals. |
| Context Relevance (Query Adherence) | Evaluates whether the retrieved context is relevant to the user's query. | When you are assessing the quality of your retrieval system's results. | An internal knowledge base search that retrieves company policies relevant to specific employee questions. |
| Correctness (Factuality) | Evaluates the factual accuracy of information provided in the response. | When accuracy of information is critical to your application. | A medical information system providing drug interaction details to healthcare professionals. |
| Ground Truth Adherence | Measures how well the response aligns with established ground truth. | When you are evaluating model responses against known correct answers. | A customer service AI that must provide accurate product specifications from an official catalog. |
| Instruction Adherence | Assesses whether the model followed the instructions in your prompt template. | When you use complex prompts and need to verify that the model follows all instructions. | A content generation system that must follow specific brand guidelines and formatting requirements. |
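
To make the two chunk-level rows more concrete, here is a rough sketch of what Chunk Utilization and Chunk Attribution are trying to capture. It is not how the metrics in the table are actually computed: it assumes a naive token-overlap heuristic, and the function names `chunk_utilization` and `chunk_attribution`, as well as the `threshold` parameter, are hypothetical names chosen for this illustration.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_utilization(response: str, chunk: str) -> float:
    """Toy utilization score: the fraction of the chunk's unique tokens
    that also appear in the response (0.0 for an empty chunk)."""
    chunk_tokens = _tokens(chunk)
    if not chunk_tokens:
        return 0.0
    return len(chunk_tokens & _tokens(response)) / len(chunk_tokens)


def chunk_attribution(response: str, chunks: list[str],
                      threshold: float = 0.5) -> list[bool]:
    """Toy attribution flags: a chunk counts as 'attributed' when its
    utilization score reaches the (assumed) threshold."""
    return [chunk_utilization(response, c) >= threshold for c in chunks]


if __name__ == "__main__":
    response = "Reset the router by holding the power button for ten seconds."
    chunks = [
        "Hold the power button for ten seconds to reset the router.",
        "Our warranty covers hardware defects for two years.",
    ]
    for chunk, used in zip(chunks, chunk_attribution(response, chunks)):
        print(f"{chunk_utilization(response, chunk):.3f}  attributed={used}  {chunk}")
```

Lexical overlap like this misses paraphrases and rewards shared stop words, so treat it only as intuition for the difference between "was this chunk used at all" (attribution) and "how much of it was used" (utilization).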

Next Steps