Instruction Adherence
Assess instruction adherence in AI outputs using Galileo Guardrail Metrics to ensure prompt-driven models generate precise and actionable results.
Instruction Adherence measures whether a model followed the system or prompt instructions when generating a response.
How it Works
This metric is particularly valuable for uncovering hallucinations in which the model ignores its instructions, leading to responses that don't meet user requirements or business rules.
Here’s a scale that shows the relationship between Instruction Adherence and the potential impact on your AI system:
0 (Low Adherence): The model ignored its instructions when generating its response.
1 (High Adherence): The model followed its instructions when generating its response.
Calculation Method
Instruction Adherence is computed through a multi-step process:
Model Evaluation
The system sends multiple evaluation requests to OpenAI's GPT-4o model to analyze whether the response follows the provided instructions.
Analysis Process
A specialized chain-of-thought prompt guides the model through a detailed evaluation of how well the response adheres to the specific instructions given.
Multiple Assessments
The system requests and collects multiple distinct responses to ensure a robust evaluation through consensus.
Result Generation
Each evaluation produces both a detailed explanation of the reasoning and a binary judgment (yes/no) on instruction adherence.
Score Calculation
The final score is computed as the ratio of positive (‘yes’) responses to the total number of evaluation responses.
We also surface one of the generated explanations, always choosing one that aligns with the majority judgment among the responses.
This metric is computed by prompting an LLM multiple times, and thus requires additional LLM calls to compute, which may impact usage and billing.
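To make the consensus step concrete, here is a minimal sketch of how a score like this could be computed from multiple evaluator verdicts. The Judgment structure and function name are illustrative assumptions, not Galileo's internal implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative structure for one evaluator response; these names are
# assumptions for the sketch, not Galileo's internal schema.
@dataclass
class Judgment:
    explanation: str   # chain-of-thought reasoning produced by the evaluator
    adheres: bool      # binary yes/no verdict on instruction adherence

def instruction_adherence_score(judgments: List[Judgment]) -> Tuple[float, str]:
    """Score = fraction of 'yes' verdicts; also surface one explanation
    that agrees with the majority judgment."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    yes_count = sum(j.adheres for j in judgments)
    score = yes_count / len(judgments)
    majority_is_yes = yes_count * 2 >= len(judgments)
    explanation = next(j.explanation for j in judgments if j.adheres == majority_is_yes)
    return score, explanation

# Example: three of four evaluators say the response followed instructions -> 0.75
judgments = [
    Judgment("Follows the requested bullet format.", True),
    Judgment("Covers every required field.", True),
    Judgment("Omits the mandated disclaimer.", False),
    Judgment("Respects the stated word limit.", True),
]
score, explanation = instruction_adherence_score(judgments)
print(score, "-", explanation)  # 0.75 - Follows the requested bullet format.
```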
Understanding Instruction Adherence
Differentiating from Context Adherence
It’s important to understand the distinction between related metrics:
Instruction Adherence: Measures whether the response follows the instructions in your prompt template.
Context Adherence: Measures whether the response adheres to the context provided (e.g., your retrieved documents).
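A small example may help show where each metric looks. In the assembled prompt below (an illustrative sketch, not a required format), Instruction Adherence evaluates the response against the instruction block, while Context Adherence evaluates it against the retrieved documents.

```python
# Illustrative prompt assembly; the wording and document format are assumptions.
instructions = (
    "Answer in exactly three bullet points. "
    "Cite the document ID for every claim. "
    "If the answer is not in the documents, say you don't know."
)

retrieved_documents = [
    "[doc-1] The warranty period for the X200 is 24 months.",
    "[doc-2] Warranty claims must be filed through the online portal.",
]

prompt = (
    f"{instructions}\n\n"
    "Documents:\n" + "\n".join(retrieved_documents) + "\n\n"
    "Question: How long is the X200 warranty?"
)

# Instruction Adherence: did the response use three bullets and cite document IDs?
# Context Adherence: is every claim in the response grounded in doc-1 and doc-2?
```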
Optimizing Your AI System
Addressing Low Instruction Adherence
When a response has a low Instruction Adherence score, the model likely ignored its instructions. To improve your system:
Flag and examine non-compliant responses: Identify patterns in responses that don't follow instructions (a minimal flagging sketch follows this list).
Experiment with prompt engineering: Test different prompt formulations to find versions the model is more likely to adhere to.
Implement guardrails: Take precautionary measures to prevent non-compliant responses from reaching end users.
Consider model selection: Some models may be better at following instructions than others.
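A minimal flagging sketch, assuming you export evaluation results as simple records with an instruction_adherence score; the record format and threshold are illustrative, not a specific Galileo API.

```python
# Flag logged responses whose Instruction Adherence score falls below a
# review threshold, so they can be examined for common failure patterns.
ADHERENCE_THRESHOLD = 0.7  # illustrative cutoff for manual review

logged_responses = [
    {"id": "r-101", "instruction_adherence": 0.95, "response": "..."},
    {"id": "r-102", "instruction_adherence": 0.40, "response": "..."},
    {"id": "r-103", "instruction_adherence": 0.65, "response": "..."},
]

flagged = [r for r in logged_responses if r["instruction_adherence"] < ADHERENCE_THRESHOLD]
for record in flagged:
    # Route low-adherence responses to review and to prompt-iteration test sets.
    print(f"Review {record['id']}: adherence={record['instruction_adherence']}")
```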
Best Practices
Clarify Instructions
Write clear, specific instructions without ambiguity or contradictions to improve adherence rates.
Prioritize Critical Instructions
Place the most important instructions prominently in your prompt and consider repeating them for emphasis.
Monitor Across Models
Compare Instruction Adherence scores across different LLMs to identify which models best follow your specific instructions (a minimal comparison sketch follows these practices).
Implement Feedback Loops
Use low-adherence examples to refine your prompts and create test cases for future prompt iterations.
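A minimal comparison sketch, assuming you have collected per-run adherence scores for each candidate model; the run records, model names, and score values here are placeholders.

```python
from collections import defaultdict
from statistics import mean

# Placeholder run records; in practice these would come from your evaluation runs.
runs = [
    {"model": "model-a", "instruction_adherence": 0.92},
    {"model": "model-a", "instruction_adherence": 0.88},
    {"model": "model-b", "instruction_adherence": 0.71},
    {"model": "model-b", "instruction_adherence": 0.64},
]

by_model = defaultdict(list)
for run in runs:
    by_model[run["model"]].append(run["instruction_adherence"])

# Rank models by mean adherence to see which best follows your instructions.
for model, scores in sorted(by_model.items(), key=lambda kv: -mean(kv[1])):
    print(f"{model}: mean adherence = {mean(scores):.2f} over {len(scores)} runs")
```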
When optimizing for Instruction Adherence, balance strict adherence with allowing the model some flexibility. Overly rigid instructions may limit the model’s ability to provide helpful responses in edge cases.