- Identify responses where the model is unsure or likely to make mistakes.
- Improve user trust by surfacing confidence scores or warnings.
- Analyze which prompts or situations are most challenging for your model.
| Name | Description | Supported Nodes | When to Use | Example Use Case |
|---|---|---|---|---|
| Prompt Perplexity | Evaluates how difficult or unusual the prompt is for the model to process. | LLM span | When you want to identify prompts that may confuse the model or lead to lower-quality responses. | Detecting outlier prompts in a customer support chatbot to improve prompt engineering. |
| Uncertainty | Measures the model’s confidence in its generated response. | LLM span | When you want to understand how certain the model is about its answers. | Flagging responses where the model is unsure, so a human can review them before sending to a user. |
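
Both metrics are typically derived from token-level log-probabilities, which many model APIs can return alongside the generated text. The sketch below is a minimal, provider-agnostic illustration (the function names and the sample log-prob values are hypothetical, not tied to any specific API): prompt perplexity is the exponential of the negative mean log-probability of the prompt tokens, and a simple uncertainty score is one minus the geometric mean of the response token probabilities.

```python
import math
from typing import List

def perplexity(logprobs: List[float]) -> float:
    """Perplexity = exp(-mean log-probability). Higher values mean the
    token sequence was more surprising (harder) for the model."""
    if not logprobs:
        raise ValueError("logprobs must be non-empty")
    return math.exp(-sum(logprobs) / len(logprobs))

def uncertainty(logprobs: List[float]) -> float:
    """A simple confidence-based uncertainty in [0, 1]: one minus the
    geometric mean of the per-token probabilities. 0 = fully confident."""
    if not logprobs:
        raise ValueError("logprobs must be non-empty")
    geometric_mean_prob = math.exp(sum(logprobs) / len(logprobs))
    return 1.0 - geometric_mean_prob

# Illustrative per-token log-probs, as many APIs report them.
prompt_logprobs = [-0.8, -2.1, -1.5, -0.3]     # hypothetical prompt tokens
response_logprobs = [-0.1, -0.05, -1.9, -0.2]  # hypothetical response tokens

print(f"Prompt perplexity:    {perplexity(prompt_logprobs):.2f}")
print(f"Response uncertainty: {uncertainty(response_logprobs):.2f}")
```

In practice, you would feed in the log-probs recorded for each LLM span and flag spans whose perplexity or uncertainty exceeds a threshold, for example routing them to human review as in the example use cases above.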