Hallucinations occur when an AI model generates factually incorrect information that is not grounded in any real-world knowledge. These are sometimes called “open-domain factual errors” and can significantly impact the reliability of your AI system. In this guide, we focus on hallucinations involving “open-domain” facts, as opposed to hallucinations involving “closed-domain” facts, which are more common in the context of RAG systems.

Setup

Consider the following prompt:

Which are better: Sharks or Tigers? Explain why. Answer succinctly, you must choose one. 

And the following response:

Tigers are better because they are apex predators with a versatile skill set, capable of thriving in diverse terrestrial environments, whereas sharks are specialized for aquatic life. Tigers have complex social behaviors and are more adaptable to a range of habitats compared to sharks.

As we can see, the model states a number of facts about sharks and tigers. While these facts are individually correct, the response as a whole presents them as proof that tigers are better than sharks. That is a subjective claim, and the model has no way of knowing whether it is true.

What Went Wrong?

Let’s look at the explanation:

The response has a 66.67% chance of being factually correct.

The response makes several claims regarding why tigers are better than sharks:

  1. Tigers are apex predators: This claim is factually correct as tigers are considered apex predators in their terrestrial habitats, primarily preying on larger herbivores.

  2. Tigers have a versatile skill set and can thrive in diverse terrestrial environments: This claim is true as tigers can adapt to various habitats, including forests, grasslands, and swamps, showcasing their versatility in hunting and survival skills.

  3. Sharks are specialized for aquatic life: This claim is also correct. Sharks have evolved as top predators in marine environments, having physical adaptations, such as streamlined bodies and gills, making them highly effective in water.

  4. Tigers have complex social behaviors and are more adaptable to a range of habitats than sharks: While it is true that tigers display some level of social interaction, such as mother-cub relationships, they are generally solitary animals. The adaptability claim is partially accurate; however, it should be noted that sharks are well adapted to their own environments, but they are not adaptable to land environments due to their physiological needs.

Overall, while the response presents some correct information about both species, the assertion that tigers are ‘better’ than sharks is subjective and lacks a definitive basis since ‘better’ can depend on context and criteria. In terms of factual correctness, the claims made do not fundamentally present falsehoods but the comparison drawn lacks a nuanced view of the strengths of both animals.

As the explanation suggests, the root cause of an open-domain hallucination is quite often ambiguity in the prompt, a lack of constraints on the response, a lack of context, or a combination of these factors.

When models hallucinate, it’s typically due to one or more of these issues:

  • No constraints on responses: The model wasn’t instructed to limit responses to known facts
  • Allowed speculation: The model was permitted to make claims without verification
  • Lack of validation: No post-processing was implemented to validate factual correctness

How It Appears in Metrics

Hallucinations typically manifest in evaluation metrics as:

  • Low Correctness scores: Responses contain factual inaccuracies
  • Uncertainty indicators: The model expresses some level of uncertainty about the response

Solutions to Prevent Hallucinations

1

Explicitly Instruct the Model to Avoid Guessing

Modify your prompts to discourage speculation:

Instruction: Only provide answers that are scientifically verified. 
If you're unsure about something, say 'I don't know' rather than guessing.

Implementation tips:

  • Add explicit uncertainty instructions to your system prompt
  • Include examples of good “I don’t know” responses in few-shot prompts
  • Reward the model for admitting uncertainty rather than making up answers
  • Consider phrases like “It’s important to be accurate rather than comprehensive”
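As a rough illustration, the tips above can be folded into a single system prompt plus a couple of few-shot exchanges. The Python sketch below assumes a hypothetical call_llm helper that wraps whichever LLM client you use; the prompt wording and example exchanges are only starting points, not a prescribed template.

# Minimal sketch: a system prompt that discourages guessing, plus few-shot
# examples of "I don't know" behaviour. `call_llm` is a hypothetical helper
# that wraps your LLM client of choice.

SYSTEM_PROMPT = (
    "You are a helpful assistant. Only provide answers that are scientifically verified. "
    "If you're unsure about something, say 'I don't know' rather than guessing. "
    "It's important to be accurate rather than comprehensive."
)

FEW_SHOT_EXAMPLES = [
    # Demonstrates admitting uncertainty instead of inventing an answer.
    {"role": "user", "content": "Which animal species will be discovered next year?"},
    {"role": "assistant", "content": "I don't know. That isn't something that can be verified."},
]

def answer(question: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        *FEW_SHOT_EXAMPLES,
        {"role": "user", "content": question},
    ]
    return call_llm(messages)  # hypothetical LLM client wrapper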
2

Add Response Validation

  • Implement automated validation using Correctness scores
  • Flag responses with a low Ground Truth Adherence score for human review
  • Consider using retrieval-augmented generation (RAG) to ground responses in verified sources

Validation techniques:

  • Use Galileo’s Ground Truth Adherence metric to compare responses against known facts
  • Implement confidence thresholds where low-confidence answers trigger fallback responses
  • For critical applications, create a verification pipeline with multiple checks
  • Consider a “cited response only” approach where all factual claims must reference sources
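One lightweight way to wire these techniques together is a threshold check that serves a fallback answer and flags low-scoring responses for human review. In the sketch below, the threshold values and the send_to_review_queue helper are placeholders; the metric score itself would come from your evaluation tooling (for example, a Correctness or Ground Truth Adherence score between 0 and 1).

# Minimal validation sketch. Assumes you already have a per-response metric
# score as a float between 0 and 1; thresholds and the review hook are illustrative.

CORRECTNESS_THRESHOLD = 0.7   # below this, don't serve the answer as-is
REVIEW_THRESHOLD = 0.5        # below this, also route to human review

FALLBACK = "I'm not confident in this answer. Please consult a verified source."

def validate(response: str, correctness_score: float) -> str:
    if correctness_score < REVIEW_THRESHOLD:
        send_to_review_queue(response, correctness_score)  # hypothetical review hook
    if correctness_score < CORRECTNESS_THRESHOLD:
        return FALLBACK
    return response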
3

Implement Post-Processing Checks

  • Run additional LLM-based checks for factual accuracy before serving responses
  • Filter out responses that exceed an uncertainty threshold
  • Consider implementing a citation system for factual claims

Advanced verification methods:

  • Create a separate “fact-checking” LLM call that evaluates the main response
  • Implement a multi-agent approach where one agent generates and another verifies
  • Use structured output formats that separate facts from opinions/interpretations
  • For high-stakes domains, maintain a database of verified facts to check against
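For example, a separate fact-checking call can be layered on top of the main model before its response is served. The sketch below reuses the hypothetical call_llm helper; the verifier prompt and the JSON parsing are deliberately simple and would need hardening (structured outputs, retries, error handling) in a real pipeline.

# Minimal sketch of a second, fact-checking LLM call over the main response.

import json

FACT_CHECK_INSTRUCTIONS = (
    "You are a fact checker. List every factual claim in the response below and "
    "label each claim SUPPORTED, UNSUPPORTED, or OPINION. Reply with JSON of the "
    'form {"claims": [{"text": "...", "label": "..."}]}.\n\n'
    "Response to check:\n"
)

def fact_check(response: str) -> bool:
    """Return True only if no factual claim in the response is flagged UNSUPPORTED."""
    raw = call_llm([{"role": "user", "content": FACT_CHECK_INSTRUCTIONS + response}])
    claims = json.loads(raw)["claims"]  # assumes the verifier returns valid JSON
    return all(claim["label"] != "UNSUPPORTED" for claim in claims)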
4

Design for Transparency

  • Clearly communicate the model’s limitations to users
  • Provide confidence scores alongside responses
  • Distinguish between factual statements and opinions/interpretations
  • Consider visual indicators for different levels of certainty

User experience considerations:

  • Use UI elements like confidence meters or color-coding for certainty levels
  • Provide “View Sources” options for factual claims
  • Design graceful fallbacks when the model is uncertain
  • Consider allowing users to provide feedback on factual accuracy
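One way to support these UI patterns is to serve a structured payload rather than raw text, so the front end can render confidence meters, fact/opinion labels, and “View Sources” links directly. The field names below are illustrative, not a fixed schema.

# Illustrative response payload separating facts from opinions, with
# per-claim confidence and sources to back UI elements like confidence
# meters and "View Sources" links.

from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    kind: str                # "fact" or "opinion"
    confidence: float        # 0.0-1.0, drives colour-coding or a confidence meter
    sources: list[str] = field(default_factory=list)  # backs a "View Sources" option

@dataclass
class AssistantResponse:
    answer: str
    claims: list[Claim]
    overall_confidence: float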
5

Monitor and Continuously Improve

  • Use Galileo to track hallucination rates over time
  • Collect examples of hallucinations to create targeted test cases
  • Regularly audit responses, especially for high-risk domains
  • Implement feedback loops to improve the system based on real-world performance

Monitoring strategy:

  • Set up custom metrics in Galileo specifically for tracking hallucination types
  • Create a “hallucination dataset” of challenging examples for regression testing
  • Implement user feedback mechanisms to flag potential hallucinations
  • Schedule regular reviews of flagged responses to identify patterns
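A simple regression loop over such a “hallucination dataset” might look like the sketch below. Here call_llm and score_correctness are hypothetical helpers, the dataset format is just one possibility, and in practice the scoring would come from your evaluation tooling (for example, Galileo metrics).

# Sketch of a regression run over a dataset of known-hard prompts.
# Reports the share of cases whose correctness score falls below a threshold.

import json

def run_regression(dataset_path: str, threshold: float = 0.7) -> None:
    with open(dataset_path) as f:
        cases = json.load(f)  # e.g. [{"prompt": "...", "reference": "..."}, ...]

    failures = []
    for case in cases:
        response = call_llm([{"role": "user", "content": case["prompt"]}])
        score = score_correctness(response, case["reference"])  # hypothetical scorer
        if score < threshold:
            failures.append((case["prompt"], score))

    rate = len(failures) / max(len(cases), 1)
    print(f"Below-threshold rate: {rate:.1%} ({len(failures)}/{len(cases)} cases)")

Running this on every release and comparing rates over time gives you an early signal when a prompt or model change reintroduces hallucinations.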

Best Practices

Layer your defenses

Combine multiple approaches rather than relying on a single method: prompt engineering, validation, and post-processing checks together catch more hallucinations than any one of them alone.

Domain-specific knowledge

For specialized domains, create custom knowledge bases or retrieval systems that contain verified information relevant to your application.

Calibrate confidence

Train your system to accurately represent its confidence level. Models should express appropriate uncertainty when dealing with ambiguous or incomplete information.

Progressive disclosure

Consider revealing information in stages, with increasing verification for more detailed claims. Start with high-confidence facts before providing specifics.

Continuous evaluation

Regularly test your system against new examples of potential hallucinations. Build a test suite of challenging cases that target known weaknesses.

Human-in-the-loop

For critical applications, maintain human oversight for final verification. Design workflows where humans can efficiently review and correct model outputs.

Feedback integration

Create mechanisms to learn from identified hallucinations. Use feedback loops to continuously improve your system’s factual accuracy over time.

Contextual awareness

Adjust verification stringency based on the stakes of the interaction. Apply more rigorous checks for high-risk domains like healthcare or finance.