Model Response: “Well, there are many aspects to climate change. Some people
think it’s caused by humans, and others think it’s just natural. It’s hard to
say exactly.”
What went wrong?
- The prompt did not provide enough context for confident decision-making
- The model allowed too much randomness in token selection
- The prompt was ambiguous in the response it expected
How it showed up in metrics:
- High Uncertainty: The model hesitated in its response
- High Prompt Perplexity: The model struggled with predicting the next token
- Mid-range Instruction Adherence: The model understood the instructions but lacked decisiveness
Improvements and solutions
For the following improvements, we will show how we could change a simple prompt script like the example below.
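As a baseline, a minimal prompt script might look like the sketch below. The model name and the exact parameter names (`temperature`, `top_k`) are assumptions for a generic LLM completion API, not a specific provider's signature:

```python
# Baseline prompt script (illustrative; model name and settings are assumptions).
# The prompt gives the model no guidance on which framing or stance to take.
prompt = "Tell me about climate change."

request = {
    "model": "gpt-4o-mini",  # hypothetical model choice
    "prompt": prompt,
    "temperature": 1.0,      # near-default sampling: fairly random token choice
    "top_k": 50,             # wide candidate pool at each token
}
```

Each improvement below changes one part of this request: the prompt text, the sampling parameters, or the post-processing of the response.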
1. Provide Stronger Context in Prompts
Include explicit guiding statements in the prompt. This should reduce the Uncertainty and Prompt Perplexity in your metrics on Galileo.
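A guided version of the baseline prompt might look like this (the wording is an illustrative rewrite, not the original script):

```python
# A more strongly guided prompt: the explicit framing tells the model which
# perspective to present, so it does not hedge between competing views.
prompt = (
    "According to the scientific consensus, what are the primary causes of "
    "climate change? Answer decisively in two sentences."
)
```

The added framing ("According to the scientific consensus") removes the ambiguity that caused the hedged answer in the first place.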
2. Adjust Model Sampling Parameters
Lower the temperature to make the model more deterministic, and use top-k sampling to limit the candidate tokens and prevent hesitation. Lowering the temperature and decreasing top_k both generally increase Instruction Adherence.
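A sketch of tighter sampling settings; the parameter names assume a generic LLM API that exposes `temperature` and `top_k` (exact names vary by provider):

```python
# Tighter sampling settings (values are illustrative starting points).
sampling_params = {
    "temperature": 0.2,  # lower temperature -> more deterministic token choice
    "top_k": 5,          # restrict sampling to the 5 most likely tokens
}
```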
3. Modify Prompt Structure
Use direct phrasing that forces a single, clear response, and avoid prompts that allow multiple equally valid answers. For example, you can remove ambiguity by naming the source of the opinion you care about.
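The two prompts below contrast the ambiguous and direct phrasings; the specific source named (the IPCC) is an illustrative choice:

```python
# Ambiguous prompt: invites multiple equally valid answers
# ("some people say X, others say Y").
ambiguous = "What do people think causes climate change?"

# Direct rewording: names the source of the opinion we care about,
# leaving one clear answer for the model to give.
direct = "What does the IPCC identify as the main driver of climate change?"
```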
4. Apply Uncertainty-Based Filtering
Automatically reject responses with an Uncertainty score above a set threshold
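A minimal sketch of such a filter. The threshold value and the shape of the response objects are assumptions for illustration, not part of a specific Galileo API:

```python
# Reject any response whose Uncertainty score exceeds a set threshold.
UNCERTAINTY_THRESHOLD = 0.6  # assumed cutoff; tune against your own metrics

def filter_responses(responses):
    """Keep only responses at or below the uncertainty threshold."""
    return [r for r in responses if r["uncertainty"] <= UNCERTAINTY_THRESHOLD]

responses = [
    {"text": "Human activity is the primary cause.", "uncertainty": 0.12},
    {"text": "It's hard to say exactly...", "uncertainty": 0.81},
]
kept = filter_responses(responses)  # only the confident response survives
```

In practice this runs as a post-processing step: high-uncertainty responses are dropped or routed back for regeneration rather than shown to the user.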