Action Advancement
Understand how to measure and optimize the effectiveness of your AI agent’s actions
Action Advancement measures whether an assistant successfully accomplishes or makes progress toward at least one user goal in a conversation.
An assistant successfully advances a user’s goal when it:
- Provides a complete or partial answer to the user’s question
- Requests clarification or additional information to better understand the user’s needs
- Confirms that a requested action has been successfully completed
For an interaction to count as advancing the user’s goal, the assistant’s response must be:
- Factually accurate
- Directly relevant to the user’s request
- Consistent with any tool outputs used
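These criteria combine as a logical AND: a response that fails any one of them does not count as advancing the goal. A minimal sketch (all names hypothetical, not part of any SDK):

```python
from dataclasses import dataclass

@dataclass
class ResponseAssessment:
    """Hypothetical judgment of a single assistant response."""
    factually_accurate: bool
    relevant_to_request: bool
    consistent_with_tools: bool

def advances_goal(a: ResponseAssessment) -> bool:
    # All three criteria must hold for the response
    # to count as advancing the user's goal.
    return (a.factually_accurate
            and a.relevant_to_request
            and a.consistent_with_tools)

# A relevant but tool-inconsistent response does not advance the goal.
print(advances_goal(ResponseAssessment(True, True, False)))  # → False
```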
Calculation Method
If the Action Advancement score is below 100%, at least one evaluator judged that the assistant failed to make progress on any of the user’s goals.
Action Advancement is calculated by:
Model Request
Multiple evaluation requests are sent to an LLM evaluator (e.g., OpenAI’s GPT-4o mini) to analyze the assistant’s progress toward user goals.
Prompt Engineering
A specialized chain-of-thought prompt guides the model to evaluate whether the assistant made progress on user goals based on the metric’s definition.
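The production prompt’s exact wording is internal to the evaluator; a simplified, entirely hypothetical illustration of the chain-of-thought pattern might look like:

```python
# Hypothetical prompt template illustrating the chain-of-thought
# evaluation pattern; the actual prompt wording is not public.
EVALUATION_PROMPT = """\
You are evaluating an AI assistant's conversation with a user.

Conversation:
{conversation}

Reason step by step:
1. What goal(s) did the user express?
2. Did the assistant provide a complete or partial answer,
   ask a clarifying question, or confirm a completed action?
3. Was the response factually accurate, relevant to the request,
   and consistent with any tool outputs?

Finish with a single line of the form:
Judgment: yes|no
"""

prompt = EVALUATION_PROMPT.format(
    conversation="User: What's the capital of France?\nAssistant: Paris."
)
print("Judgment:" in prompt)  # → True
```

Forcing the model to reason through the goals before emitting the final `yes`/`no` line makes the binary judgment easier to parse and gives the explanation that is later displayed alongside the score.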
Evaluation Process
Each evaluation analyzes the interaction and produces both a detailed explanation and a binary judgment (yes/no) on goal advancement.
Score Calculation
The final Action Advancement score is computed as the percentage of positive (‘yes’) responses out of all evaluation responses.
We display one of the generated explanations alongside the score, always choosing one that aligns with the majority judgment.
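Putting the steps above together, the aggregation can be sketched as follows (a minimal illustration under the definitions in this section, not the platform’s actual implementation):

```python
from collections import Counter

def action_advancement_score(
    judgments: list[tuple[str, str]],
) -> tuple[float, str]:
    """Aggregate (explanation, 'yes'/'no') pairs from N evaluator calls.

    Returns the percentage of 'yes' votes and one explanation
    that aligns with the majority judgment.
    """
    votes = [verdict for _, verdict in judgments]
    score = 100.0 * votes.count("yes") / len(votes)
    majority = Counter(votes).most_common(1)[0][0]
    # Display an explanation that agrees with the majority judgment.
    explanation = next(expl for expl, v in judgments if v == majority)
    return score, explanation

score, expl = action_advancement_score([
    ("The agent answered the user's question.", "yes"),
    ("The agent's answer was correct and complete.", "yes"),
    ("The response contradicted the tool output.", "no"),
])
print(round(score, 1))  # → 66.7
```

With two of three evaluators voting “yes”, the score is ~66.7% and the displayed explanation is drawn from one of the “yes” judgments.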
This metric requires multiple LLM calls to compute, which may impact usage and billing.
Understanding Action Advancement
When to Use This Metric
Action Advancement is particularly valuable for evaluating:
Agentic Workflows: When an AI agent must decide on actions and select appropriate tools.
Multi-step Tasks: When completing a user’s request requires multiple steps or decisions.
Tool-using Assistants: When evaluating if the assistant used available tools effectively.
This metric helps you determine whether the assistant chose appropriate actions and made meaningful progress toward fulfilling the user’s request.
Best Practices
Track Progress Over Time
Monitor Action Advancement scores across different versions of your agent to ensure improvements in task completion capabilities.
Analyze Failure Patterns
When Action Advancement scores are low, examine the specific steps where agents fail to make progress to identify systematic issues.
Combine with Other Metrics
Use Action Advancement alongside other agentic metrics to get a comprehensive view of your assistant’s effectiveness.
Test Edge Cases
Create evaluation datasets that include complex, multi-step tasks to thoroughly assess your agent’s ability to advance user goals.
When optimizing for Action Advancement, ensure you’re not sacrificing other important aspects such as safety, factual accuracy, or user experience in pursuit of task completion.