Overview

Action Advancement measures whether an assistant successfully accomplishes or makes progress toward at least one user goal in a conversation.
Action Advancement addresses the common pain point of unclear agent performance by measuring whether AI agents are actually helping users achieve their objectives rather than just producing responses. An assistant successfully advances a user’s goal when it:
  1. Provides a complete or partial answer to the user’s question
  2. Requests clarification or additional information to better understand the user’s needs
  3. Confirms that a requested action has been successfully completed
For an interaction to count as advancing the user’s goal, the assistant’s response must be:
  • Factually accurate
  • Directly responsive to the user’s request
  • Consistent with any tool outputs used
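As an illustrative sketch (the names here are hypothetical, not part of any evaluator API), these three requirements combine as a simple conjunction: a response counts as advancing a goal only if all three hold.

```python
from dataclasses import dataclass


@dataclass
class AdvancementChecks:
    """Criteria a response must meet to count as advancing a user goal."""
    factually_accurate: bool       # no factual errors in the response
    addresses_request: bool        # directly responsive to the user's request
    consistent_with_tools: bool    # agrees with any tool outputs used


def advances_goal(checks: AdvancementChecks) -> bool:
    # All three criteria must hold for the response to count as progress.
    return (checks.factually_accurate
            and checks.addresses_request
            and checks.consistent_with_tools)
```

For example, a response that directly answers the question but contradicts a tool output fails the conjunction and does not count as advancement.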

Action Advancement at a glance

| Property | Description |
| --- | --- |
| Name of Metric | Action Advancement |
| Metric Category | Agentic Metrics |
| Use this metric for | Evaluating whether AI agents make progress toward user goals in conversations |
| Can be applied to | Session, trace, and all span types (agent, workflow, retriever, LLM, and tool) |
| LLM/Luna Support | Supported with both LLM and Luna models |
| Protect Runtime Protection | No; not applicable for this metric |
| Constants | None; uses dynamic evaluation |
| Usage Context | Agentic workflows, multi-step tasks, tool-using assistants |
| Value Type | Confidence score (0.0 to 1.0) that at least one user goal was advanced |
| Input/Output Requirements | Requires conversation context, user goals, and assistant responses |

When to Use This Metric

This metric shines when simple response quality metrics fall short, particularly for complex, multi-step interactions where progress toward goals matters more than individual response quality.
Agentic Workflows: When an AI agent must decide on actions and select appropriate tools.
Multi-step Tasks: When completing a user’s request requires multiple steps or decisions.
Tool-using Assistants: When evaluating if the assistant used available tools effectively.
Customer Service Agents: Resolving user issues through multi-step problem-solving.
Task-Oriented Assistants: Completing specific actions like booking flights or processing orders.
Research Assistants: Gathering and synthesizing information across multiple sources.
Creative Assistants: Understanding and building upon user requests iteratively.

Calculation method

Action Advancement is calculated in four steps:
  1. Model Request: Multiple evaluation requests are sent to an LLM evaluator to analyze the assistant’s progress toward user goals.
  2. Prompt Engineering: A specialized chain-of-thought prompt guides the model to evaluate whether the assistant made progress on user goals, based on the metric’s definition.
  3. Evaluation Process: Each evaluation analyzes the interaction and produces both a detailed explanation and a binary yes/no judgment on goal advancement.
  4. Score Calculation: The final Action Advancement score is computed as the confidence (probability) that at least one user goal was advanced. A score below 1.0 means at least one evaluator judged that the assistant failed to make progress on any user goal.
One of the generated explanations is displayed alongside the score, chosen to align with the majority judgment.
This metric requires multiple LLM calls to compute, which may impact usage and billing.
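The aggregation in step 4 can be sketched as the fraction of evaluators that answered "yes". This is a minimal illustration with a hypothetical function name, not part of any SDK; the actual evaluator may aggregate judgments differently.

```python
def action_advancement_score(judgments: list[bool]) -> float:
    """Aggregate binary yes/no judgments from repeated LLM evaluations
    into a confidence score between 0.0 and 1.0."""
    if not judgments:
        raise ValueError("at least one evaluation judgment is required")
    # Fraction of evaluators that answered "yes, a goal was advanced".
    return sum(judgments) / len(judgments)
```

With judgments `[True, True, False]`, two of three evaluators saw progress, so the score is roughly 0.67 and lands in the Fair band.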

Score Interpretation

Scores range from 0.0 to 1.0. For example, an assistant that gathers the necessary information and presents options for a booking request makes clear progress toward the goal and would earn an Excellent score of 1.0.
  • Poor (toward 0.0): The assistant failed to make any progress toward user goals.
  • Fair (around 0.5): The assistant made some progress but didn’t fully address the user’s needs.
  • Excellent (toward 1.0): The assistant successfully advanced user goals with clear progress.

What different scores mean

  • 0.0 - 0.3 (Poor): The assistant completely failed to address the user’s request or made no meaningful progress. Common causes include ignoring the user’s question, providing irrelevant information, or failing to use available tools when needed.
  • 0.4 - 0.7 (Fair): The assistant made some progress but didn’t fully accomplish the user’s goal. This might include partial answers, requesting clarification when not needed, or missing key aspects of the request.
  • 0.8 - 1.0 (Excellent): The assistant successfully advanced the user’s goal by providing complete answers, making appropriate requests for clarification, or confirming successful task completion.
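The bands above can be expressed as a small helper. The function name is hypothetical, and one assumption is made: the gaps the ranges leave open (0.3 to 0.4 and 0.7 to 0.8) are assigned to the lower band here.

```python
def interpret_score(score: float) -> str:
    """Map an Action Advancement score to its qualitative band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    if score <= 0.3:
        return "Poor"
    if score <= 0.7:
        return "Fair"
    return "Excellent"
```

For instance, a score of 0.67 from a two-out-of-three majority of evaluators would be reported as Fair.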

How to improve Action Advancement scores

To improve Action Advancement scores, focus on ensuring your AI agents make meaningful progress toward user goals in every interaction.

Common issues and solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| Assistant ignores user requests | Poor prompt engineering or context understanding | Improve system prompts to emphasize goal-oriented responses and ensure the assistant understands user intent |
| Incomplete responses | Insufficient context or tool usage | Provide better context and ensure the assistant uses available tools effectively |
| Irrelevant information | Lack of focus on user goals | Train the assistant to stay focused on the specific user request and avoid tangential information |
| No progress on multi-step tasks | Poor task breakdown | Implement better task decomposition and ensure the assistant can handle complex, multi-step processes |

Best practices for optimization

  • Clear goal identification: Ensure your assistant can identify and prioritize user goals
  • Progressive disclosure: Break complex tasks into manageable steps
  • Tool integration: Make sure the assistant effectively uses available tools and APIs
  • Context awareness: Maintain conversation context to build on previous interactions

Comparison to other metrics

| Property | Action Advancement | Instruction Adherence | Completeness |
| --- | --- | --- | --- |
| Metric Category | Agentic Metrics | Response Quality | Response Quality |
| Use this metric for | Evaluating goal progress in conversations | Measuring how well responses follow instructions | Assessing response completeness |
| Best for | Multi-step tasks and agentic workflows | Single-turn instruction following | Ensuring comprehensive responses |
| LLM/Luna Support | Yes | Yes | Yes |
| Protect Runtime Protection | No | No | No |
| Value Type | Percentage (0.0-1.0) | Percentage (0.0-1.0) | Percentage (0.0-1.0) |
| Limitations | Requires conversation context | May not capture goal progress | Doesn’t measure goal advancement |

Best practices

To effectively implement and optimize Action Advancement in your AI systems, consider these key practices:

Track progress over time

Monitor Action Advancement scores across different versions of your agent to ensure improvements in task completion capabilities. This helps you identify whether your optimizations are actually improving goal advancement.

Analyze failure patterns

When Action Advancement scores are low, examine the specific steps where agents fail to make progress to identify systematic issues. Look for patterns in where agents get stuck or fail to advance user goals.

Combine with other metrics

Use Action Advancement alongside other agentic metrics to get a comprehensive view of your assistant’s effectiveness. This provides a more complete picture of your agent’s performance beyond just goal advancement.

Test edge cases

Create evaluation datasets that include complex, multi-step tasks to thoroughly assess your agent’s ability to advance user goals. This ensures your agent can handle challenging scenarios that require multiple steps.
When optimizing for Action Advancement, ensure you’re not sacrificing other important aspects like safety, factual accuracy, or user experience in pursuit of task completion.
If you would like to dive deeper or start implementing Action Advancement, check out the following resources:

How-to guides