Learn how to measure the correctness and coherence of an agentic trajectory by validating it against user-specified natural-language tests.
Create a new LLM-as-a-judge metric
| Setting | Value |
|---|---|
| Name | Agent flow |
| LLM Model | Select your preferred model |
| Apply to | Session |
| Advanced Settings | Configure these as required for your needs |
Set the prompt
Customize the prompt by replacing `{{ Add your tests here }}` with a numbered list of natural-language tests that can be used to evaluate the agent's efficiency. For an agent whose tools include `list_by_target_muscle_for_exercised`, `list_by_body_part_for_exercised`, and `list_of_bodyparts_for_exercised`, the tests can refer to these tools by name.
Save the metric
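As an illustration of the prompt step, the sketch below shows one way to turn a list of tests into a numbered list and substitute it for the `{{ Add your tests here }}` placeholder. Only the placeholder text and the tool names come from this page. The `render_tests` helper, the surrounding template text, and the two example tests are assumptions made for illustration and are not part of the product.

```python
# Placeholder text as it appears in the metric prompt.
PLACEHOLDER = "{{ Add your tests here }}"


def render_tests(template: str, tests: list[str]) -> str:
    """Replace the placeholder with a numbered list of natural-language tests.

    Hypothetical helper: it only prepares text that could be pasted into the prompt.
    """
    if PLACEHOLDER not in template:
        raise ValueError("template does not contain the tests placeholder")
    numbered = "\n".join(f"{i}. {test}" for i, test in enumerate(tests, start=1))
    return template.replace(PLACEHOLDER, numbered)


# Illustrative tests only (assumed, not taken from the original page).
example_tests = [
    "If the user asks about a target muscle, the agent calls "
    "list_by_target_muscle_for_exercised.",
    "The agent calls list_of_bodyparts_for_exercised before "
    "list_by_body_part_for_exercised.",
]

# Assumed stand-in for the part of the prompt that holds the placeholder.
template = "User-defined tests:\n{{ Add your tests here }}\n"

prompt = render_tests(template, example_tests)
print(prompt)
```

Keeping the tests in a plain list makes them easy to review and renumber before pasting the rendered text into the metric's prompt.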