What are composite metrics?
A composite metric is a custom metric that has access to other metrics computed on the current step or any of its child steps. This allows you to:
- Combine multiple metric scores into a single comprehensive evaluation
- Apply conditional logic based on metric values
- Create hierarchical evaluations that aggregate scores across sessions, traces, and spans
- Build context-aware metrics that only calculate when certain conditions are met
Composite metrics use the required_metrics parameter to specify which metrics
they depend on. These required metrics are guaranteed to be computed before the
composite metric runs, and their values are accessible via the step_object.metrics
dictionary.
Common use cases
Conditional evaluation
Calculate a metric only when another metric meets certain criteria.
Example: Only calculate adherence if the input prompt is correct.
Required metrics: GalileoMetrics.correctness, GalileoMetrics.context_adherence
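A minimal sketch of this conditional pattern. The `Step` class is a hypothetical stand-in for the step object so the snippet runs without the SDK; the plain-string metric keys, the 0.8 threshold, and returning `None` when the condition is not met are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    # Hypothetical stand-in for the step object passed to a scorer.
    metrics: dict = field(default_factory=dict)


def conditional_adherence(step_object: Step, threshold: float = 0.8) -> Optional[float]:
    """Return context adherence only when correctness clears the threshold."""
    correctness = step_object.metrics.get("correctness")  # key name is an assumption
    if correctness is None or correctness < threshold:
        return None  # condition not met, so no adherence score is reported
    return step_object.metrics.get("context_adherence")
```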
Hierarchical aggregation
Aggregate metric values across different levels of your application hierarchy.
Example: Calculate average metric scores across all spans in a session.
Required metrics: GalileoMetrics.context_adherence
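One way to sketch this aggregation, assuming each step exposes a `metrics` dictionary and a list of child steps. The `Step` class and its `children` attribute are hypothetical stand-ins, and the metric key is assumed to be a plain string.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Step:
    # Hypothetical stand-in for a session, trace, or span.
    metrics: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(step: Step) -> Iterator[Step]:
    """Yield the step itself, then every descendant, depth first."""
    yield step
    for child in step.children:
        yield from walk(child)


def average_context_adherence(session: Step) -> Optional[float]:
    """Average context adherence over every step that carries the metric."""
    scores = [
        s.metrics["context_adherence"]
        for s in walk(session)
        if s.metrics.get("context_adherence") is not None
    ]
    return sum(scores) / len(scores) if scores else None
```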
Multi-metric analysis
Combine multiple metrics to detect specific patterns or issues.
Example: Check for PII and count occurrences if found.
Required metrics: GalileoMetrics.output_pii
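A small sketch of the PII check. It assumes, purely for illustration, that the PII metric value is a list of detected PII labels; the actual value shape may differ, and `Step` is a hypothetical stand-in for the step object.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical stand-in for the step object passed to a scorer.
    metrics: dict = field(default_factory=dict)


def pii_occurrences(step_object: Step) -> int:
    """Count detected PII items, or return 0 when none were found."""
    # Assumption: the metric value is a list of labels such as ["email", "phone"].
    detected = step_object.metrics.get("output_pii") or []
    return len(detected)
```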
Cross-span evaluation
Evaluate metrics across different span types in a trace.
Example: Combine retriever and LLM metrics for RAG evaluation.
Required metrics: GalileoMetrics.context_relevance, GalileoMetrics.context_adherence
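A possible shape for a cross-span RAG score. The `Span` and `Trace` classes, the `type` values "retriever" and "llm", and the string metric keys are hypothetical stand-ins. Taking the minimum of the two averages is one design choice among several: it makes the weaker stage determine the score.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    type: str  # assumed values: "retriever" or "llm"
    metrics: dict = field(default_factory=dict)


@dataclass
class Trace:
    spans: list = field(default_factory=list)


def _average(values: list) -> Optional[float]:
    return sum(values) / len(values) if values else None


def rag_quality(trace: Trace) -> Optional[float]:
    """Combine retriever relevance and LLM adherence into one trace score."""
    relevance = _average([
        s.metrics["context_relevance"] for s in trace.spans
        if s.type == "retriever" and "context_relevance" in s.metrics
    ])
    adherence = _average([
        s.metrics["context_adherence"] for s in trace.spans
        if s.type == "llm" and "context_adherence" in s.metrics
    ])
    if relevance is None or adherence is None:
        return None  # one side of the pipeline has no score to combine
    return min(relevance, adherence)
```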
Specifying required metrics
The required_metrics parameter tells Galileo which metrics must be computed
before your composite metric runs. This ensures the metric values are available
when your scorer function executes.
You specify required metrics when creating your code-based custom metric:
- In the UI: Select metrics from the “Required Metrics” dropdown (see how)
- In the Python SDK: Pass the required_metrics parameter
Galileo preset metrics
For Galileo’s built-in metrics, use the GalileoMetrics enum. For example, you might select:
- GalileoMetrics.context_adherence
- GalileoMetrics.context_adherence_luna
- GalileoMetrics.correctness
Custom metrics
For your own custom metrics, reference them by name as strings. You can also mix custom metrics with Galileo preset metrics:
- "My Custom Metric" (string for custom metric)
- "Compliance Check" (string for custom metric)
- GalileoMetrics.output_pii (Galileo preset metric)
Accessing metric values
Once you’ve specified required metrics, access them through the step_object.metrics dictionary.
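A minimal sketch of reading required metrics inside a scorer. The `Step` class is a hypothetical stand-in, and both the scorer signature and the plain-string keys are assumptions; check the SDK reference for the exact key format.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical stand-in for the step object passed to a scorer.
    metrics: dict = field(default_factory=dict)


def scorer_fn(step_object: Step, **kwargs) -> float:
    """Average two required metrics read from step_object.metrics."""
    # Direct indexing is reasonable here because required metrics are
    # computed before the composite metric runs.
    adherence = step_object.metrics["context_adherence"]
    correctness = step_object.metrics["correctness"]
    return (adherence + correctness) / 2
```

For metrics that are not listed as required, `step_object.metrics.get(name)` is the safer access pattern, since the key may be absent.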
Complete example: multi-level session metric
This example demonstrates a comprehensive composite metric that aggregates scores from all hierarchy levels. Required metrics to select (in UI dropdown or SDK parameter):
- GalileoMetrics.conversation_quality
- GalileoMetrics.action_completion
- GalileoMetrics.agent_efficiency
- GalileoMetrics.action_completion_luna
- GalileoMetrics.action_advancement
- GalileoMetrics.context_adherence
- GalileoMetrics.context_relevance
- GalileoMetrics.tool_error_rate
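A minimal sketch of such a session-level metric, under stated assumptions: `Step` and its `children` attribute are hypothetical stand-ins for sessions, traces, and spans; metric keys are plain strings; values are numeric; and `tool_error_rate` is treated as lower-is-better and inverted. It collects each required metric wherever it appears in the hierarchy, averages per metric, then averages those results.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional

REQUIRED = [
    "conversation_quality", "action_completion", "agent_efficiency",
    "action_completion_luna", "action_advancement",
    "context_adherence", "context_relevance", "tool_error_rate",
]
LOWER_IS_BETTER = {"tool_error_rate"}  # assumption: an error rate in [0, 1]


@dataclass
class Step:
    # Hypothetical stand-in for a session, trace, or span.
    metrics: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(step: Step) -> Iterator[Step]:
    """Yield the step itself, then every descendant, depth first."""
    yield step
    for child in step.children:
        yield from walk(child)


def per_metric_averages(session: Step) -> dict:
    """Average each required metric over every step that reports it."""
    collected = {name: [] for name in REQUIRED}
    for step in walk(session):
        for name in REQUIRED:
            value = step.metrics.get(name)
            # Skip missing and non-numeric values (bool is excluded on purpose).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                collected[name].append(1 - value if name in LOWER_IS_BETTER else value)
    return {name: sum(v) / len(v) for name, v in collected.items() if v}


def session_score(session: Step) -> Optional[float]:
    """Unweighted mean of the per-metric averages, or None if nothing was found."""
    averages = per_metric_averages(session)
    return sum(averages.values()) / len(averages) if averages else None
```

The unweighted mean is only a starting point; per-metric weights may suit applications where one level of the hierarchy matters more than the others.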
Best practices
Be specific with required metrics
Only include metrics you actually use. This improves performance and makes your metric’s dependencies clear.
Use appropriate step types
Match your composite metric’s step type to where the required metrics exist:
- Session: Can access session, trace, and span metrics
- Trace: Can access trace and span metrics
- Span: Can only access metrics on that specific span
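The visibility rules above can be sketched with a toy hierarchy. The `Step` class and its `children` attribute are hypothetical stand-ins: a step sees its own metrics plus those of its descendants, and never those of its parents.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical stand-in for a session, trace, or span.
    metrics: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def visible_metric_names(step: Step) -> set:
    """Names of metrics on this step and on all of its descendants."""
    names = set(step.metrics)
    for child in step.children:
        names |= visible_metric_names(child)
    return names


span = Step({"context_adherence": 0.9})
trace = Step({"action_completion": 1.0}, [span])
session = Step({"conversation_quality": 0.8}, [trace])
```

In this toy model, `visible_metric_names(span)` contains only the span's own metric, while the trace and session see progressively more.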
Execution restrictions
Composite metrics depend on the successful completion of their required metrics:
- While any required metric is not yet final (e.g., queued or computing), the composite metric remains queued.
- If any required metric finishes without a successful final status (e.g., failed, not computed, or not applicable), the composite metric raises an error that includes the failed statuses of those required metrics.
- Metrics not listed in required_metrics do not affect the composite metric; only the required ones gate execution.
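The gating rules above can be modeled as a small function. This is an illustrative sketch, not Galileo's implementation: the `Status` enum, its member names, and the "queued"/"ready" return values are hypothetical.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical status names, loosely following the list above.
    QUEUED = "queued"
    COMPUTING = "computing"
    SUCCESS = "success"
    FAILED = "failed"
    NOT_COMPUTED = "not computed"
    NOT_APPLICABLE = "not applicable"


PENDING = {Status.QUEUED, Status.COMPUTING}


def composite_state(statuses: dict, required: list) -> str:
    """Decide whether a composite metric waits, runs, or errors."""
    # Only required metrics gate execution; every other metric is ignored.
    relevant = {name: statuses[name] for name in required}
    if any(status in PENDING for status in relevant.values()):
        return "queued"
    failed = {name: s.value for name, s in relevant.items() if s is not Status.SUCCESS}
    if failed:
        raise RuntimeError(f"Required metrics did not succeed: {failed}")
    return "ready"
```

The pending check comes first, so a composite metric with one failed and one still-computing dependency stays queued until every required metric is final.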
Creating composite metrics
Composite metrics can be created in two ways:
- Galileo Console UI: Use the custom code-based metrics editor and select required metrics from the “Required Metrics” dropdown
- Python SDK: Add the required_metrics parameter when creating code-based metrics
Composite metrics are only supported for code-based custom metrics. LLM-as-a-judge metrics do not support the required_metrics parameter.
Create composite metrics in the UI
Learn how to create composite metrics using the Galileo Console
Python SDK reference
View Python SDK documentation for metrics
Custom metrics overview
Learn about custom code-based metrics in Galileo