Learn how to create, register, and use custom metrics to evaluate your LLM applications
1. Navigate to the Metrics section.
2. Select the Code metric type.
3. Write your custom metric’s `scorer_fn` and `aggregator_fn` functions as described below.
4. Save your metric.
The scorer function (`scorer_fn`) should accept `**kwargs` to ensure forward compatibility. Here’s a complete example that measures the difference in length between the output and ground truth:
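A minimal sketch of such a scorer, paired with a simple averaging aggregator, might look like the following. The `target` key, the `scores` keyword argument, the dictionary return shape of `aggregator_fn`, and the metric label are all assumptions made for illustration, so check them against your own dataset and the SDK reference before relying on them.

```python
from typing import Any, Dict, List


def scorer_fn(*, index: int, node_input: str, node_output: str,
              dataset_variables: Dict[str, str], **kwargs: Any) -> int:
    """Score one row: absolute length difference between output and ground truth."""
    # "target" is a hypothetical key; use whichever dataset column holds your ground truth.
    ground_truth = dataset_variables.get("target", "")
    return abs(len(node_output) - len(ground_truth))


def aggregator_fn(*, scores: List[int]) -> Dict[str, float]:
    """Summarize per-row scores (keyword name and dict return shape are assumptions)."""
    if not scores:
        return {"Average Length Difference": 0.0}
    return {"Average Length Difference": sum(scores) / len(scores)}


# Local sanity check with made-up rows; unused keyword arguments such as
# node_name are absorbed by **kwargs.
rows = [
    {"node_output": "Paris", "dataset_variables": {"target": "Paris, France"}},
    {"node_output": "42", "dataset_variables": {"target": "42"}},
]
scores = [
    scorer_fn(index=i, node_input="", node_name="llm", **row)
    for i, row in enumerate(rows)
]
summary = aggregator_fn(scores=scores)
```

Running both functions locally on a few hand-written rows, as at the bottom of the sketch, is a cheap way to catch mistakes before saving the metric.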
- `index`: Row index in the dataset
- `node_input`: Input to the node
- `node_output`: Output from the node
- `node_name`, `node_type`, `node_id`, `tools`: Workflow/chain-specific parameters
- `dataset_variables`: Key-value pairs from the dataset (includes ground truth)

The aggregator function (`aggregator_fn`)
(`float`).

`scorer_fn` as a dictionary:
The Scorer receives a `Span` or `Trace` containing the LLM input and output, and computes a score. The exact measurement is up to you: for example, you might measure the length of the output or rate it based on the presence or absence of specific words.
If the Scorer returns a `str`, the Aggregator will be called with a `list[str]`. The Aggregator’s return value can also be any type (e.g., `str`, `bool`, `int`), depending on how you want to represent the final metric.
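To make the type relationship concrete, here is a small sketch in which the Scorer returns a `str` label and the Aggregator receives a `list[str]`. The `FakeSpan` class and its `input` and `output` attributes are hypothetical stand-ins for the real `Span` or `Trace` objects, whose fields may differ; the function names are likewise illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FakeSpan:
    """Hypothetical stand-in for a Span/Trace; real objects may expose other fields."""
    input: str
    output: str


def scorer(step: FakeSpan) -> str:
    """Label one output by the presence or absence of an apology."""
    return "apologetic" if "sorry" in step.output.lower() else "neutral"


def aggregator(scores: list[str]) -> str:
    """Reduce the per-row labels to the most common one (expects a non-empty list)."""
    return Counter(scores).most_common(1)[0][0]


spans = [
    FakeSpan("Where is my order?", "Sorry, it is delayed."),
    FakeSpan("What is 2 + 2?", "4"),
    FakeSpan("Cancel my plan.", "I'm sorry to see you go."),
]
labels = [scorer(s) for s in spans]
final_label = aggregator(labels)
```

Here the Aggregator also returns a `str`, but it could just as well return a `bool` or an `int`, for instance the count of apologetic responses.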
`LocalMetricConfig[type]`: A typed callable provided by Galileo’s Python SDK that combines your Scorer and Aggregator into a custom metric.
`type` should match the type returned by your Aggregator. For example, if your Scorer returns `bool` values, you would use `LocalMetricConfig[bool](…)`, and your Aggregator must accept a `list[bool]` and return a `bool`.

Once you have your `LocalMetricConfig`, running the experiment is as simple as calling `run_experiment`. The results appear alongside Galileo’s built-in metrics, so you can compare, visualize, and analyze everything in one place.
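The way a typed config ties a Scorer and an Aggregator together can be sketched without the SDK. The `SketchMetricConfig` class below is an illustrative stand-in written for this example, not Galileo’s `LocalMetricConfig`; its field names, its `evaluate` method, and the choice to score plain output strings rather than `Span` or `Trace` objects are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class SketchMetricConfig(Generic[T]):
    """Illustrative stand-in, NOT Galileo's LocalMetricConfig."""
    name: str
    scorer_fn: Callable[[str], T]          # scores a single output
    aggregator_fn: Callable[[List[T]], T]  # reduces all scores to one value

    def evaluate(self, outputs: List[str]) -> T:
        """Score every output, then aggregate the scores."""
        scores = [self.scorer_fn(output) for output in outputs]
        return self.aggregator_fn(scores)


def is_concise(output: str) -> bool:
    """Scorer: True when the output has at most five words."""
    return len(output.split()) <= 5


def all_concise(scores: List[bool]) -> bool:
    """Aggregator: accepts a list[bool] and returns a bool."""
    return all(scores)


# The type parameter matches what the Scorer and Aggregator return.
metric = SketchMetricConfig[bool](
    name="All Concise",
    scorer_fn=is_concise,
    aggregator_fn=all_concise,
)
result = metric.evaluate(["Paris", "The answer is forty-two"])
```

With the real SDK, the equivalent object would be passed to `run_experiment` rather than evaluated by hand; consult the SDK reference for the exact arguments.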
With Local Metrics, you have full control over how you measure LLM behavior, which enables deeper insights and more targeted evaluations for your AI applications.
Feature | Registered Custom Metrics | Local Metrics |
---|---|---|
Creation | Python client, activated via UI | Python client only |
Sharing | Organization-wide | Current project only |
Environment | Server-side | Local Python environment |
Libraries | Limited to Galileo environment | Any available library |
Resources | Restricted by Galileo | Local resources |