Log Stream Metrics

Once traces are flowing into a Log stream, you can configure the metrics you want to evaluate. Metrics, including custom metrics, are created and managed at the organization level, and are then used to evaluate traces at the Log stream level.

Configure metrics for a Log stream

Configure metrics through the console

To configure metrics, open your Log stream and select the Configure Metrics button.

You will need at least one session in your Log stream to be able to configure metrics.

The configure metrics button on the sessions tab

This will load the Configure metrics pane.

The configure metrics pane with the action advancement metric turned on and the switch highlighted, and the save and close button highlighted

From here you can filter and search for metrics, then turn on the relevant ones for your Log stream. Once you have the metrics you need turned on, select the Save and close button to save your settings. You can also create new custom metrics from this pane, either using an LLM as a judge, or in code, then add them to your Log stream.

Configure metrics in code

You can also configure metrics for a Log stream using the Galileo SDKs.

from galileo import GalileoMetrics
from galileo.log_streams import enable_metrics

# Enable metrics
enable_metrics(project_name="MyProject",
               log_stream_name="MyLogStream",
               metrics=[GalileoMetrics.context_adherence])

Set MyProject to your project name, and MyLogStream to your Log stream name. You can then pass in either the relevant metric enum, or the name of a custom metric. This function enables only the metrics you specify for the Log stream: any other metrics that were enabled before the call are disabled.
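Because the call replaces the set of enabled metrics rather than adding to it, one way to add a metric without disabling the others is to build the complete list first. The following is a minimal sketch of that idea. `merge_metrics` is a hypothetical helper, not part of the Galileo SDK, and it assumes you keep track of the currently enabled metrics yourself.

```python
from typing import Hashable, Iterable, List


def merge_metrics(current: Iterable[Hashable],
                  additional: Iterable[Hashable]) -> List[Hashable]:
    """Return current + additional with duplicates removed, keeping order.

    Hypothetical helper for illustration; not part of the Galileo SDK.
    """
    merged: List[Hashable] = []
    seen = set()
    for metric in list(current) + list(additional):
        if metric not in seen:
            seen.add(metric)
            merged.append(metric)
    return merged


# Metric names here are illustrative strings; enum members work the same way.
all_metrics = merge_metrics(
    ["context_adherence"],
    ["my_custom_metric", "context_adherence"],
)
print(all_metrics)
```

You would then pass the merged list as the metrics argument of enable_metrics, so that the previously enabled metrics stay on.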

Metric sampling

Every evaluation interacts with an LLM (unless you are only using custom code-based metrics), and therefore has an associated cost. While your application is in development you will probably want to evaluate every trace that is captured. Once it is in production and scaling to hundreds, thousands, or even millions of users, you will most likely want to reduce evaluation costs by evaluating only a small sample of the captured traces.

You can configure metric sampling at the Log stream level. To configure metric sampling rate rules, select the Metric Sampling button from the Configure metrics pane.

From here you can configure the metric sampling rates. These rates can be applied to all metrics (including custom code metrics and Luna-2 metrics), or LLM-as-a-judge metrics only. Set the sampling rate you want, then select the Save button.

When you configure the sample rates, all traces are still captured and visible in Galileo, but metrics are only evaluated for the sampled share of those traces. For example, if you set the sampling to 10% and create 100 traces, then all 100 traces will be visible in Galileo, with metrics evaluated for just 10 of them.
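The arithmetic behind this example can be sketched as follows, which may help when estimating how many evaluations a given rate produces at production volumes. The function name is hypothetical, and the sketch assumes the rate translates directly into a proportion of traces; it does not model how Galileo selects which traces to evaluate.

```python
def expected_evaluated_traces(total_traces: int, sample_percent: float) -> float:
    """Estimate how many traces have metrics evaluated at a given sampling rate.

    Hypothetical helper for illustration; assumes the rate is a simple proportion.
    """
    if not 0 <= sample_percent <= 100:
        raise ValueError("sample_percent must be between 0 and 100")
    return total_traces * sample_percent / 100


# The example from the text: 100 traces at a 10% sampling rate.
print(expected_evaluated_traces(100, 10))  # 10.0
```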

Metric sampling rates

The most basic way to set sampling rates is by a percentage for all incoming logs. When you set a percentage, all traces are stored and available in Galileo, but only that percentage of traces will be evaluated. A trace is either evaluated for all configured metrics, or not evaluated at all.

You can configure sampling at a more granular level by adding rules based on metadata set at the trace level. For example, if you are onboarding a new customer and want to evaluate all of their logs during the onboarding process, you can add the customer name to your metadata, and set a rule to evaluate 100% of traces that have that customer name in their metadata.

The metric sampling dialog showing 100% sampling if customer is set to important customer, otherwise 10%

This metadata is set when you start a trace with the Galileo logger.

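# `logger` is an existing Galileo logger instance, and
# `user_input` is the input for this trace.
logger.start_trace(
    name="Conversation step",
    input=user_input,
    metadata={"customer": "ImportantCustomer"},
)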

These rules are applied top-down. The first rule is checked, and if the trace metadata matches, that rule's percentage is used. If it does not match, the next rule is checked, and so on. If no rules match, the default sampling rate for all traces is used.
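The top-down, first-match behavior described above can be sketched as a small function. This is an illustration only, not Galileo's implementation: the rule representation and the `resolve_sample_rate` name are hypothetical, and the sketch assumes rules match on exact equality of a single metadata value.

```python
from typing import Dict, List, Tuple

# A rule is (metadata key, expected value, sampling percentage).
# This shape is an assumption made for the sketch.
Rule = Tuple[str, str, float]


def resolve_sample_rate(metadata: Dict[str, str],
                        rules: List[Rule],
                        default_percent: float) -> float:
    """Return the percentage of the first matching rule, else the default."""
    for key, expected, percent in rules:
        if metadata.get(key) == expected:
            return percent  # first match wins; later rules are not checked
    return default_percent


rules = [("customer", "ImportantCustomer", 100.0)]

important = resolve_sample_rate({"customer": "ImportantCustomer"}, rules, 10.0)
other = resolve_sample_rate({"customer": "SomeoneElse"}, rules, 10.0)
print(important, other)  # 100.0 10.0
```

Because the first match wins, rule order matters: a more specific rule needs to sit above a broader one that would also match the same trace.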

Metric filters

Sometimes metrics only make sense for certain spans. For example, if you have a custom metric for verifying the final response to a user from a multi-agent system with multiple LLM spans, you might only want to calculate the metric on the final LLM span that summarizes the results from all the agents.

You can filter the spans that a metric is calculated for, based on the span name or span metadata. Metric filtering is configured at the project level, and the filter applies to all Log streams in the project. To configure metric filters, select Apply filter from the menu for the metric you want to filter on the Configure metrics pane.

Use the Add Condition button to add a condition based on the span name or span metadata, for the span type that the metric evaluates.

For metadata, set the field, the comparison operator, and the value
For the span name, set the comparison operator and the value

You can set multiple conditions. These are combined with a logical AND, so a span must satisfy both condition 1 and condition 2 for the metric to be calculated on it.

The apply filter dialog with a metadata filter for agent is equal to summary agent
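To make the AND semantics concrete, here is a minimal sketch of how a set of conditions could decide whether a span is evaluated. It is not Galileo's implementation: the `Span` and `Condition` classes are hypothetical, and the two comparison operators used (equals and contains) are assumptions, since the available operators are not listed here.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Span:
    """Hypothetical stand-in for a logged span."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Condition:
    """One filter condition. metadata_field=None means 'compare the span name'."""
    metadata_field: Optional[str]
    op: Callable[[str, str], bool]
    value: str

    def matches(self, span: Span) -> bool:
        if self.metadata_field is None:
            actual = span.name
        else:
            actual = span.metadata.get(self.metadata_field)
            if actual is None:
                return False  # a missing metadata field never matches
        return self.op(actual, self.value)


def span_passes_filter(span: Span, conditions: List[Condition]) -> bool:
    """All conditions must match (logical AND)."""
    return all(condition.matches(span) for condition in conditions)


conditions = [
    Condition("agent", operator.eq, "summary agent"),  # metadata: agent equals ...
    Condition(None, operator.contains, "Summar"),      # span name contains ...
]

summary_span = Span("Summarize results", {"agent": "summary agent"})
search_span = Span("Summarize search", {"agent": "search agent"})
print(span_passes_filter(summary_span, conditions))
print(span_passes_filter(search_span, conditions))
```

In this sketch the second span matches the name condition but not the metadata condition, so it is filtered out.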

Next steps

Metrics Overview

Explore Galileo’s comprehensive metrics framework for evaluating and improving AI system performance across multiple dimensions.

Custom LLM-as-a-Judge Metrics

Learn how to create evaluation metrics using LLMs to judge the quality of responses.

Custom Code-Based Metrics

Learn how to create, register, and use custom metrics to evaluate your LLM applications.
