Galileo’s Annotations feature helps teams label, categorize, and analyze model outputs more effectively. Annotations provide a structured way to add human-in-the-loop feedback, highlight model behavior patterns, and guide future improvements to datasets and models.

Overview: What Are Annotations?

Annotations are tags and metadata that can be used to label predictions, inputs, or other artifacts during model development and evaluation. They help organize, classify, and track model behavior during experimentation and production.

You can see the tags and metadata in the Log Streams section of the Galileo Console.

These labels can represent:

  • Model issues (e.g., label_mismatch, low_confidence)
  • Observations from human reviewers (e.g., grammar_error, ambiguous_input)
  • Dataset quality notes (e.g., duplicate_sample, out_of_distribution)
  • Business-specific labels (e.g., escalation_required, VIP_customer)

Why Use Annotations?

Annotations are particularly useful for:

  • Team collaboration: Share insights across teams with consistent tagging systems.
  • Tracking model behavior: Label samples where the model fails or performs below expectations.
  • Data quality analysis: Flag problematic or noisy inputs for later review or exclusion.
  • Model evaluation: Filter and analyze annotated data to evaluate model performance on specific slices.
  • Automated improvement: Create feedback loops between model outputs and annotation-guided dataset refinement.

Tags & Metadata

Annotations are added to logged Traces and Spans in the form of tags and metadata.

Tags - short, flat labels (strings) you assign to a trace or span to make them easy to group and filter.

  • No hard limit on tags per trace/span, though ≈ 50 is a practical ceiling
  • Case‑sensitive strings ≤ 50 chars each
  • Ideal for boolean‑style filters (e.g., “show me all traces tagged physics”)

Metadata - key‑value dictionaries that travel with a trace or a span, perfect for structured information like user IDs, experiment hashes, timestamps, or numeric metrics.

  • Keys and values are strings ≤ 256 chars
  • Appears in Log Streams as new columns
  • Ideal for structured attributes you can filter, group, and aggregate

Adding Annotations to Traces

To add annotations to your Traces, initialize the Galileo Logger, and then include tags and metadata when you start your Trace.

# Include GalileoLogger in imports
from galileo import GalileoLogger

# Initialize logger (with no arguments, connection settings are read
# from the environment, e.g. the GALILEO_API_KEY variable)
logger = GalileoLogger()

# Initialize a new Trace with tags and metadata
trace = logger.start_trace(
    input="Explain the following topic succinctly: Newton's First Law",
    tags=["newton", "test", "new-version"],
    metadata={"experimentNumber": "1",
              "promptVersion": "0.0.1",
              "field": "physics"}
)

Adding Annotations to Spans

Attach annotations to your Spans by including tags and metadata when you add your Span.

# Define the prompt referenced by the span input
prompt = "Explain the following topic succinctly: Newton's First Law"

# Include tags and metadata when adding a new Span
logger.add_llm_span(
    input=[{"role": "system", "content": prompt}],
    output="Model response",
    model="gpt-4o",
    tags=["newton", "test", "new-version"],
    metadata={"experimentNumber": "1",
              "promptVersion": "0.0.1",
              "field": "physics"}
)
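
Manually started Traces stay open until they are concluded, and the logger batches records before sending them. A minimal way to finish the example above, assuming the conclude and flush methods of the Python SDK's GalileoLogger:

# Conclude the trace with its final output, then flush the buffered
# logs to Galileo so the annotations appear in the Log Stream
logger.conclude(output="Model response")
logger.flush()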

Annotations in the Galileo Console

Tags and metadata appear within the selected Project and Log Stream in the Galileo Console.

You can view the tags and metadata attached to Traces and Spans in their “Parameters” section.

Additionally, metadata appears as new columns in the Log Stream.

Best Practices

  • Tag early. Start the trace yourself and pass tags before the first LLM call; otherwise auto‑spans may end up in an untagged trace.
  • Keep tags coarse‑grained. A handful of well‑chosen tags beat hundreds of one‑offs.
  • Standardize metadata keys. Stick to a naming convention (experiment, user_id, etc.) so dashboards stay tidy; see the sketch after this list.
  • Avoid sensitive data. Never put PII or keys in either tags or metadata; they become queryable in the UI.
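
As a sketch of the last two practices, here is a hypothetical helper that enforces a team's metadata convention before logging; ALLOWED_KEYS and make_metadata are illustrative names, not part of the Galileo SDK:

# Hypothetical convention helper (not part of the Galileo SDK).
# Centralizing the allowed keys keeps Log Stream columns consistent
# and gives one place to screen out sensitive fields.
ALLOWED_KEYS = {"experimentNumber", "promptVersion", "field", "user_id"}

def make_metadata(**kwargs):
    """Validate metadata against the team convention before logging."""
    unknown = set(kwargs) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Non-standard metadata keys: {sorted(unknown)}")
    # Metadata keys and values must be strings of at most 256 characters
    return {key: str(value)[:256] for key, value in kwargs.items()}

trace = logger.start_trace(
    input="Explain the following topic succinctly: Newton's First Law",
    tags=["newton", "test", "new-version"],
    metadata=make_metadata(experimentNumber="2", promptVersion="0.0.2"),
)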

Next Steps

Get started using Annotations with the Adding Annotations step-by-step guide.