Experiments in Galileo allow you to evaluate and compare different prompts, models, and configurations using datasets and prompt templates, and measure their performance using various metrics. This helps you identify the best approach for your specific use case.

You can run experiments against dedicated experiment code, for example in a Jupyter notebook for prompt engineering and model evaluation. You can also use experiments to run your application code under controlled conditions for evaluation-driven development, for example by running them in your CI/CD pipeline.

For a list of supported metrics, see the Metrics Reference Guide.

Logging experiments

Experiments belong to projects, with one project containing many experiments. Each experiment has a single log stream with multiple traces. When you use a dataset with an experiment, each row in the dataset is logged as a separate trace in the experiment’s log stream.
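
The hierarchy described above can be pictured with a small sketch. The class and function names here (Project, Experiment, Trace, log_dataset_rows) are hypothetical and exist only for illustration; they are not Galileo SDK types, and the "input" column name is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    # One trace corresponds to one dataset row in this illustration.
    row: dict


@dataclass
class Experiment:
    name: str
    # Stands in for the experiment's single log stream.
    traces: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    # One project contains many experiments, keyed by name.
    experiments: dict = field(default_factory=dict)


def log_dataset_rows(experiment, rows):
    """Append one trace per dataset row to the experiment's log stream."""
    for row in rows:
        experiment.traces.append(Trace(row=row))
    return experiment


project = Project("my-project")
experiment = Experiment("my-experiment")
project.experiments[experiment.name] = experiment
log_dataset_rows(experiment, [{"input": "first row"}, {"input": "second row"}])
```

In this sketch, a dataset with two rows produces two traces in the experiment's log stream.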

Before you start planning your experiments, make sure you have created the project you want to run them in.

Initial setup

To log experiments to Galileo, you need to configure the SDK with an API key, optionally a URL for a custom deployment, and the name of the project to log the experiments to.

API key

To get started running experiments with Galileo, you need to configure your API key, and optionally the URL of your Galileo deployment if you are using a custom-hosted or self-deployed version. These are set as environment variables. In development, you can use a .env file for these. For a production deployment, make sure you configure them correctly for your deployment platform.

If you are using the free version of Galileo, there is no need to set the GALILEO_CONSOLE_URL environment variable.

Environment variable | Description
GALILEO_API_KEY | Your Galileo API key.
GALILEO_CONSOLE_URL | For custom Galileo deployments only, set this to the URL of your Galileo console to log to. If this is not set, it will default to the hosted Galileo version at app.galileo.ai.
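
As a minimal sketch, a script could check these variables before starting an experiment so that a missing key fails early with a clear message. The helper name galileo_settings is hypothetical and is not part of the Galileo SDK, and the https:// scheme on the default URL is an assumption.

```python
import os

# Hosted Galileo console; the https:// scheme is an assumption for this sketch.
DEFAULT_CONSOLE_URL = "https://app.galileo.ai"


def galileo_settings(env=None):
    """Read the Galileo connection settings from environment variables.

    Hypothetical helper for illustration only, not a Galileo SDK function.
    """
    env = os.environ if env is None else env

    api_key = env.get("GALILEO_API_KEY")
    if not api_key:
        raise RuntimeError("GALILEO_API_KEY is not set")

    # GALILEO_CONSOLE_URL is optional; fall back to the hosted version.
    console_url = env.get("GALILEO_CONSOLE_URL") or DEFAULT_CONSOLE_URL
    return {"api_key": api_key, "console_url": console_url}
```

Calling galileo_settings() with no arguments reads from the process environment. Passing a dictionary instead makes the check easy to exercise in tests.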

Project

The project can be configured as an environment variable, or directly in code.

Environment variable | Description
GALILEO_PROJECT | The Galileo project to log to. If this is not set, you will need to pass the project name in code.

You can also set the project when running the experiment by passing it to the run_experiment call.

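from galileo.experiments import run_experiment
from galileo.schema.metrics import GalileoScorers

# dataset and prompt_template are assumed to be defined earlier
results = run_experiment(
    "my-experiment",
    dataset=dataset,
    prompt_template=prompt_template,
    metrics=[GalileoScorers.correctness],
    # Set the project name here
    project="my-project",
)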
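
If a script supports both options, one way to combine them is a small helper that prefers a name passed in code and otherwise falls back to the GALILEO_PROJECT environment variable. This is a sketch only: resolve_project is a hypothetical helper, and the precedence it applies is an assumption for illustration, not documented SDK behavior.

```python
import os


def resolve_project(project=None, env=None):
    """Pick the project name to log to.

    Hypothetical helper: a name passed in code wins, otherwise the
    GALILEO_PROJECT environment variable is used. This precedence is an
    assumption of the sketch, not a statement about the Galileo SDK.
    """
    env = os.environ if env is None else env
    name = project or env.get("GALILEO_PROJECT")
    if not name:
        raise ValueError("Set GALILEO_PROJECT or pass a project name in code")
    return name
```

The resolved name could then be passed as the project argument when running the experiment.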

Next Steps

Experiments SDK

Metrics