Experiments let you evaluate prompts, models, and your application code against metrics of your choice, using well-defined inputs.

Run an experiment with the UI

In the Galileo console UI, select the Create Experiment button to add an experiment to a project. You can use a sample dataset and sample prompt to create your first experiment. If you don't already have an LLM integration (for example, with OpenAI), a Configure integration link appears so you can add a valid integration for your prompt. After successfully creating an experiment, you can view the results from the Experiments page of your Galileo project.

Run an experiment with code

Prerequisite: Configure an LLM integration

To run an experiment using a prompt and a dataset, you need to set up an LLM integration. An integration is also required to evaluate LLM outputs with metrics.
1. Navigate to the Integrations page

In the Galileo console UI, navigate to the LLM integrations page by opening the user menu in the bottom-left corner, then selecting Integrations.
2. Add an integration

Locate the LLM provider you are using (or specify a custom integration), then select the +Add Integration button.
3. Add settings

Specify the settings for your integration (such as an API key), then select Save changes.

Example experiment with code

Below is a step-by-step guide to running an experiment with code.
1. Install dependencies

Install the Galileo SDK and the python-dotenv package using the following command in your terminal:
pip install galileo python-dotenv
2. Set up your environment variables

Create a .env file in your project folder, and set:
GALILEO_API_KEY="your-galileo-api-key"
GALILEO_PROJECT="your-galileo-project-name"
# Provide the console url below if you are not using app.galileo.ai
# GALILEO_CONSOLE_URL="your-galileo-console-url"
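As a rough illustration of what python-dotenv does with this file, the sketch below parses KEY="value" lines into os.environ. This is a simplified stand-in for load_dotenv, not the real implementation; like the real function's default behavior, it does not override variables that are already set.

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal stand-in for dotenv.load_dotenv(): parse KEY="value" lines."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            # setdefault mirrors load_dotenv's default: existing vars win
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

# Example usage with the .env contents from this step:
with open(".env", "w") as f:
    f.write('GALILEO_API_KEY="your-galileo-api-key"\n')
load_env_file()
print(os.environ["GALILEO_API_KEY"])  # your-galileo-api-key
```

In practice, just call load_dotenv() as shown in the application code below; this sketch only clarifies what it reads from the file.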
3. Create your application code

Create a file called app.py and add the following Python code:
import os

from galileo import GalileoMetrics, Message, MessageRole
from galileo.config import GalileoPythonConfig
from galileo.datasets import create_dataset, get_dataset
from galileo.experiments import run_experiment
from galileo.prompts import create_prompt, get_prompt
from galileo.resources.models.prompt_run_settings import PromptRunSettings

# Load the environment variables
from dotenv import load_dotenv
load_dotenv()

# Create a prompt template, or load it if it already exists
prompt = get_prompt(name="My Prompt")
if not prompt:
    prompt = create_prompt(
        name="My Prompt",
        template=[
            Message(
                role=MessageRole.system,
                content="""
Galileo is the fastest way to ship reliable apps.
Galileo brings automation and insight to AI evaluations so you can
ship with confidence.
""",
            ),
            Message(role=MessageRole.user, content="{{input}}"),
        ],
    )

# Create a dataset, or load it if it already exists
dataset = get_dataset(name="My Dataset")
if not dataset:
    dataset = create_dataset(
        "My Dataset",
        content=[
            {"input": "What is Galileo?"},
            {"input": "What is Copernicus?"}
        ],
    )

# Run the experiment
experiment = run_experiment(
    experiment_name="My Experiment",
    prompt_template=prompt,
    dataset=dataset,
    metrics=[GalileoMetrics.context_adherence],
    project=os.environ.get("GALILEO_PROJECT"),
    prompt_settings=PromptRunSettings(model_alias="gpt-5-mini"),
)

# Show Galileo information
config = GalileoPythonConfig.get()
prompt_url = f"{config.console_url}prompts/{prompt.id}"
dataset_url = f"{config.console_url}datasets/{dataset.dataset.id}"

print()
print("πŸš€ GALILEO LOG INFORMATION:")
print(f"πŸ”— Prompt     : {prompt_url}")
print(f"πŸ”— Dataset    : {dataset_url}")
print(f"πŸ”— Experiment : {experiment['link']}")
This code defaults to using gpt-5-mini. To use a different model, update the model_alias in the PromptRunSettings passed to run_experiment.

This code creates a prompt template containing a system prompt and a user prompt; the user prompt uses a mustache placeholder ({{input}}) to inject each row of the dataset. It also creates a dataset with two rows. It then runs an experiment over the dataset, measuring context adherence. If the prompt or dataset already exists, it is loaded instead of being recreated.
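To make the templating concrete, here is a minimal sketch of how a mustache placeholder like {{input}} is filled in from a dataset row. This is an illustration of the idea only, not Galileo's actual renderer:

```python
import re

def render(template: str, row: dict) -> str:
    # Replace each {{name}} placeholder with the matching dataset column.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(row[m.group(1)]), template)

# Each dataset row produces one rendered user prompt:
for row in [{"input": "What is Galileo?"}, {"input": "What is Copernicus?"}]:
    print(render("{{input}}", row))
# What is Galileo?
# What is Copernicus?
```

Because the experiment runs once per dataset row, the two rows above become two separate traces in the experiment results.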
4. Run your application

Run your application using the following command in your terminal:
python app.py
5. View the results in your terminal

The output will look similar to this:

Experiment My Experiment has started and is currently processing.
Results will be available at https://app.galileo.ai/project/.../experiments/...

πŸš€ GALILEO LOG INFORMATION:
πŸ”— Prompt     : https://app.galileo.ai/prompts/...
πŸ”— Dataset    : https://app.galileo.ai/datasets/...
πŸ”— Experiment : https://app.galileo.ai/project/.../experiments/...
6. See the experiment in Galileo

Open the experiment in the Galileo console using the URL printed to your terminal. You will see the logged experiment with two rows, one for each entry in the dataset. Select a trace to see more details, including an explanation of the metric score.

Troubleshooting

  • I need a Galileo API key: Head to app.galileo.ai and sign up, then go to the API keys page to generate a new API key.
  • What's my project name?: The project name was set when you created your project. If you haven't created a project yet, open Galileo and select the New Project button.

Next steps

Create a dataset

Learn how to create and manage datasets in Galileo.

Run experiments in playgrounds

Learn about running experiments in the Galileo console using playgrounds and datasets.

Run experiments with code

Learn how to run experiments in Galileo.

Compare experiments

Learn how to compare experiments in Galileo.