Datasets let you store and reuse well-defined data in experiments. Datasets are stored and versioned in Galileo, and are available to experiments run both in the console and in code.

Dataset fields

Each record in a Galileo dataset can have three top-level fields:

  1. input - Input variables that can be passed to your application to recreate a test case.
  2. output - Reference outputs used to evaluate your application. These can serve as the ground truth for BLEU, ROUGE, and Ground Truth Adherence metrics, or as reference outputs for manual review.
  3. metadata - Additional data you can use to filter or group your dataset.
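
As a rough illustration of this record shape, the sketch below checks a list of records locally before they are uploaded. The find_invalid_records helper and the ALLOWED_FIELDS constant are hypothetical and not part of the Galileo SDK; the sketch assumes only that records are plain dictionaries limited to the three fields listed above.

```python
# Hypothetical local check; not part of the Galileo SDK.
ALLOWED_FIELDS = {"input", "output", "metadata"}


def find_invalid_records(records):
    """Return (index, unexpected_fields) pairs for records with unknown top-level fields."""
    problems = []
    for index, record in enumerate(records):
        unexpected = sorted(set(record) - ALLOWED_FIELDS)
        if unexpected:
            problems.append((index, unexpected))
    return problems


records = [
    {"input": "Which continent is Spain in?", "output": "Europe"},
    {"input": "Which continent is Japan in?", "expected": "Asia"},
]
print(find_invalid_records(records))  # [(1, ['expected'])]
```

A check like this catches misspelled or misnamed fields, such as "expected" instead of "output", before the data reaches an experiment.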

Create Datasets

When you create a dataset, it is uploaded to Galileo and becomes available to future experiments. Dataset names must be unique, and datasets are available to all projects across your organization.

from galileo.datasets import create_dataset

# Create a dataset with test data
test_data = [
    {
        "input": "Which continent is Spain in?",
        "output": "Europe",
    },
    {
        "input": "Which continent is Japan in?",
        "output": "Asia",
    },
]

dataset = create_dataset(
    name="countries",
    content=test_data
)

See the create_dataset Python SDK docs or createDataset TypeScript SDK docs for more details.
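
Test data often lives in a file rather than in code. The sketch below shows one way to build the list of records from CSV data using only the Python standard library. The rows_from_csv helper, the column names, and the countries.csv file name are assumptions made for illustration. The resulting list has the same shape as test_data above, so it could be passed as the content argument of create_dataset.

```python
import csv
import io


def rows_from_csv(csv_file):
    """Turn CSV rows with 'input' and 'output' columns into dataset records."""
    reader = csv.DictReader(csv_file)
    return [{"input": row["input"], "output": row["output"]} for row in reader]


# In practice csv_file might come from open("countries.csv", newline="");
# an in-memory file keeps this sketch self-contained.
sample = io.StringIO(
    "input,output\n"
    "Which continent is Spain in?,Europe\n"
    "Which continent is Japan in?,Asia\n"
)
test_data = rows_from_csv(sample)
print(test_data)
```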

Get Existing Datasets

Once a dataset has been created in Galileo, you can retrieve it to use in your experiments by name or ID.

from galileo.datasets import get_dataset

# Get a dataset by name
dataset = get_dataset(
    name="countries"
)

# Get a dataset by ID
dataset = get_dataset(
    id="dataset-id"
)

# Get its content
dataset.get_content()

See the get_dataset Python SDK docs or getDataset TypeScript SDK docs for more details.

Add Rows to Existing Datasets

from galileo.datasets import get_dataset

# Get an existing dataset
dataset = get_dataset(
    name="countries"
)

# Add new rows to the dataset
dataset.add_rows([
    {
        "input": "Which continent is Morocco in?",
        "output": "Africa",
    },
    {
        "input": "Which continent is Australia in?",
        "output": "Oceania",
    },
])

See the add_rows Python SDK docs for more details.
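
When rows are appended repeatedly, it can help to drop candidates that are already present. The sketch below is a local, SDK-independent filter. The rows_not_in helper is hypothetical, and it assumes that existing rows are available as a list of dictionaries with an "input" key; the exact shape returned by get_content is not covered here.

```python
def rows_not_in(existing_rows, candidate_rows):
    """Return candidate rows whose 'input' is not already present, keeping order."""
    seen = {row["input"] for row in existing_rows}
    fresh = []
    for row in candidate_rows:
        if row["input"] not in seen:
            fresh.append(row)
            seen.add(row["input"])
    return fresh


existing = [{"input": "Which continent is Spain in?", "output": "Europe"}]
candidates = [
    {"input": "Which continent is Spain in?", "output": "Europe"},
    {"input": "Which continent is Morocco in?", "output": "Africa"},
]
new_rows = rows_not_in(existing, candidates)
print(new_rows)  # [{'input': 'Which continent is Morocco in?', 'output': 'Africa'}]
```

The filtered list could then be passed to add_rows in place of the full candidate list.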

List Datasets

You can retrieve all the datasets for a project.

from galileo.datasets import list_datasets

# List all datasets in a project
datasets = list_datasets()

# List datasets with a custom limit
datasets = list_datasets(
    limit=50,
)

See the list_datasets Python SDK docs or getDatasets TypeScript SDK docs for more details.

Delete Datasets

If a dataset is no longer needed, you can delete it by name or ID.

from galileo.datasets import delete_dataset

# Delete a dataset by name
delete_dataset(name="countries")

# Delete a dataset by ID
delete_dataset(id="dataset-id")

See the delete_dataset Python SDK docs or deleteDataset TypeScript SDK docs for more details.

Work with Dataset Versions

Galileo automatically creates new versions of datasets when they are modified. You can access different versions by getting the dataset history.

from galileo.datasets import get_dataset_version_history

# Get the version history
history = get_dataset_version_history(
    dataset_name="countries"
)

# List out the rows added with each version
for version in history.versions:
    print(f"""
    Version index: {version.version_index},
    rows added: {version.rows_added}
    """)

See the get_dataset_version_history Python SDK docs for more details.

Use Datasets in Experiments

Datasets are primarily used for running experiments to evaluate the performance of your LLM applications:

from galileo.datasets import get_dataset
from galileo.experiments import run_experiment
from galileo.prompts import get_prompt_template
from galileo.schema.metrics import GalileoScorers

# Get an existing dataset
dataset = get_dataset(
    name="countries"
)

# Get an existing prompt template
prompt_template = get_prompt_template(
    project="my-project",
    name="geography-prompt"
)

# Run an experiment with the dataset and prompt
results = run_experiment(
    "geography-experiment",
    dataset=dataset,
    prompt_template=prompt_template,
    metrics=[GalileoScorers.Completeness],
    project="my-project",
)
