# Experiments

Run experiments with multiple datapoints
Experiments in Galileo allow you to evaluate and compare different prompts, models, and configurations using datasets. This helps you identify the best approach for your specific use case.
## Running an Experiment with a Prompt Template
The simplest way to get started is by using a prompt template.
- If you have an existing prompt template, you can fetch it by importing and using the `get_prompt_template` function from `galileo.prompts`.
- The `get_dataset` function below expects a dataset that you created through either the console or the SDK. Ensure you have saved a dataset before running the experiment!
## Running Experiments with Custom Functions
For more complex scenarios, you can use custom functions with the OpenAI wrapper. Here, you may use either a saved dataset or a custom one.
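For example, the custom function below calls the model through Galileo's OpenAI wrapper so the LLM call is logged automatically. This is a sketch under assumptions: the model name, prompt wording, and the `function` parameter name are illustrative and may differ in your SDK version.

```python
# Hedged sketch: model, prompt text, and names are illustrative.
from galileo.datasets import get_dataset
from galileo.experiments import run_experiment
from galileo.openai import openai  # Galileo's wrapper around the OpenAI client

def my_llm_function(input):
    """Called once per dataset row; `input` comes from the dataset."""
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Answer concisely: {input}"}],
    )
    return response.choices[0].message.content

results = run_experiment(
    "custom-function-experiment",
    dataset=get_dataset(name="my-dataset"),
    function=my_llm_function,
    metrics=["correctness"],
    project="my-project",
)
```

Because the wrapper handles logging, the function body stays a plain OpenAI call; you can add retrieval, tool use, or post-processing inside it and the full trace is still captured.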
## Custom Dataset Evaluation
When you need to test specific scenarios:
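An inline dataset is just a list of dictionaries, so you can hand-craft corner cases without saving anything first. The sketch below assumes the key `"input"` is what your runner function or template reads; passing the list via `dataset=corner_cases` to `run_experiment` is an assumption about the SDK's accepted input types.

```python
# Hedged sketch: an inline dataset of hand-picked corner cases.
# The "input" key and the dataset= usage noted below are assumptions.
corner_cases = [
    {"input": "What is your refund policy?"},   # typical question
    {"input": ""},                              # empty input
    {"input": "a" * 10_000},                    # very long input
]

# Then pass it directly instead of a saved dataset, e.g.:
# run_experiment("corner-case-experiment", dataset=corner_cases, ...)
```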
## Custom Metrics for Deep Analysis
For sophisticated evaluation needs:
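A custom metric can be as simple as a plain Python function that scores one output. The example below shows only the scoring logic; `brevity_metric` is a hypothetical name, and registering it with an experiment (e.g. including it in the `metrics` list) is an assumption that may vary by SDK version.

```python
# Hedged sketch of a local custom metric. How it is registered with
# run_experiment (e.g. metrics=[brevity_metric]) may differ by SDK version.
def brevity_metric(output: str) -> float:
    """Score 1.0 for answers up to 50 words, decaying linearly to 0.0
    as the answer approaches 250 words."""
    words = len(output.split())
    return 1.0 if words <= 50 else max(0.0, 1.0 - (words - 50) / 200)
```

Keeping metrics as pure functions like this makes them easy to unit-test on their own before wiring them into an experiment.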
## Best Practices
- **Use consistent datasets**: Use the same dataset when comparing different prompts or models to ensure fair comparisons.
- **Test multiple variations**: Run experiments with different prompt variations to find the best approach.
- **Use appropriate metrics**: Choose metrics that are relevant to your specific use case.
- **Start small**: Begin with a small dataset to quickly iterate and refine your approach before scaling up.
- **Document your experiments**: Keep track of what you're testing and why to make it easier to interpret results.
## Related Resources
- What are Datasets? - Learn about Datasets and how to work with them
- Creating Datasets - Creating and managing datasets for experiments in Python
- Creating Prompt Templates - Creating and using prompt templates in Python