Welcome to Galileo! This quickstart guide will walk you through setting up your first AI evaluation in minutes. You’ll learn how to identify and fix common issues in AI responses using Galileo’s powerful metrics and insights.

What You’ll Learn

  • Set up and run an AI evaluation with Galileo in less than 5 minutes
  • Use the Galileo SDK together with the Galileo Console UI
  • Interpret key metrics to identify response quality issues
  • Apply prompt engineering techniques to fix common AI response problems
  • Understand how Galileo helps you build more reliable AI applications

Prerequisites

This example uses OpenAI both as the LLM being evaluated and as the evaluator for generating metrics. Running the entire quickstart, including generating a metric, should use approximately 1,500 tokens.

Galileo is model-agnostic and supports leading LLM providers, including OpenAI, Azure OpenAI, Anthropic, and LLaMA.

Get Started with Galileo

Throughout this guide, you will use the Galileo SDK to create your own application, and the Galileo Console to access logs and settings.

The first step is to prepare your application files and development environment.

1. Create Project Folder

Create a new project folder and navigate to it in your terminal:

mkdir first-AI-evaluation
cd first-AI-evaluation
2. Install Dependencies

Install the Galileo SDK with OpenAI support and the python-dotenv package using the following command in your terminal:

pip install "galileo[openai]" python-dotenv
3. Create Project Files

In your project folder, create a blank application file named app.py and a .env file:

first-AI-evaluation/
├── app.py
└── .env
4. Set Up Environment Variables

In your .env file, add the following credentials:

GALILEO_API_KEY=your-Galileo-api-key-here
OPENAI_API_KEY=your-OpenAI-api-key-here
GALILEO_PROJECT=My first project
GALILEO_LOG_STREAM=physics-questions
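
Optionally, you can confirm these variables load correctly before writing any application code. The check_env.py helper below is a hypothetical extra, not part of the quickstart files; it only reports whether python-dotenv can read each key from your .env:

# check_env.py - optional, hypothetical sanity check (not part of the quickstart)
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

# Report whether each variable is set, without printing the secret values
for key in ("GALILEO_API_KEY", "OPENAI_API_KEY", "GALILEO_PROJECT", "GALILEO_LOG_STREAM"):
    print(f"{key}: {'set' if os.environ.get(key) else 'MISSING'}")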
5. Log in to the Galileo Console

To access your Galileo API keys, open the Galileo Console and log in or create an account.

The Galileo Console is the hub for viewing all your Galileo projects, configuring your experiments, and inspecting your logged data.

6. Fill in API Keys

In the .env file, fill in the API key values:

  • OPENAI_API_KEY : Obtain your OpenAI API key by logging in to your OpenAI account on their website.
  • GALILEO_API_KEY : Obtain your Galileo API key from the Galileo Console. Open “Settings & Users” from the drop-down menu in the top-right corner. Then click “API Keys” in the left column and generate your API key by clicking “+ Create new key”.
7. Application Code

Copy the code below into your application file:

import os
from galileo import galileo_context # The Galileo context manager
from galileo.openai import openai # The Galileo OpenAI client wrapper
from dotenv import load_dotenv
load_dotenv()

# Initialize the OpenAI client using the Galileo wrapper and your
# OpenAI API key
client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# The Galileo context manager will automatically create a trace and flush
# the logs when exiting
with galileo_context():
    prompt = "Explain the following topic succinctly: Newton's First Law"
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content.strip())
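
The context manager reads the project and log stream names from the GALILEO_PROJECT and GALILEO_LOG_STREAM environment variables you set earlier. Depending on your SDK version, you may also be able to name them explicitly in code; a minimal sketch, assuming galileo_context accepts these keyword arguments:

# Sketch: pass the project and log stream directly instead of via .env.
# Assumes your SDK version supports these keyword arguments.
with galileo_context(project="My first project", log_stream="physics-questions"):
    ...  # same chat completion call as above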

8. Run the Application

Run your application using the following command in your terminal:

python app.py
9. View Results in Terminal

Congratulations! You have run your first AI inference using the Galileo SDK.

You can see the model’s output in your terminal. Next, you’ll use the Galileo Console to see even more details.

The Model's Response

Newton’s First Law, often referred to as the Law of Inertia, states that an object will remain at rest, or in uniform motion in a straight line, unless acted upon by a net external force. This means that if an object is not influenced by any external forces, it will maintain its current state of motion. Essentially, this law emphasizes the concept of inertia, which is the natural tendency of objects to resist changes in their motion. It forms the foundation for classical mechanics, outlining the behavior of objects when forces are not in play.

10. Select Project in Console

To see the results of your inference in the Galileo Console, you will need to open your project, and then select your log stream.

Use the drop-down in the top-left to select your project, “My first project”.

11. Open Log Streams

Click “Log Streams” in the top menu to open your project’s Log Streams.

Log Streams are where you can see the activity logged from running this tutorial application.

12. View Results in Console

The first row in your Log Stream is the run Trace from running your application. Click it to view more details.

In the Trace, you can see the prompt and the model’s response. However, there is no data about the model’s performance. In the next section we will add a metric to measure how well the model followed our instructions.

Measure Instruction Adherence

To measure your model’s performance more accurately, you can add the Instruction Adherence Metric to your logged Traces. This Metric measures whether a model followed, or adhered to, the system or prompt instructions when generating a response. You can read more about it in our instruction adherence metric documentation.

1. Configure Metrics

Click “Configure Metrics” on the right to open the menu of built-in Metrics.

Metrics are used to evaluate and track the performance of AI models in your experiments.

2. Add OpenAI Key to Console

Metrics are evaluated using an LLM evaluator, so you will need to configure an LLM to evaluate instruction adherence.

Click the “OpenAI” link, add your OpenAI API key, and save.

You can also connect to any supported LLM using the integrations page in the Galileo Console.

3. Add Instruction Adherence Metric

Click the toggle next to “Instruction Adherence” to add the Metric.

The instruction adherence Metric measures how closely the model’s output matches the instructions in the prompt.
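
Under the hood, a metric like this follows the LLM-as-judge pattern: an evaluator model is shown the instructions and the response and asked to score how well they match. Galileo computes the real metric server-side with its own evaluator prompts, so the sketch below is only an illustration of the general idea, not Galileo’s implementation:

# Illustration of the LLM-as-judge pattern only; NOT Galileo's actual
# evaluator, which runs server-side with its own prompts.
from openai import OpenAI

judge = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

def score_adherence(instructions: str, response_text: str) -> str:
    judge_prompt = (
        "Score from 0 to 1 how closely the response follows the instructions. "
        "Reply with only the number.\n\n"
        f"Instructions: {instructions}\n\nResponse: {response_text}"
    )
    result = judge.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": judge_prompt}],
    )
    return result.choices[0].message.content.strip()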

4. Re-run the Application

With the Instruction Adherence Metric set to be logged, run your application again with the same command as before.

python app.py
5. View Instruction Adherence Analysis

The new row in your Log Stream is the run Trace from re-running your application. Click it to view more details.

You can see the instruction adherence measurement in the top-right corner of the Trace.

Your application is now connected to Galileo with Metrics 🎉! You can continue to explore on your own or read more about improving the prompt.

Fix Prompt Issues

If you examine the results we got for our first run, you’ll see that the model’s response is not exactly what we asked for.

What Happened?

  • We asked for a succinct explanation.
  • The model gave a detailed answer instead. 😢
  • Our instruction adherence metric was very low, meaning we need to tweak our prompt.

To understand why our instruction adherence metric was so low, we can look at the metric explanation. You can find it by clicking the LLM node in the Trace steps on the left and then hovering over the instruction adherence percentage.

Metric Explanation

The instruction provided was to ‘Explain the following topic succinctly: Newton’s first law’. The response begins by defining Newton’s First Law and provides a clear explanation of the concept of inertia. However, the response is lengthy and provides more detail than the word ‘succinctly’ implies. While it does effectively cover the essence of the topic, it could be more concise to align better with the instruction. Thus, while informative, the response does not fully adhere to the request for a succinct explanation.

This explanation correctly points out that the answer we got wasn’t exactly succinct. So, let’s modify our prompt to fix this. We’ll make sure to explain what succinctness means for us.

To do this, replace the prompt in your application file with a new prompt:

# app.py

prompt = """
1. Explain Newton's First Law in one sentence of no more than fifteen (15) words.
2. Do not add any additional sentences, examples, parentheses, bullet points, or
    further clarifications.
3. Your answer must be exactly one sentence and must not exceed 15 words.
"""

Run the application again, and the result will be much more concise.
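
Since the new prompt sets a hard 15-word limit, you can also spot-check the constraint locally after the completion call, before relying on the metric. A purely illustrative check:

# Illustrative only: count the words in the model's reply to spot-check
# the 15-word constraint from the prompt.
text = response.choices[0].message.content.strip()
word_count = len(text.split())
print(f"{word_count} words - {'within limit' if word_count <= 15 else 'too long'}")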

Now, our instruction adherence metric jumps to 1! 🎉

What’s Next

Now that you’ve completed your first evaluation, explore these resources to build better AI applications:

Continue your journey with our comprehensive How-to Guides.