Overview

In this tutorial, you’ll learn how to add Galileo evaluations to an existing multi-agent LangGraph app. It is intended for Python LangGraph developers who already have an app and want to add evaluation, and it assumes basic knowledge of Python and LangGraph.

By the end of this tutorial, you’ll be able to:

  • Add Galileo evaluations to a multi-agent LangGraph app
  • View and understand session-level metrics

You can also watch the video walkthrough here.

Background

This tutorial uses an existing banking chatbot app powered by Chainlit and LangGraph. This is a deliberately simple example of a chatbot for a fictitious bank. It is a multi-agent app with a supervisor agent and one additional agent that answers questions about the credit cards offered by the bank. This agent uses some dummy credit card documents stored in a Pinecone vector database.

For example, you can ask questions like “What credit cards do you offer?” or “Which card has the lowest annual fee?”

These are the 2 agents:

  1. Credit card information agent

    This agent provides information on the available credit cards.

    The credit card documentation that the agent uses is stored in a Pinecone vector database.

  2. Supervisor agent

    This agent manages the other agents, routing messages where needed.

Chainlit provides a web front end for a chatbot, managing user interaction and conversation history. The important files in this app are:

  • app.py - This contains the main application logic for a Chainlit app. It has an on_chat_start function that is called whenever a new chat is started, and a main function that is called whenever a message is sent.
  • src/galileo_langgraph_fsi_agent/agents/supervisor_agent.py - This is a LangGraph supervisor agent that manages the other agents, routing messages where needed. This is configured to use GPT-4.1-mini.
  • src/galileo_langgraph_fsi_agent/agents/credit_card_information_agent.py - This is a LangGraph agent that uses a tool to extract information about the available credit cards from Pinecone. This is also configured to use GPT-4.1-mini.
  • src/galileo_langgraph_fsi_agent/tools/pinecone_retrieval_tool.py - This is a LangGraph tool that interacts with the Pinecone vector database. It is called by the credit_card_information_agent.
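
To make the structure above concrete, here is a minimal plain-Python sketch of how the three pieces relate: a supervisor that decides who answers, a credit card agent, and a retrieval tool. It is not the app's code and does not use LangGraph, Chainlit, or Pinecone. The `CARD_DOCS` data, the `Reply` class, and the keyword check are all hypothetical stand-ins; in the real app the supervisor's routing decision is made by the LLM and the tool queries the vector database.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the dummy credit card documents stored in Pinecone.
CARD_DOCS = {
    "basic": "Basic card: a simple everyday card.",
    "premium": "Premium card: a card with extra benefits.",
}


@dataclass
class Reply:
    handled_by: str  # name of the agent that produced the answer
    text: str


def retrieve_card_docs(query: str) -> list[str]:
    """Stand-in for the retrieval tool: naive keyword match, not vector search."""
    lowered = query.lower()
    hits = [text for name, text in CARD_DOCS.items() if name in lowered]
    # Fall back to every document when no card is named in the query.
    return hits or list(CARD_DOCS.values())


def credit_card_information_agent(question: str) -> Reply:
    """Stand-in for the credit card agent: answers using the retrieval tool."""
    docs = retrieve_card_docs(question)
    return Reply("credit_card_information_agent", " ".join(docs))


def supervisor_agent(question: str) -> Reply:
    """Stand-in for the supervisor: routes card questions to the card agent."""
    # Assumption: a keyword check replaces the LLM's routing decision.
    if "card" in question.lower():
        return credit_card_information_agent(question)
    return Reply("supervisor_agent", "Answered without the credit card agent.")
```

Calling `supervisor_agent("What credit cards do you offer?")` in this sketch is handled by the credit card agent, while a general question such as “What does APR stand for?” stays with the supervisor.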

Before you start

Before you start the tutorial, you will need:

  1. The starter project - Clone the Galileo SDK-Examples repo. This repo contains both the starting LangGraph app that you will be adding Galileo evaluations to, as well as a final version for reference.
  2. A Pinecone account and API key - If you don’t have an existing Pinecone account, head to pinecone.io, sign up for a free account, and get an API key.
  3. An OpenAI API key - This example uses OpenAI as the underlying LLM to run the agents.
  4. A Galileo API key - To access your Galileo API keys, open the Galileo Console and log in or create an account. From the Settings and Users page you can create a new API key.

Set up the project

The starter project is in the sdk-examples/python/agent/langgraph-fsi-agent/before folder in the cloned repo.

  1. Open the starter project in your Python IDE of choice.

  2. Install the dependencies that are defined in the pyproject.toml. Create a virtual environment, and install these dependencies using a tool such as uv:

    uv venv .venv
    source .venv/bin/activate
    uv sync --dev
  3. Configure your .env file. Copy the .env.example file to .env, and set the values for your OpenAI and Pinecone API keys:

    # AI services
    OPENAI_API_KEY=<Your OpenAI API key>
    PINECONE_API_KEY=<Your Pinecone API key>

    Replace <Your OpenAI API key> with your OpenAI API key. Replace <Your Pinecone API key> with your Pinecone API key.

  4. Upload the dummy credit card documentation to Pinecone using the provided helper script:

    python ./scripts/setup_pinecone.py

    This will take a few seconds and a successful run should look like:

    Loading documents for credit-card-information folder...
    ...
    ✅ Document processing and upload complete!
  5. Run the project to test it out:

    chainlit run app.py -w

    The app will be running at localhost:8000, so open it in your browser.

    Ask the bot questions like “What credit cards do you offer?”.

You are now ready to add Galileo evaluations to your app.

Create a new Galileo project

First you need a new Galileo project to log evaluations to.

  1. Create a new project from the Galileo Console using the + New Project button. Name this project bank-chatbot.

Install the Galileo Python package

To send data to Galileo, you need to use the Galileo Python package.

  1. Install the Galileo Python package in your virtual environment.

    uv add "galileo[openai]"

    This installs the Galileo Python package with the optional OpenAI wrapper.

  2. Add the following Galileo environment variables to your .env file:

    GALILEO_API_KEY=<Your Galileo API key>
    GALILEO_PROJECT=bank-chatbot
    GALILEO_LOG_STREAM=chatbot-logs

    Replace <Your Galileo API key> with your Galileo API key. The project is set to the new project you just created, and the log stream is set to chatbot-logs.

    You don’t need to create the log stream in advance. A new log stream is created automatically the first time you log to it.
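
Before running the app, it can help to confirm that every setting from your .env file is actually present. The sketch below is one possible check, not part of the starter project. The helper name `missing_settings` is hypothetical, and the script assumes the values have already been loaded into the process environment (for example, by whatever .env loading your app or shell uses).

```python
import os
from typing import Mapping

# The environment variables this tutorial asks you to set in .env.
REQUIRED_SETTINGS = (
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "GALILEO_API_KEY",
    "GALILEO_PROJECT",
    "GALILEO_LOG_STREAM",
)


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required settings that are unset, blank, or still a placeholder."""
    missing = []
    for name in REQUIRED_SETTINGS:
        value = env.get(name, "").strip()
        # Values such as "<Your Galileo API key>" are unreplaced placeholders.
        if not value or value.startswith("<"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_settings(os.environ)
    if problems:
        print("Missing settings:", ", ".join(problems))
    else:
        print("All required settings are present.")
```

Passing the environment in as a mapping keeps the check easy to try against a plain dictionary as well as `os.environ`.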

Add logging to Galileo

Next you need to add code to log to Galileo. Galileo has a LangGraph callback handler that can be passed into the agent to automatically log traces for every step in the chain, including agent calls, tool calls, and LLM calls.

You can find the completed version of this code in the sdk-examples/python/agent/langgraph-fsi-agent/after folder in the cloned repo.

Add the logging code

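  1. Add import statements for the Galileo components, plus datetime if app.py does not already import it, to the top of the app.py file:

    # Standard library: used to timestamp the session name
    from datetime import datetime

    from galileo import galileo_context
    from galileo.handlers.langchain import GalileoAsyncCallback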
  2. Start a Galileo session. In the on_chat_start function in app.py, add the following code to create a new logging session:

    # Start Galileo session with unique session name
    current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    session_name = f"FSI Agent - {current_time}"
    galileo_context.start_session(name=session_name,
                                  external_id=cl.context.session.id)

    This creates a new session named “FSI Agent - {time}” with the current date and time. This also sets the external_id to the current Chainlit session ID. Each separate conversation in Chainlit is a separate session with a unique ID.

  3. Create a callback handler. After the code you just added, add the following to create the callback handler, and save it in the Chainlit session:

    # Create the callback. This needs to be created in the same
    # thread as the session so that it uses the same session context.
    galileo_callback = GalileoAsyncCallback()
    cl.user_session.set("galileo_callback", galileo_callback)

    This creates the callback handler, and saves it against the current user session.

    The Galileo logging handlers use the current thread context to connect to the current Galileo context. This means that for a callback handler to be tied to a session, it needs to be created in the same thread as the session. It can then be accessed from any other thread.

  4. Pass the callback handler to LangGraph. In the main function, replace this line:

    callbacks: Callbacks = []

    With the following:

    galileo_callback = cl.user_session.get("galileo_callback")
    callbacks: Callbacks = [galileo_callback]

    This extracts the Galileo callback from the user session and adds it to a callbacks collection. This collection is passed in the LangGraph RunnableConfig that is used when the supervisor agent is invoked.
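
The note in step 3 about thread context can be illustrated with a small, self-contained sketch. This is a generic illustration of the pattern using Python's `threading.local`, not Galileo's implementation. The names `start_session`, `current_session`, and `RecordingHandler` are hypothetical stand-ins for the session context and the callback handler.

```python
import threading

# Per-thread storage: each thread sees only the values it set itself.
_thread_context = threading.local()


def start_session(name: str) -> None:
    """Record a session name in the calling thread's context."""
    _thread_context.session = name


def current_session():
    """Return the session visible to the calling thread, or None."""
    return getattr(_thread_context, "session", None)


class RecordingHandler:
    """Hypothetical handler that captures the session when it is constructed."""

    def __init__(self) -> None:
        self.session = current_session()


def run_demo() -> dict:
    start_session("FSI Agent - demo")
    # Created in the same thread as the session, so it captures that session.
    handler = RecordingHandler()
    seen = {}

    def worker() -> None:
        # The worker thread has no session of its own in its thread context...
        seen["worker_context"] = current_session()
        # ...but the handler object still carries the session it captured.
        seen["handler_session"] = handler.session

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen
```

In this sketch the worker thread sees no session in its own context, yet the handler created alongside the session still refers to it. That is the reason for creating the callback in on_chat_start and storing it in the user session for later use.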

Run the app

  1. Run the app.

    chainlit run app.py -w

    Open the app in your browser at localhost:8000, and ask the bot a question. In your terminal you will see references to the Galileo log stream being created, and traces being flushed:

    🚀 Creating new log stream... log stream chatbot-logs created!
    ...
    Flushing 1 traces...
    Successfully flushed 1 traces.

    Leave the app running whilst you view the traces.

View the traces

  1. View the session in Galileo. Open the Galileo Console and select your project. In the Sessions tab you should see a single session created for the conversation.

  2. Select the single session. It opens in the sessions view, showing a flowchart.

    Select the nodes in this chart to see the input and output.

Add more traces to the session

Sessions can contain multiple traces. For example, a single user conversation with your bot would be a single session, containing multiple traces for the different questions you ask the bot.

  1. Ask the bot a follow-up question related to credit cards, such as “Which card has no annual fee?”

  2. Follow this up with a third question that does not involve specific information about the credit cards, such as “What does APR stand for?”

  3. View the session in the Galileo Console.

    This session will have 3 traces. Use the Trace navigation to move between the traces. In the Input and Output you will see the relevant messages.

  4. Navigate to the last trace. Because the question “What does APR stand for?” does not need the credit card agent, the flowchart for this trace doesn’t show that node.
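
As a mental model for the steps above, a session can be thought of as an ordered list of traces, each recording which nodes were involved in answering one question. The sketch below is only that mental model expressed as Python dataclasses. It is not Galileo's data schema, and the node names and the `traces_without` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    user_input: str
    nodes: list[str]  # nodes that appear in this trace's flowchart


@dataclass
class Session:
    name: str
    traces: list[Trace] = field(default_factory=list)

    def add_trace(self, user_input: str, nodes: list[str]) -> None:
        self.traces.append(Trace(user_input, nodes))


def traces_without(session: Session, node: str) -> list[int]:
    """Return the 1-based positions of traces that did not involve the node."""
    return [
        position
        for position, trace in enumerate(session.traces, start=1)
        if node not in trace.nodes
    ]


# One conversation is one session; each question adds one trace.
session = Session("FSI Agent - demo")
session.add_trace(
    "What credit cards do you offer?",
    ["supervisor", "credit_card_agent"],
)
session.add_trace(
    "Which card has no annual fee?",
    ["supervisor", "credit_card_agent"],
)
session.add_trace("What does APR stand for?", ["supervisor"])
```

With the three questions from this section, `traces_without(session, "credit_card_agent")` picks out only the third trace, matching what the flowchart shows in the Console.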

Summary

In this tutorial, you learned how to:

  • Add Galileo evaluations to a multi-agent LangGraph app
  • View and navigate session-level traces

Next steps

Some suggested next steps are: