Overview
In this tutorial, you’ll learn how to build a Retrieval-Augmented Generation (RAG) application that combines:- Elasticsearch for document storage and semantic search using the ELSER model
- LangGraph for building conversational agents
- Galileo for end-to-end observability and logging
What you’ll build
You’ll create a RAG chatbot that:- Use ELSER model for semantic search
- Stores documents in the Elasticsearch vector store
- Uses LangGraph to orchestrate retrieval and generation steps
- Monitor traces with Galileo
Prerequisites
Before starting, you’ll need:- Python 3.10+ installed
- An Elasticsearch instance (we’ll use Elastic Cloud Serverless)
- OpenAI API key for the language model
- Galileo account for observability
Step 1: set up Elasticsearch cloud serverless
First, let’s set up your Elasticsearch instance for document storage and retrieval.Create your Elasticsearch project
- Navigate to cloud.elastic.co and create an account or log in
- Click Create serverless project
- Choose Elasticsearch as the project type
- Select Optimized for Vectors configuration
- Name your project (e.g., “rag-chatbot”) and click Create project
Create your first index
- Once your project is ready, you’ll see the index creation page
- Enter an index name:
demo - Click Create my index
- Important: Copy and save your Elasticsearch URL and API key - you won’t see the API key again
Deploy or configure the ELSER model
ELSER (Elastic Learned Sparse EncodeR) provides semantic search capabilities:- In your Elasticsearch project, go to Relevance → Inference Endpoints
- If ELSER does not exist, click Create endpoint
- Follow the ELSER docs or Elastic guide
- Note the model ID (typically
.elser_model_2_linux-x86_64)
Step 2: set up your Python environment
Create a new project and install the required dependencies in a virtual environment:Step 3: configure environment variables
Create a.env file or set these environment variables:
- Linux x86_64:
.elser_model_2_linux-x86_64 - Check your Elasticsearch ML models for the exact name
Step 4: build the RAG application
Now, let’s build the RAG application step-by-step. Create a Python file (e.g.,demo.py) and add the following code snippets.
Imports and configuration
First, we import the necessary libraries and configure our environment variables. This part of the script loads your API keys and sets up the connection details for Elasticsearch, OpenAI, and Galileo.Python
1. Elasticsearch setup
Python
- Connects to your Elasticsearch instance
- Connects to the ELSER model for semantic search
- Creates an index and stores sample documents
2. agent architecture
Python
- State Management: Uses
AgentStateto track conversation messages - Tool Integration: Creates a retriever tool that searches Elasticsearch
- LangGraph Workflow: Defines the flow between agent reasoning and tool usage
3. conversation flow
Python
- User asks a question
- Agent decides whether to use the retriever tool
- If needed, searches Elasticsearch for relevant documents
- Generates a response based on retrieved context
- Saves the conversation to chat history
ask_question function with a sample query.
Python
Step 4: run the application
To run your RAG application, save all the code into a singledemo.py file and execute it from your terminal:
5. adding Galileo observability
- Open your Galileo Console
- Navigate to your project (f.e.
elasticsearch-rag-demo) - You’ll see traces for each question, showing:
- Document retrieval steps
- LLM generation
- Full conversation context
- Performance metrics
- Connect to Elasticsearch and verify the connection
- Use the ELSER model
- Index sample documents about company policies
- Create the RAG agent with LangGraph workflow
- Run sample questions and display answers and log them to Galileo
Troubleshooting
Connection issues
- Verify your Elasticsearch host URL and API key
- Ensure your IP is whitelisted if using Elastic Cloud if not using serverless
- Check that Elasticsearch is running and accessible
ELSER model issues
- Verify the model name matches your platform
- Ensure machine learning features are enabled
- Check that you have sufficient resources for model deployment
Missing documents in search
- Wait a few seconds after indexing for documents to be available
- Verify the index name matches your configuration
- Check Elasticsearch logs for indexing errors