When implementing RAG systems, it’s crucial to properly handle document retrieval, context management, and response generation. This guide demonstrates a basic RAG implementation using Galileo’s observability features.

What You’ll Need

  • OpenAI API key
  • Galileo API key
  • Python environment with required packages
  • Basic understanding of RAG concepts

Setup Instructions

1. Set Up Your Environment

Create a .env file with your API keys:

.env
GALILEO_API_KEY=your_galileo_api_key
OPENAI_API_KEY=your_openai_api_key

2. Install Dependencies

Create a requirements.txt with the following pinned versions:

requirements.txt
galileo==0.0.7
openai==1.61.1
python-dotenv==0.19.0
rich==13.7.0
questionary==2.0.1
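
Then install them:

pip install -r requirements.txt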

3. Running and Monitoring

Execute the application:

python app.py

Use Galileo to monitor:

  • Document retrieval performance
    • Chunk relevance
    • Chunk utilization
    • Chunk attribution
    • Completeness
  • System performance metrics

Implementation Guide

Let’s break down the implementation into manageable sections:

1. Setting Up the Environment

First, we’ll set up our imports and initialize our environment:

app.py
import os
from dotenv import load_dotenv
from galileo import log
from galileo.openai import openai
from rich.console import Console
from rich.panel import Panel
from rich.markdown import Markdown
import questionary
import sys

load_dotenv()

# Initialize console for rich output
console = Console()

# Check if Galileo logging is enabled
logging_enabled = os.environ.get("GALILEO_API_KEY") is not None

# Initialize the OpenAI client via Galileo's wrapper so LLM calls are logged automatically
client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

This section:

  • Imports necessary libraries
  • Loads environment variables
  • Sets up rich console output
  • Initializes the OpenAI client with Galileo integration

2. Document Retrieval System

The document retrieval function is wrapped with Galileo’s @log decorator:

app.py
@log(span_type="retriever")
def retrieve_documents(query: str):
    # TODO: Replace with actual RAG retrieval. This stub ignores `query` and
    # returns a fixed document set so the pipeline can run end to end.
    documents = [
        {
            "id": "doc1",
            "text": "Galileo is an observability platform for LLM applications. It helps developers monitor, debug, and improve their AI systems by tracking inputs, outputs, and performance metrics.",
            "metadata": {
                "source": "galileo_docs",
                "category": "product_overview"
            }
        },
        {
            "id": "doc2",
            "text": "RAG (Retrieval-Augmented Generation) is a technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating an answer.",
            "metadata": {
                "source": "ai_techniques",
                "category": "methodology"
            }
        },
        {
            "id": "doc3",
            "text": "Common RAG challenges include hallucinations, retrieval quality issues, and context window limitations. Proper evaluation metrics include relevance, faithfulness, and answer correctness.",
            "metadata": {
                "source": "ai_techniques",
                "category": "challenges"
            }
        },
        {
            "id": "doc4",
            "text": "Vector databases like Pinecone, Weaviate, and Chroma are optimized for storing embeddings and performing similarity searches, making them ideal for RAG applications.",
            "metadata": {
                "source": "tech_stack",
                "category": "databases"
            }
        },
        {
            "id": "doc5",
            "text": "Prompt engineering is crucial for RAG systems. Well-crafted prompts should instruct the model to use retrieved context, avoid making up information, and cite sources when possible.",
            "metadata": {
                "source": "best_practices",
                "category": "prompting"
            }
        }
    ]
    return documents

Key points:

  • Uses @log decorator with the retriever span type
  • Returns structured document objects
  • Includes metadata for tracking sources
  • Simulates a real document retrieval system
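
To replace the stub with real retrieval (see Next Steps), the same function shape can sit on top of a vector database. The sketch below is illustrative only: it assumes Chroma is installed (pip install chromadb) and that documents have already been embedded into a collection; the collection name "galileo_demo_docs" is hypothetical.

import chromadb
from galileo import log

chroma_client = chromadb.Client()
collection = chroma_client.get_or_create_collection(name="galileo_demo_docs")  # hypothetical name

@log(span_type="retriever")
def retrieve_documents(query: str, n_results: int = 3):
    # Embed the query and fetch the most similar chunks from the collection
    results = collection.query(query_texts=[query], n_results=n_results)
    # Chroma returns columnar lists (one inner list per query); reshape them
    # into the document dictionaries the rest of the pipeline expects
    return [
        {"id": doc_id, "text": text, "metadata": metadata or {}}
        for doc_id, text, metadata in zip(
            results["ids"][0], results["documents"][0], results["metadatas"][0]
        )
    ]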

3. RAG Pipeline Implementation

The core RAG functionality:

app.py
def rag(query: str):
    documents = retrieve_documents(query)

    # Format documents for better readability in the prompt
    formatted_docs = ""
    for i, doc in enumerate(documents):
        formatted_docs += f"Document {i+1} (Source: {doc['metadata']['source']}):\n{doc['text']}\n\n"

    prompt = f"""
    Answer the following question based on the context provided. If the answer is not in the context, say you don't know.

    Question: {query}

    Context:
    {formatted_docs}
    """

    try:
        console.print("[bold blue]Generating answer...[/bold blue]")
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that answers questions based only on the provided context."},
                {"role": "user", "content": prompt}
            ],
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error generating response: {str(e)}"

This section:

  • Retrieves relevant documents
  • Formats context for the LLM
  • Constructs a clear prompt
  • Handles API calls and errors
  • Uses the gpt-4o model for responses
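
Taken together, a single call runs retrieval, prompt construction, and generation. For example (the query string here is purely illustrative):

# Run the full pipeline once: retrieve, build the prompt, generate
answer = rag("What challenges do RAG systems commonly face?")
console.print(answer)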

4. Interactive Interface

The main application interface:

app.py
def main():
    console.print(Panel.fit(
        "[bold]RAG Demo[/bold]\nThis demo uses a simulated RAG system to answer your questions.",
        title="Galileo RAG Terminal Demo",
        border_style="blue"
    ))

    # Check environment setup
    if logging_enabled:
        console.print("[green]✅ Galileo logging is enabled[/green]")
    else:
        console.print("[yellow]⚠️ Galileo logging is disabled[/yellow]")

    api_key = os.environ.get("OPENAI_API_KEY")
    if api_key:
        console.print("[green]✅ OpenAI API Key is set[/green]")
    else:
        console.print("[red]❌ OpenAI API Key is missing[/red]")
        sys.exit(1)

    # Main interaction loop
    while True:
        query = questionary.text(
            "Enter your question about Galileo, RAG, or AI techniques:",
            validate=lambda text: len(text) > 0
        ).ask()

        # .ask() returns None if the prompt is cancelled (e.g. Ctrl+C)
        if not query or query.lower() in ['exit', 'quit', 'q']:
            break

        try:
            result = rag(query)
            console.print("\n[bold green]Answer:[/bold green]")
            console.print(Panel(Markdown(result), border_style="green"))

            continue_session = questionary.confirm(
                "Do you want to ask another question?",
                default=True
            ).ask()

            if not continue_session:
                break

        except Exception as e:
            console.print(f"[bold red]Error:[/bold red] {str(e)}")

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        console.print("\n[bold]Exiting RAG Demo. Goodbye![/bold]")

This section provides:

  • Environment validation
  • Interactive question-answer loop
  • Rich formatting for outputs
  • Graceful error handling
  • Clean exit handling

Key Features

  • Galileo Logging: Track document retrieval and LLM interactions
  • Rich Console Interface: User-friendly terminal interface
  • Error Handling: Graceful handling of API and runtime errors
  • Context Management: Proper formatting of retrieved documents
  • Interactive Experience: Easy-to-use question-answering interface

Next Steps

  • Implement real document retrieval using a vector database
  • Add response streaming for a better user experience (see the sketch after this list)
  • Implement more sophisticated prompt engineering
  • Add evaluation metrics for retrieval quality
  • Integrate advanced Galileo logging features
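
As a starting point for the streaming item above, the chat completion call inside rag() can be switched to the OpenAI streaming API. This is a minimal sketch, assuming client, retrieve_documents, and the prompt construction from app.py are in scope; the printing logic is deliberately simple.

def rag_streaming(query: str) -> str:
    documents = retrieve_documents(query)
    formatted_docs = "\n\n".join(
        f"Document {i+1} (Source: {doc['metadata']['source']}):\n{doc['text']}"
        for i, doc in enumerate(documents)
    )
    prompt = (
        "Answer the following question based on the context provided. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Question: {query}\n\nContext:\n{formatted_docs}"
    )
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers questions based only on the provided context."},
            {"role": "user", "content": prompt},
        ],
        stream=True,  # receive the response token by token
    )
    answer = ""
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""  # final chunk may carry no text
        answer += delta
        print(delta, end="", flush=True)  # render tokens as they arrive
    print()
    return answer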