What are custom integrations?

Galileo’s custom integrations provide a flexible way to set up LLMs that aren’t supported through the existing integrations, for example when you need to work with non-standard proxies, proprietary authentication, or proprietary inference protocols. This video demonstrates how to add a custom integration to Galileo:

Configuring custom integrations in the Galileo console

1. Navigate to Integrations

Navigate to Settings > Integrations in the Galileo console.
2. Add Custom Integration

Find the Custom integration card and click Add Integration.
3. Configure Integration

Paste a valid JSON configuration and click Save changes.

This JSON example uses Portkey and Mistral with API key authentication; more examples are listed in the JSON examples section below.
{
  "authentication_type": "api_key",
  "endpoint": "https://api.portkey.ai/v1",
  "api_key_header": "x-portkey-api-key",
  "api_key_value": "YOUR_PORTKEY_API_KEY",
  "models": [
    "@mistral/mistral-tiny",
    "@mistral/mistral-large-latest"
  ],
  "default_model": "@mistral/mistral-tiny"
}
4. Test Integration

After saving, test the integration by selecting one of its models from a Playground. Run a prompt or evaluation that uses the custom integration’s model.

Note: the models will also be available for use with metrics in Galileo.

JSON examples

API key authentication integration

For providers that use API key authentication (for example, Together AI or Portkey):
{
  "authentication_type": "api_key",
  "endpoint": "https://api.together.xyz/v1",
  "api_key_header": "Authorization",
  "api_key_value": "Bearer YOUR_TOGETHER_API_KEY",
  "models": [
    "openai/gpt-oss-20b",
    "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "Qwen/Qwen3-Next-80B-A3B-Instruct",
    "zai-org/GLM-4.5-Air-FP8"
  ],
  "default_model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"
}
{
  "authentication_type": "api_key",
  "endpoint": "https://api.portkey.ai/v1",
  "api_key_header": "x-portkey-api-key",
  "api_key_value": "YOUR_PORTKEY_API_KEY",
  "models": [
    "@mistral/mistral-tiny",
    "@mistral/mistral-large-latest"
  ],
  "default_model": "@mistral/mistral-tiny"
}
  • Refer to your AI provider’s documentation for the expected format of the api_key_header and api_key_value.
  • Many AI providers require an Authorization header with a Bearer prefix in the API key value; others do not.
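The two fields map directly onto a single HTTP header sent on each inference request. A minimal sketch of that mapping (the `auth_headers` helper is illustrative, not part of Galileo):

```python
def auth_headers(config: dict) -> dict:
    """Build the authentication header described by an api_key config.

    The header name comes from api_key_header and the full value
    (including any "Bearer " prefix) from api_key_value.
    """
    if config.get("authentication_type") != "api_key":
        return {}
    return {config["api_key_header"]: config["api_key_value"]}


together = {
    "authentication_type": "api_key",
    "api_key_header": "Authorization",
    "api_key_value": "Bearer YOUR_TOGETHER_API_KEY",
}
print(auth_headers(together))
# {'Authorization': 'Bearer YOUR_TOGETHER_API_KEY'}
```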

OAuth2 integration

For providers that require dynamically generated bearer tokens:
{
  "authentication_type": "oauth2",
  "authentication_scope": "chat.completions",
  "endpoint": "https://api.custom-provider.com/v1",
  "oauth2_token_url": "https://api.custom-provider.com/oauth2/token",
  "token": "{\"client_id\": \"my_client_id\", \"client_secret\": \"my_client_secret\"}",
  "models": ["custom-model-1", "custom-model-2"],
  "default_model": "custom-model-1"
}
Your oauth2_token_url must support the OAuth2 Client Credentials Grant (RFC 6749 §4.4). Galileo exchanges the provided client_id and client_secret for an access token, which is then used as a Bearer token for LLM inference requests.
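As a concrete illustration of that exchange, the token request looks roughly like the following. This is a sketch of the standard client-credentials POST; `build_token_request` is a hypothetical helper, and some authorization servers expect the credentials in the form body rather than a Basic header:

```python
import base64
import urllib.parse


def build_token_request(client_id: str, client_secret: str, scope: str):
    """Build headers and form body for an RFC 6749 §4.4 token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
    })
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode()
    ).decode()
    headers = {
        "Authorization": f"Basic {credentials}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return headers, body


headers, body = build_token_request(
    "my_client_id", "my_client_secret", "chat.completions"
)
# POST this to oauth2_token_url; the access_token in the JSON response is
# then sent as "Authorization: Bearer <token>" on inference requests.
```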

No authentication integration

For internal endpoints or providers that don’t require authentication:
{
  "authentication_type": "none",
  "endpoint": "https://internal-gateway.local/v1",
  "models": ["internal-model-v1", "internal-model-v2"],
  "default_model": "internal-model-v1"
}
The above examples cover the most common use cases. Most users won’t need to go beyond this point. The sections below are for advanced scenarios like custom LLM handlers and single-tenant deployments.
The schema used in the examples above comes from the Custom Integrations API. Refer to the API reference for the full list of available fields and options.
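Before pasting a payload into the console, a quick client-side sanity check can catch the most common mistakes. This sketch mirrors the constraints shown in the examples above, not the authoritative server-side schema; consult the API reference for the full rules:

```python
def validate_integration(config: dict) -> list:
    """Return a list of problems with a custom-integration payload."""
    problems = []
    auth = config.get("authentication_type")
    if auth not in ("api_key", "oauth2", "none"):
        problems.append(f"unknown authentication_type: {auth!r}")
    if not config.get("endpoint"):
        problems.append("endpoint is required")
    models = config.get("models") or []
    if not models:
        problems.append("models must be a non-empty list")
    if config.get("default_model") not in models:
        problems.append("default_model must be one of models")
    if auth == "api_key":
        for field in ("api_key_header", "api_key_value"):
            if not config.get(field):
                problems.append(f"{field} is required for api_key auth")
    if auth == "oauth2":
        for field in ("oauth2_token_url", "token"):
            if not config.get(field):
                problems.append(f"{field} is required for oauth2 auth")
    return problems


# The no-authentication example from above passes cleanly:
assert validate_integration({
    "authentication_type": "none",
    "endpoint": "https://internal-gateway.local/v1",
    "models": ["internal-model-v1"],
    "default_model": "internal-model-v1",
}) == []
```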

Troubleshooting

  • Make sure you have valid authentication credentials (e.g. an API key).
  • Make sure the model name is exactly as specified through the provider.
  • Make sure that requests are being sent to the provider’s endpoint.
As an example, use this curl command to verify that an API key and model for Portkey have been configured correctly:
curl https://api.portkey.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -d '{
    "model": "YOUR_MODEL",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is Portkey"}
    ],
    "max_tokens": 512
  }'

Advanced usage: custom LLM handlers

Custom LLM handlers are only available on single-tenant Galileo deployments. They require API v1.848.0+ and runners v2.239.0+.
The JSON examples work when your LLM provider exposes a standard OpenAI-compatible /chat/completions endpoint. However, some providers use proprietary request formats, non-standard response structures, or custom authentication flows that can’t be handled by configuration alone. For these cases, you can write a custom LLM handler — a Python class that gives you full control over how Galileo sends requests to your model and interprets the responses.

When to use a custom handler

  • Your provider’s API doesn’t follow the OpenAI /chat/completions format
  • You need to transform requests or responses (e.g., different payload structure, custom headers)
  • Your authentication flow goes beyond OAuth2 or API keys (e.g., signed requests, mTLS)
  • You need custom retry logic or error handling

Writing a handler

Create a Python file with a class that extends litellm.CustomLLM. Your class must implement the acompletion method, which receives the standard LiteLLM inputs and must return a ModelResponse:
# File: proprietary_handler.py

from litellm import CustomLLM
from litellm.types.utils import ModelResponse
import httpx


class ProprietaryLLMHandler(CustomLLM):
    """Custom handler for proprietary LLM API."""

    def __init__(self, timeout: int = 30, retry_count: int = 1):
        super().__init__()
        self.timeout = timeout
        self.retry_count = retry_count

    async def acompletion(
        self,
        model: str,
        messages: list,
        api_base: str,
        custom_llm_provider: str,
        **kwargs
    ) -> ModelResponse:
        """Handle async completion requests to the proprietary API."""

        # Transform messages to provider's format
        payload = self._transform_request(messages, **kwargs)

        # Make the async API call
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{api_base}/generate",
                json=payload,
                timeout=self.timeout
            )
            response.raise_for_status()

        # Transform response to LiteLLM format
        return self._transform_response(response.json(), model)

    def _transform_request(self, messages: list, **kwargs) -> dict:
        """Transform LiteLLM messages to provider format."""
        return {
            "prompt": messages,
            "temperature": kwargs.get("temperature", 0.7),
            "max_tokens": kwargs.get("max_tokens", 1024)
        }

    def _transform_response(self, data: dict, model: str) -> ModelResponse:
        """Transform provider response to LiteLLM format."""
        return ModelResponse(
            id=data.get("id", "response-id"),
            choices=[{
                "message": {
                    "role": "assistant",
                    "content": data["output"]
                },
                "finish_reason": "stop"
            }],
            model=model,
            usage={
                "prompt_tokens": data.get("input_tokens", 0),
                "completion_tokens": data.get("output_tokens", 0),
                "total_tokens": data.get("total_tokens", 0)
            }
        )
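Because the transform methods are plain functions of their inputs, the mapping logic can be exercised without hitting the network. The sketch below duplicates the response mapping with plain dicts (no litellm dependency) so it can be unit-tested in isolation; `transform_response` is an illustrative stand-in, not part of the handler API:

```python
def transform_response(data: dict, model: str) -> dict:
    """Mirror of _transform_response above, using plain dicts for testing."""
    return {
        "id": data.get("id", "response-id"),
        "choices": [{
            "message": {"role": "assistant", "content": data["output"]},
            "finish_reason": "stop",
        }],
        "model": model,
        "usage": {
            "prompt_tokens": data.get("input_tokens", 0),
            "completion_tokens": data.get("output_tokens", 0),
            "total_tokens": data.get("total_tokens", 0),
        },
    }


# A hypothetical provider response and its mapped form:
resp = transform_response(
    {"id": "r-1", "output": "Hello!", "input_tokens": 3,
     "output_tokens": 2, "total_tokens": 5},
    "proprietary-model",
)
assert resp["choices"][0]["message"]["content"] == "Hello!"
assert resp["usage"]["total_tokens"] == 5
```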

Configuring the handler in your integration payload

Reference your handler using the custom_llm_config field:
  • file_name (string, required): Python file containing the CustomLLM class (e.g., "proprietary_handler.py").
  • class_name (string, required): Name of the class; must be a litellm.CustomLLM subclass.
  • init_kwargs (object, optional): Keyword arguments passed to the handler’s constructor.
Example payload — custom handler without authentication:
{
  "authentication_type": "none",
  "models": ["proprietary-model"],
  "endpoint": "https://ai.example.com/inference",
  "default_model": "proprietary-model",
  "custom_llm_config": {
    "file_name": "proprietary_handler.py",
    "class_name": "ProprietaryLLMHandler",
    "init_kwargs": {
      "timeout": 60,
      "retry_count": 3
    }
  }
}
Example payload — custom handler with OAuth2:
{
  "authentication_type": "oauth2",
  "models": ["enterprise-llm-v2"],
  "endpoint": "https://enterprise-ai.company.com/api/v2",
  "token": "{\"client_id\": \"enterprise_app\", \"client_secret\": \"secret_xyz\"}",
  "default_model": "enterprise-llm-v2",
  "authentication_scope": "inference.execute",
  "oauth2_token_url": "https://auth.company.com/oauth2/token",
  "custom_llm_config": {
    "file_name": "enterprise_handler.py",
    "class_name": "EnterpriseLLMAdapter"
  }
}

Deploying handler files

Handler files must be placed on the Galileo runners container filesystem. Two environment variables control how they are loaded:
  • GALILEO_CUSTOM_LLMS_ENABLED (boolean, default "false"): Set to true to enable custom LLM support.
  • GALILEO_CUSTOM_LLMS_DIRECTORY (string, default "/opt/custom_llms"): Directory where handler files are located.
Place your .py files directly in the configured directory (nested paths are not supported):
/opt/custom_llms/proprietary_handler.py
/opt/custom_llms/enterprise_adapter.py
Deployment options:
  1. Volume mount — Mount a volume containing your handler files at /opt/custom_llms
  2. Custom image — Build a custom runner image with handler files copied in
  3. Custom directory — Set GALILEO_CUSTOM_LLMS_DIRECTORY to a different path

Security notes

  • Tokens are encrypted before storage
  • OAuth2 client credentials should be kept confidential and rotated regularly
  • Custom LLM handler files should be reviewed for security before deployment