Headroom + LiteLLM Gateway for Anthropic, OpenRouter, and Gemini

This setup exposes a single OpenAI-compatible endpoint that routes requests to multiple upstream LLM providers through LiteLLM. Headroom compresses and optimizes the request context before it is sent upstream, which reduces the size of each upstream request.

Goals

- Single local gateway endpoint for all clients
- Support for:
  - Anthropic
  - OpenRouter
  - Google Gemini
- Headroom-based context optimization on top of LiteLLM
- Docker Compose deployment
- Easy model aliasing for coding tools and agent frameworks

Architecture

The recommended layout is:

- LiteLLM acts as the main OpenAI-compatible gateway
- Headroom is installed into the same container
- Headroom is enabled as a LiteLLM callback
- Clients connect only to LiteLLM
- LiteLLM routes requests to the correct upstream provider

Why this design?

This layout is a robust default because:

- LiteLLM has broad provider support and is actively maintained
- Headroom can integrate directly into LiteLLM
- Gemini works best when routed directly via LiteLLM instead of going through OpenRouter
- OpenRouter is still available for models you specifically want there
- Clients only need one base URL and one API key for the local gateway

Recommended provider routing

Use the following policy unless you have a strong reason to do otherwise:

- Anthropic: direct via `anthropic/...`
- Gemini: direct via `gemini/...`
- OpenRouter: via `openrouter/...`

Why not Gemini through OpenRouter by default?

Gemini can often work through OpenRouter, but it adds another translation layer:

client -> LiteLLM -> OpenRouter -> Gemini

That increases the chance of compatibility issues, especially for chat formatting, tools, or multimodal requests. Direct Gemini routing through LiteLLM is usually cleaner and more predictable.

Project structure

llm-gateway/
├─ docker-compose.yaml
├─ Dockerfile
├─ litellm_config.yaml
├─ .env              # private — contains API keys, never commit
├─ .env.example      # template with placeholders
├─ requirements.txt  # Python test dependencies (openai, anthropic, google-genai, httpx, python-dotenv)
├─ test_e2e.py       # E2E integration test suite
└─ data/
   └─ headroom/
       └─ .gitkeep  # SQLite store for Headroom context optimization

Quick start

# 1. Copy and fill in your API keys
cp .env.example .env
# edit .env with real keys

# 2. Build and start the gateway
docker compose up --build -d

# 3. Wait for the gateway to be healthy
curl http://localhost:4000/health \
  -H "Authorization: Bearer sk-your-actual-master-key"

# 4. Run the E2E test suite
pip install -r requirements.txt
python test_e2e.py
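
Step 3 can be scripted instead of retried by hand. The sketch below polls the `/health` endpoint from the quick start until it answers with HTTP 200. The function names, retry count, delay, and timeout are illustrative assumptions and are not taken from the repository.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:4000/health"  # endpoint used in the quick start


def probe_health(api_key, url=HEALTH_URL):
    """Return True if the gateway answers the health check with HTTP 200."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not healthy yet.
        return False


def wait_until_healthy(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_healthy(lambda: probe_health("sk-change-me"))`, with the placeholder replaced by the master key from your `.env`.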

Available model aliases

| Alias | Upstream provider + model |
| --- | --- |
| `claude-sonnet` | Anthropic `claude-sonnet-4-6` |
| `claude-opus` | Anthropic `claude-opus-4-6` |
| `gemini-flash` | Google Gemini `gemini-3-flash-preview` |
| `gemini-pro` | Google Gemini `gemini-3.1-pro-preview` |
| `openrouter-minimax` | OpenRouter `minimax/minimax-m2.7` |
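
Combined with the routing policy, these aliases map onto LiteLLM's `model_list` format, where `model_name` is the alias clients send and `litellm_params.model` is the provider-prefixed upstream model. The sketch below builds that structure in Python for illustration only. The environment variable names are assumptions, and the repository's actual `litellm_config.yaml` may differ.

```python
import json

# (alias, provider prefix, upstream model, API key env var)
# The env var names are assumptions for this sketch.
ALIASES = [
    ("claude-sonnet", "anthropic", "claude-sonnet-4-6", "ANTHROPIC_API_KEY"),
    ("claude-opus", "anthropic", "claude-opus-4-6", "ANTHROPIC_API_KEY"),
    ("gemini-flash", "gemini", "gemini-3-flash-preview", "GEMINI_API_KEY"),
    ("gemini-pro", "gemini", "gemini-3.1-pro-preview", "GEMINI_API_KEY"),
    ("openrouter-minimax", "openrouter", "minimax/minimax-m2.7", "OPENROUTER_API_KEY"),
]


def build_model_list(aliases=ALIASES):
    """Return model_list entries: one alias per provider-prefixed upstream model."""
    return [
        {
            "model_name": alias,
            "litellm_params": {
                "model": f"{prefix}/{model}",
                # LiteLLM resolves "os.environ/NAME" from the environment at startup.
                "api_key": f"os.environ/{env_var}",
            },
        }
        for alias, prefix, model, env_var in aliases
    ]


if __name__ == "__main__":
    print(json.dumps({"model_list": build_model_list()}, indent=2))
```

Keeping the provider prefix separate from the model name makes the routing decision explicit: moving an alias from OpenRouter to a direct provider is a one-field change.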

Testing

Anthropic

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-change-me" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'

Google

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-change-me" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-flash",
    "messages": [
      {"role": "user", "content": "Summarize why direct Gemini routing is useful."}
    ]
  }'

OpenRouter

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-change-me" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter-minimax",
    "messages": [
      {"role": "user", "content": "Write a short Go function that reverses a string."}
    ]
  }'
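
The same requests can be issued from Python. This sketch uses only the standard library and reuses the gateway URL and placeholder key from the curl examples above; `build_request` and `ask` are illustrative names, not part of the repository.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:4000/v1/chat/completions"


def build_request(alias, prompt, api_key="sk-change-me", url=GATEWAY_URL):
    """Build the POST request that the curl examples send for a given alias."""
    body = json.dumps(
        {"model": alias, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(alias, prompt, api_key="sk-change-me"):
    """Send the request to the gateway and return the first choice's text."""
    with urllib.request.urlopen(build_request(alias, prompt, api_key), timeout=60) as resp:
        payload = json.load(resp)
    # OpenAI-compatible responses carry the reply under choices[0].message.content.
    return payload["choices"][0]["message"]["content"]
```

With the gateway running, `ask("gemini-flash", "Say hello in one sentence.")` exercises the same path as the Gemini curl example. Only the alias changes between providers.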

E2E test suite

The test_e2e.py script validates both direct-provider connectivity and end-to-end gateway routing for all 5 models:

# Full suite (direct + gateway tests)
python test_e2e.py

# Direct provider tests only (no Docker needed)
python test_e2e.py --direct-only

# Gateway tests only (assumes container is already running)
python test_e2e.py --gateway-only --skip-health

# Custom .env path
python test_e2e.py --env /path/to/.env

Exit code 0 = all tests passed. Exit code 1 = at least one failure (check printed results).
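
The exit-code contract makes the suite easy to gate on in CI. As a rough illustration of that contract, the sketch below turns a mapping of test names to pass/fail flags into the same 0-or-1 exit code. The `results` shape and the function names are hypothetical and are not taken from `test_e2e.py`.

```python
def exit_code(results):
    """Return 0 when every test passed, 1 when at least one failed."""
    return 0 if all(results.values()) else 1


def report(results):
    """Print one line per test, then return the process exit code."""
    for name, passed in sorted(results.items()):
        status = "PASS" if passed else "FAIL"
        print(f"{status}  {name}")
    return exit_code(results)
```

A script following this pattern would end with `sys.exit(report(results))`, so a shell step such as `python test_e2e.py && deploy` only continues when every test passed.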
