Configuration reference

All router configuration is via environment variables (12-factor). This page is the exhaustive reference; the README has the 60-second quickstart.

Provider API keys
Postgres
Server
Routing
BYOK encryption
Telemetry (OpenTelemetry)
Cluster-routing artifacts

Provider API keys

The router registers each upstream provider only when its API key is present in the environment. Anthropic is special: when ANTHROPIC_API_KEY is unset, the router still registers the provider but forwards Anthropic auth headers (Authorization / x-api-key) to api.anthropic.com directly. This lets Claude Code keep using the user's logged-in plan.

Variable	Default	Effect
`OPENROUTER_API_KEY`	(none)	Recommended baseline. Enables OpenRouter and the full OSS-model pool the cluster scorer is trained against.
`OPENROUTER_BASE_URL`	`https://openrouter.ai/api/v1`	Override for OpenRouter or any OpenAI-compatible endpoint (vLLM, Together, Fireworks, self-hosted).
`ANTHROPIC_API_KEY`	(none — passthrough)	Router's own Anthropic key. When unset, client `Authorization` headers pass through.
`OPENAI_API_KEY`	(none)	Enables the OpenAI provider (Chat Completions API).
`OPENAI_BASE_URL`	`https://api.openai.com`	Override for OpenAI (e.g. Azure OpenAI).
`GOOGLE_API_KEY`	(none)	Enables Gemini via its OpenAI-compatible endpoint.
`GOOGLE_BASE_URL`	`https://generativelanguage.googleapis.com/v1beta/openai`	Override for Gemini.

BYOK (per-installation keys). Instead of (or in addition to) the env vars above, each installation can supply its own provider keys via the dashboard. Those are stored in Postgres and used only for that installation's traffic. See BYOK encryption.

Postgres

Set DATABASE_URL directly, or compose it from the individual vars:

Variable	Default	Purpose
`DATABASE_URL`	(none)	Full connection string (takes precedence).
`POSTGRES_USER`	(required if no `DATABASE_URL`)	Username.
`POSTGRES_PASSWORD`	(required if no `DATABASE_URL`)	Password.
`POSTGRES_DB`	(required if no `DATABASE_URL`)	Database name.
`POSTGRES_HOST`	(required if no `DATABASE_URL`)	Hostname.
`POSTGRES_PORT`	`5432`	Port.
`POSTGRES_SSLMODE`	`require`	TLS mode. Use `disable` for local Docker.
`POSTGRES_CONNECTION_NAME`	(none)	Cloud SQL Auth Proxy instance connection name.

Server

Variable	Default	Purpose
`PORT`	`8080`	HTTP listen port.
`ROUTER_DEPLOYMENT_MODE`	`selfhosted`	`selfhosted` mounts `/ui/` and `/admin/v1/`. `managed` skips both (for SaaS deployments with a separate admin UI).
`ROUTER_ADMIN_PASSWORD`	`admin`	Dashboard password. Defaults to `admin` with a startup warning when unset — set this for any internet-facing deployment.

Routing

Variable	Default	Purpose
`ROUTER_CLUSTER_VERSION`	(reads `artifacts/latest`)	Pin a specific cluster artifact version (e.g. `v0.27`).
`ROUTER_CLUSTER_EMBED_TIMEOUT_MS`	`200`	Per-request ONNX embed timeout. Increase for slower hosts.
`ROUTER_EMBED_ONLY_USER_MESSAGE`	`true`	Feed only user-role text to the embedder. Set `false` to embed the full concatenated turn.
`ROUTER_STICKY_DECISION_TTL_MS`	`0` (disabled)	Reuse a routing decision per API key for this many ms.
`ROUTER_SESSION_PIN_ENABLED`	`true`	Pin a session to its first-routed model so multi-turn conversations stay coherent.
`ROUTER_HARD_PIN_MODEL`	(none)	Force every request to a specific model, bypassing the cluster scorer. Debugging only.
`ROUTER_HARD_PIN_PROVIDER`	(none)	Pair with `ROUTER_HARD_PIN_MODEL`.
`ROUTER_ONNX_ASSETS_DIR`	`/opt/router/assets`	Directory containing `model.onnx` + `tokenizer.json`.
`ROUTER_ONNX_LIBRARY_DIR`	(system default)	Path to `libonnxruntime` (e.g. `/opt/homebrew/lib` on Apple Silicon).

If the cluster scorer can't run (missing model, embed timeout, etc.), the router returns HTTP 503 — it does not silently fall back to a default model. Failures are loud by design.

BYOK encryption

Variable	Default	Purpose
`EXTERNAL_KEY_ENCRYPTION_KEY`	(unset)	Tink AES-256-GCM keyset (JSON) that encrypts customer-supplied upstream provider keys at rest.

If unset, BYOK secrets are stored unencrypted and the router logs a WARN at startup. Set this in any deployment that handles real customer secrets. Generate with:

tinkey create-keyset --key-template AES256_GCM --out-format json

A malformed keyset still fails closed (the router refuses to boot); only a genuinely absent value triggers the unencrypted bypass.

Telemetry (OpenTelemetry)

The router exports per-request trace spans to any OTLP-compatible collector. Each proxied request emits two spans (router.decision and router.upstream) with routing decisions, token usage, cost estimates, and latency. Export is async/non-blocking; when OTEL_EXPORTER_OTLP_ENDPOINT is unset, OTel is fully disabled at zero runtime cost.

Variable	Default	Purpose
`OTEL_EXPORTER_OTLP_ENDPOINT`	(disabled)	Collector base URL (e.g. `https://api.honeycomb.io`). Required to enable.
`OTEL_EXPORTER_OTLP_HEADERS`	(none)	Comma-separated `key=value` headers (e.g. auth tokens).
`OTEL_EXPORTER_OTLP_TIMEOUT`	`10000`	Per-export HTTP timeout in ms.
`OTEL_SERVICE_NAME`	`router`	`service.name` resource attribute.
`OTEL_RESOURCE_ATTRIBUTES`	(none)	Comma-separated `key=value` resource attributes.
`OTEL_BSP_MAX_QUEUE_SIZE`	`1000`	Span queue capacity. Spans drop when full.
`OTEL_BSP_MAX_EXPORT_BATCH_SIZE`	`50`	Max spans per OTLP POST.
`OTEL_BSP_SCHEDULE_DELAY`	`500`	Partial-batch flush interval in ms.
`OTEL_EXPORT_WORKERS`	`2`	Export-goroutine count.

The first five follow the OTel SDK env spec; OTEL_BSP_* follows the Batch Span Processor spec. OTEL_EXPORT_WORKERS is a router-specific extension.

Cluster-routing artifacts

The cluster scorer needs two files at runtime: model.onnx (the INT8-quantized embedder) and tokenizer.json. Neither is committed to git — both come from the public jinaai/jina-embeddings-v2-base-code HuggingFace repo. We use Jina's own INT8 export; we don't maintain our own quantization.

Docker (default): the Dockerfile downloads both files at image build time into /opt/router/assets/. Nothing for you to do.

make dev (host-mode hot reload): fetch them once into a local directory and point ROUTER_ONNX_ASSETS_DIR at it:

mkdir -p assets
BASE="https://huggingface.co/jinaai/jina-embeddings-v2-base-code/resolve/516f4baf13dec4ddddda8631e019b5737c8bc250"
curl -L "$BASE/onnx/model_quantized.onnx" -o assets/model.onnx
curl -L "$BASE/tokenizer.json" -o assets/tokenizer.json
echo "ROUTER_ONNX_ASSETS_DIR=$(pwd)/assets" >> .env.local

The pinned revision matches HF_MODEL_REVISION in the Dockerfile, so local dev and the container build use the same weights. Bump both together if you want a newer upstream export.

The committed cluster artifacts (centroids, rankings, model registry, metadata) live under internal/router/cluster/artifacts/v<X.Y>/. The artifacts/latest pointer selects the default served version; ROUTER_CLUSTER_VERSION overrides per-deployment.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Configuration reference

Table of contents

Provider API keys

Postgres

Server

Routing

BYOK encryption

Telemetry (OpenTelemetry)

Cluster-routing artifacts

FilesExpand file tree

CONFIGURATION.md

Latest commit

History

CONFIGURATION.md

File metadata and controls

Configuration reference

Table of contents

Provider API keys

Postgres

Server

Routing

BYOK encryption

Telemetry (OpenTelemetry)

Cluster-routing artifacts