Mirror notice. Verbatim sync with AGENTS.md. Claude Code reads
CLAUDE.md; Cursor + generic agents readAGENTS.md. Update both together — divergence = bug.
Root guide for AI agents in the router/ subproject. Covers cross-cutting design + the layer model. First read for any task: README, then this file. Then read the CLAUDE.md inside the package you're editing — each subpackage has its own with focused recipes + invariants.
- Patterns of Enterprise Application Architecture (Fowler)
- Designing Data-Intensive Applications (Kleppmann)
- Design Patterns (GoF)
- CLEAN architecture (Martin) — especially dependency inversion
- DRY
- Small expert team — explicit composition, readable wiring; reject DI containers, reflection, framework magic
- Concise comments, sparingly — default to none. Only when why is non-obvious (hidden constraint, subtle invariant, workaround, surprising behavior). Never rehash code, never reference current task/PR/caller, no multi-paragraph. If removing wouldn't confuse, don't write.
- Non-tautological tests — every test must assert behavior that breaks if prod code deleted.
Three concentric layers. Imports flow inward only.
+-------------------------------------------------------------------+
| cmd/router/main.go (composition root — wires all) |
| |
| +-------------------------------------------------------------+ |
| | internal/api/admin (presentation: /health, /validate,| |
| | /admin/v1/*) | |
| | internal/api/anthropic (/v1/messages, passthrough, | |
| | /v1/route) | |
| | internal/api/openai (/v1/chat/completions) | |
| | internal/api/gemini (/v1beta/models/:modelAction) | |
| | internal/server (route registration) | |
| | internal/server/middleware (auth, timeout, cluster/embed | |
| | overrides, OTel timing) | |
| | internal/postgres (adapter: SQLC over pgx; also | |
| | session-pin store impl) | |
| | internal/sqlc (generated; regenerate via | |
| | `make generate`) | |
| | internal/router/cluster (Router impl: AvengersPro, | |
| | Multiversion) | |
| | internal/providers/* (Client impls: anthropic, openai, | |
| | google native, openaicompat, | |
| | noop, httputil) | |
| | internal/observability/otel (span emitter; adapter) | |
| | | |
| | +-------------------------------------------------------+ | |
| | | internal/auth (identity domain: types, | | |
| | | repos, Service.VerifyAPIKey, | | |
| | | APIKeyCache, id/hashing, Tink | | |
| | | encryptor) | | |
| | | internal/proxy (routing/dispatch service: | | |
| | | Route, ProxyMessages, | | |
| | | ProxyOpenAIChatCompletion, | | |
| | | ProxyGemini; turn loop, | | |
| | | handover adapter, cache writer, | | |
| | | session-key derivation) | | |
| | | internal/proxy/usage (Anthropic unified-limit | | |
| | | observer for usage-bypass) | | |
| | | internal/router (Router iface + Request/Decision | | |
| | | + ModelSpec/ModelRegistry) | | |
| | | internal/router/cache (semantic response cache) | | |
| | | internal/router/catalog (single source of truth | | |
| | | for per-model data: | | |
| | | tier, provider bindings, | | |
| | | pricing, cost math) | | |
| | | internal/router/handover (Summarizer iface + | | |
| | | envelope rewrite) | | |
| | | internal/router/planner (cache-aware EV policy) | | |
| | | internal/router/sessionpin (Pin types + Store iface) | | |
| | | internal/router/turntype (turn-type detector) | | |
| | | internal/providers (Client iface + types + canonical | | |
| | | Provider* name constants) | | |
| | | internal/translate (cross-format wire-format | | |
| | | conversion: OpenAI <-> Anthropic | | |
| | | <-> Gemini; pure, no I/O) | | |
| | | internal/sse (zero-alloc SSE framing helpers) | | |
| | +-------------------------------------------------------+ | |
| +-------------------------------------------------------------+ |
| |
| internal/config (env helpers: MustGet, GetOr) |
| internal/observability (slog logger + gin middleware) |
+-------------------------------------------------------------------+
- Layering is load-bearing. Imports flow inward only. Inner-ring packages must not import adapter or presentation packages; adapters never import each other; only
cmd/router/main.goconstructs concrete things. Inner-ring packages may import each other (e.g.proxy.Service.Routereturnsrouter.Decision;proxy.Servicecallstranslate,sessionpin,planner,handover,cache,pricing,capability,turntype,usageto compose a turn). - Small utility third-party libs allowed at every layer. Layering = about where I/O and behavior live, not banning go.mod entries. Reach for vetted small lib (
golang-lru,uuid, error helpers) before rolling own. Reject heavyweight frameworks (DI containers, ORMs, metaprogramming kits). - Inner-ring packages are I/O-free.
internal/router,internal/providers,internal/translate,internal/sse,internal/router/{cache,capability,handover,planner,pricing,sessionpin,turntype},internal/proxy/usagedefine interfaces, value types, pure functions only. Adding I/O method (HTTP, DB, queue, FS) = layering violation; put onauth.Service/proxy.Serviceor adapter subpackage. Pure-Go utility libs fine. - Adapters depend only on inner ring.
internal/postgresmay also importinternal/sqlc. Adapters never import each other —internal/postgresdoesn't knowinternal/api/adminetc. Note: provider adapters (internal/providers/<name>/) importinternal/proxyforOnUpstreamMetacallback so streaming responses record usage/headers back to proxy — one of few inward-pointing adapter→inner-ring imports, intentional. internal/api/*andinternal/serverdepend oninternal/auth(Service handle + middleware-context types) andinternal/proxy(routing/dispatch service handle). May importinternal/observabilityfor logging,internal/providersfor shared sentinel errors,internal/router/clusterforErrClusterUnavailablesentinel +DeployedModelsSourceinterface (API handlers map sentinel → HTTP 503). Must not importinternal/postgres, any concreteinternal/providers/*adapter, orinternal/translatedirectly. Concrete instances reach presentation only via constructor params from composition root.internal/configandinternal/observabilityare leaf utilities — must not import any other package underinternal/. Third-party utility deps fine; today pull only stdlib + gin (request-scoped logger middleware).internal/observability/otelsubpackage is an adapter (builds real OTLP exporter) and can import other internal packages; parentinternal/observabilitystays a leaf.- Composition happens in
cmd/router/main.go. Only file that constructs concrete adapters + injects them. No other place wires things. Seecmd/CLAUDE.md.
If wanting to import something that violates these rules, design is wrong — surface as interface in appropriate inner-ring package and implement in adapter subpackage.
Pick by responsibility, then read that package's CLAUDE.md:
| Responsibility | Package | Guide |
|---|---|---|
| HTTP endpoint (handler + route) | internal/api/<group>/ |
internal/api/CLAUDE.md |
| Identity / API-key logic | internal/auth (method on *Service) |
internal/auth/CLAUDE.md |
| Routing / dispatch / per-turn orchestration | internal/proxy (method on *Service) |
internal/proxy/CLAUDE.md |
| Cross-format wire conversion (no I/O) | internal/translate |
internal/translate/CLAUDE.md |
| New upstream provider | internal/providers/<name>/ |
internal/providers/CLAUDE.md |
New Router implementation |
internal/router/<name>/ |
internal/router/CLAUDE.md |
| Cluster scorer / artifacts | internal/router/cluster/ |
internal/router/cluster/CLAUDE.md |
| Cache-aware turn routing internals | internal/router/{planner,handover,cache,sessionpin,pricing,capability,turntype}/ |
each has its own CLAUDE.md |
| Anthropic usage-bypass gate | internal/proxy/usage |
internal/proxy/usage/CLAUDE.md |
| New column / SQL query | db/queries/ + internal/postgres/ |
db/CLAUDE.md, internal/postgres/CLAUDE.md |
Doc under docs/ |
docs/ |
docs/CLAUDE.md |
Default rule: put logic in the package that uses it. Only promote to a shared home (auth, proxy, translate, config, observability, sse) when 3+ packages need the same logic.
Don't, unless same logic needed in 3+ places and no plausible existing home. Canonical homes:
- Auth helpers (token prefix, ID gen, hashing, encryption) →
internal/authalongside types they support. - Env parsing →
internal/config. - Logging / tracing →
internal/observability(OTel exporter inotelsubpackage). - SSE framing →
internal/sse.
If new helper doesn't fit, justify new package in code comment before creating.
- No magic strings for provider/model names. Use named constants from
internal/providers(providers.ProviderAnthropic,providers.ProviderOpenAI,providers.ProviderGoogle,providers.ProviderOpenRouter,providers.ProviderFireworks) everywhere provider names appear as values — map keys, switch cases,router.Decision.Providerliterals, log fields, test fixtures. For new model name constants, add to appropriate package before use. Bare string literals = review-blocking. - Keep files small. Split distinct logic into separate files, especially when shared between multiple places.
- Avoid unnecessary nesting — flatten conditionals with early returns + combined conditions.
- All exported symbols carry godoc starting with symbol name (
Foo does Xor// Foo is …). - Errors flow up. Don't swallow; don't log-and-continue on request path.
fireMarkUsedin service.go is the one documented exception (best-effort, off request path). - Use
errors.Is/errors.As, never==or!=on errors. For no-rows checks:errors.Is(err, sql.ErrNoRows). - Use
slog(viaobservability.Get/observability.FromGin), notfmt.Printlnorlog.Print*. - Sentinel errors typed (
var ErrFoo = errors.New(...)) + live in same package as function that returns them. HTTP layer maps to status codes; do not export HTTP semantics from inner-ring packages. - Constructor injection over package-level singletons. Inject clock (
auth.Clock = func() time.Time), logger, HTTP client, etc.
- Tests live next to code (
foo_test.gonext tofoo.go). Prefer<pkg>_testexternal test packages so public API exercised; use internal package only when test needs unexported state (*_internal_test.gofiles ininternal/proxyare canonical). - Real assertions only. Compare value code-under-test produced to value test author chose. Tautological assertions (
x == x, "constructor returns instance", "mock called with X") rejected. - Use
testify/assert+testify/require. Userequire.Eventuallyfor async (see service_test.gofireMarkUsedassertion). - In-memory fakes for repos/routers/provider clients are cheap + far better than mocks for unit testing Service.
- No DB-backed integration tests in
internal/. If need real Postgres,docker composestack is runtime fixture; write scripts underscripts/rather than*_test.go.
- Log message explains in plain English what happened. Include
err+ relevant context (IDs, counts, status codes). - Keep log statements on single line with all args inline.
- Use
log.With("key", value)to attach repeated context once, rather than repeating same key-value on every call. - snake_case for log attribute keys (
api_key_id, notapiKeyID). - Log
Debugfor routine ops (auth checks, repo calls);Infofor major business events (server start, key issuance). ReserveErrorfor genuine failures needing on-call attention; auth-401 isDebug, notError. - Never log raw bearer tokens or hashes. 8-char prefix + 4-char suffix (
KeyPrefix/KeySuffixcolumns onauth.APIKey) are safe; full token is not.
- Never import code from outside this subproject. Router is standalone Go module (
module workweave/router) with no cross-project deps. If need utility from elsewhere in monorepo, copy into appropriateinternal/package with own godoc. - Never write raw SQL outside
db/queries/or callpgx.Pooldirectly from anywhere exceptinternal/postgres/. SQLC is only data mapper. - Never reach across layers. Handler in
internal/api/calling*sqlc.Queriesdirectly = layering violation; surface Service method instead. Repo calling another repo = layering violation; put orchestration inauth.Service/proxy.Service. - Never add FKs to tables outside router's own schema. Such tables don't exist in this project.
organization_id+created_by= opaque external strings, not FKs. - Never panic on request path. Reserve
panicfor startup-time fail-fast (config.MustGet, cluster-scorer boot failure, invalidROUTER_DEPLOYMENT_MODE) where misconfiguration must abort process. - Never introduce DI container, reflection-based wiring, or service locator. Composition = plain Go function calls in
cmd/router/main.go. - Never log secrets, raw API keys, or full request bodies. First 8 + last 4 chars on
auth.APIKeyare safe form. BYOK secrets at rest go throughauth.Encryptor(Tink AES-256-GCM); plaintext only in memory for request lifetime. - Never edit generated files under
internal/sqlc/. Regenerate withmake generate. SQLC's "DO NOT EDIT" header is load-bearing.
ROUTER_DEPLOYMENT_MODE read at boot in cmd/router/main.go:
selfhosted(default): full dashboard at/ui/*,/admin/v1/*API (auth, metrics, keys, provider-keys, config, excluded-models), dashboard cookie auth all mounted. Provider keys read from env vars; missing keys keep providers registered for client-passthrough but exclude from hard-pin resolution.managed: dashboard +/admin/v1/*not mounted at all — Weave-managed deploys have separate control plane. Every provider registered with empty deployment key; proxy service in BYOK-only mode, so request without BYOK or client-supplied auth for chosen provider 400s rather than silently spending platform budget. Setting variable to any other value panics at boot.
When adding new endpoint, put inside selfhosted block in server.Register unless part of product surface (/v1/*, /v1beta/*, /health, /validate). Do not re-expose admin surface in managed mode.
The eval harness is a sibling Poetry package, not in this repo — lives at router-internal/eval/ in the WorkWeave monorepo and runs as a Modal app. It exercises the router via staging headers; see that package's README.
Per-request router selection (server side):
internal/server/middleware.WithClusterVersionOverridereadsx-weave-cluster-version: v0.Xheader + stashes version on request context.cluster.Multiversion.Routereads viacluster.VersionFromContext+ dispatches to matchingScorer. Customer traffic (no header) always serves deployment's default version (ROUTER_CLUSTER_VERSION→artifacts/latest).WithEmbedOnlyUserMessageOverridehonorsx-weave-embed-only-user-message: true|falseheader, flipping proxy between embedding user-role text only (default) + concatenated turn stream.
What to NOT do:
- Do NOT re-introduce heuristic-vs-cluster A/B switch. Heuristic retired because silent-fallback behavior masked cluster regressions. If need to compare strategies, ship alternate strategy as another
internal/router/Xpackage + promote on own merits, not as runtime fallback. - Do NOT add runtime override that picks specific model directly. Whole point is comparing router strategies, not per-request model overrides.
- "Should I commit
internal/sqlc/?" Yes. Dockerfile + CI builds depend on generated code being present. Runmake generatebefore committing migration or query changes. - "How do I run one-off query against local DB?"
docker compose exec postgres psql -U router -d router. Migrate step has already applied schema. (Router lives inrouterschema; pool'sAfterConnecthook pinssearch_pathso accidental writes topublic.*are impossible.) - "How do I add new OpenAI-compatible upstream?" Add
*BaseURLconstant tointernal/providers/openaicompat, provider-name constant + env-var entry tointernal/providers/provider.go, registration block incmd/router/main.go. No new adapter package. See internal/providers/CLAUDE.md.