Problem
request_id_middleware reads X-Org-Id (and X-Workspace-Id) straight off the inbound request (src/lib.rs TracingIds::from_headers) and:
- attaches the value to the tracing span — every log line within the request carries it
- forwards it to the upstream vLLM/SGLang engine (via ProxyOpts.tracing_ids → apply_tracing_headers in src/proxy.rs)
- emits it on the per-request "request completed" structured log line
The assumption baked into PR #130 is that only cloud-api can reach inference-proxy, so X-Org-Id is always set by a trusted hop. If that assumption ever breaks — direct ingress, a misconfigured firewall rule, an attacker who hits the proxy on its raw port — a caller can spoof an arbitrary org_id into both inference-proxy structured logs and vLLM logs. Effects:
- Pollute another tenant's usage view in Datadog (@org_id:<victim> queries return the attacker's requests)
- Hide their own requests by setting an unowned org_id
- Confuse any downstream pipeline that ingests vLLM logs and trusts the header
Why this isn't fixable today
cloud-api's POST /v1/check_api_key returns only HTTP status (200/402/429/5xx) — see src/auth.rs lines 84–180, which drains the body without parsing it. inference-proxy has no way to learn the authoritative org_id from the API key it just validated, so it can't override the inbound header with a server-side truth value.
Options (rough preference order)
- Have cloud-api return org_id in the /v1/check_api_key response body, then have inference-proxy use that as the source of truth and either ignore inbound X-Org-Id for sk- requests or hard-fail on mismatch. This is the only fix that survives a misconfigured network ACL.
- Document the network-ACL assumption in the middleware (`src/lib.rs::request_id_middleware` doc comment) so a future operator doesn't open up direct ingress without thinking about it.
- For admin-token (config.token) requests, trust the header (these come from a trusted gateway). For sk- requests with no cloud-api-supplied org_id, refuse to log/forward the org_id at all.
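A rough sketch of how the first and third options could combine into one decision function. Everything here is illustrative: `AuthKind`, `OrgIdDecision`, `resolve_org_id`, and the `hard_fail_on_mismatch` flag are hypothetical names, not existing code in inference-proxy, and the sketch assumes cloud-api has already been extended to return a verified org_id (which it does not do today).

```rust
/// How the caller authenticated. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AuthKind {
    /// Request carried the admin token (config.token), i.e. came via a trusted gateway.
    AdminToken,
    /// Request carried an sk- API key that cloud-api validated.
    ApiKey,
}

/// What the middleware should do with org_id. Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
enum OrgIdDecision {
    /// Attach this org_id to the span, logs, and upstream headers.
    Use(String),
    /// Do not log or forward any org_id.
    Omit,
    /// Inbound header contradicts the verified value; reject the request.
    Reject,
}

/// Decide which org_id (if any) is safe to log and forward.
///
/// `verified_org_id` is the value cloud-api would return from
/// /v1/check_api_key under option 1 (an assumption; not available today).
fn resolve_org_id(
    auth: AuthKind,
    inbound_header: Option<&str>,
    verified_org_id: Option<&str>,
    hard_fail_on_mismatch: bool,
) -> OrgIdDecision {
    match (auth, verified_org_id) {
        // Admin-token requests: trust the header as sent by the gateway.
        (AuthKind::AdminToken, _) => match inbound_header {
            Some(h) => OrgIdDecision::Use(h.to_owned()),
            None => OrgIdDecision::Omit,
        },
        // sk- request with a verified org_id: the verified value always wins.
        (AuthKind::ApiKey, Some(verified)) => match inbound_header {
            Some(h) if h != verified && hard_fail_on_mismatch => OrgIdDecision::Reject,
            _ => OrgIdDecision::Use(verified.to_owned()),
        },
        // sk- request without a verified org_id: never trust the header.
        (AuthKind::ApiKey, None) => OrgIdDecision::Omit,
    }
}
```

With this shape, a spoofed `X-Org-Id` on an sk- request is either overwritten by the verified value, rejected (when hard-fail is enabled), or dropped entirely when cloud-api supplied nothing, so the header alone never reaches the logs or the upstream engine.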
Acceptance
- inference-proxy logs and vLLM logs carry an org_id that the caller cannot spoof
- ...or, if we accept the network-ACL approach, the trust boundary is explicitly documented and there's a CI/runtime check that the proxy isn't reachable from outside cloud-api's network
Context