Trust boundary: inbound X-Org-Id is forwarded and logged without validation #132

Description

@Evrard-Nil

Problem

request_id_middleware reads X-Org-Id (and X-Workspace-Id) straight off the inbound request (TracingIds::from_headers in src/lib.rs) and:

  1. attaches the value to the tracing span — every log line within the request carries it
  2. forwards it to the upstream vLLM/SGLang engine (via ProxyOpts.tracing_ids → apply_tracing_headers in src/proxy.rs)
  3. emits it on the per-request "request completed" structured log line

The assumption baked into PR #130 is that only cloud-api can reach inference-proxy, so X-Org-Id is always set by a trusted hop. If that assumption ever breaks — direct ingress, a misconfigured firewall rule, an attacker who hits the proxy on its raw port — a caller can spoof an arbitrary org_id into both inference-proxy structured logs and vLLM logs. Effects:

  • Pollute another tenant's usage view in Datadog (@org_id:<victim> queries return the attacker's requests)
  • Hide their own requests by setting an unowned org_id
  • Confuse any downstream pipeline that ingests vLLM logs and trusts the header

Why this isn't fixable today

cloud-api's POST /v1/check_api_key returns only an HTTP status (200/402/429/5xx); see src/auth.rs lines 84–180, which drains the response body without parsing it. inference-proxy therefore has no way to learn the authoritative org_id for the API key it just validated, so it can't override the inbound header with a server-side source of truth.

Options (rough preference order)

  1. Have cloud-api return org_id in the /v1/check_api_key response body, then have inference-proxy use that as the source of truth and either ignore inbound X-Org-Id for sk- requests or hard-fail on mismatch. This is the only fix that survives a misconfigured network ACL.

  2. Document the network-ACL assumption in the middleware (`src/lib.rs::request_id_middleware` doc comment) so a future operator doesn't open up direct ingress without thinking about it.

  3. For admin-token (config.token) requests, trust the header (these come from a trusted gateway). For sk- requests with no cloud-api-supplied org_id, refuse to log/forward the org_id at all.
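
A minimal sketch of how Options 1 and 3 could combine into one decision function. Everything named here (`Credential`, `OrgIdError`, `resolve_org_id`) is hypothetical and not taken from the inference-proxy codebase. It also assumes cloud-api has been changed to return an org_id from /v1/check_api_key, which it does not do today.

```rust
/// How the caller authenticated. Illustrative names, not from the codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Credential {
    /// Request carried the admin token (config.token) from the trusted gateway.
    AdminToken,
    /// Request carried an sk- API key validated through cloud-api.
    ApiKey,
}

#[derive(Debug, PartialEq, Eq)]
enum OrgIdError {
    /// The inbound X-Org-Id disagrees with the value cloud-api reported.
    Mismatch { claimed: String, authoritative: String },
}

/// Decide which org_id (if any) may be attached to spans, logged, and
/// forwarded upstream.
///
/// `authoritative` is the org_id cloud-api would return for the API key
/// (assumed; Option 1). `inbound` is the raw X-Org-Id header value.
fn resolve_org_id(
    credential: Credential,
    authoritative: Option<&str>,
    inbound: Option<&str>,
) -> Result<Option<String>, OrgIdError> {
    match (credential, authoritative, inbound) {
        // Option 3: admin-token requests come from a trusted gateway,
        // so the header is taken as-is.
        (Credential::AdminToken, _, inbound) => Ok(inbound.map(str::to_owned)),
        // Option 1: hard-fail when the header contradicts cloud-api.
        (Credential::ApiKey, Some(auth), Some(claimed)) if claimed != auth => {
            Err(OrgIdError::Mismatch {
                claimed: claimed.to_owned(),
                authoritative: auth.to_owned(),
            })
        }
        // Option 1: cloud-api's value is the source of truth, whether or
        // not a (matching) header was sent.
        (Credential::ApiKey, Some(auth), _) => Ok(Some(auth.to_owned())),
        // Option 3: no authoritative value available, so never log or
        // forward a caller-supplied org_id.
        (Credential::ApiKey, None, _) => Ok(None),
    }
}
```

With this shape, a spoofed header on an sk- request either fails the request or is dropped, and the middleware only ever sees the resolved value. Whether a mismatch should hard-fail or be silently ignored is the open choice from Option 1; the sketch picks hard-fail.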

Acceptance

  • inference-proxy logs and vLLM logs carry an org_id that the caller cannot spoof
  • ...or, if we accept the network-ACL approach, the trust boundary is explicitly documented and there's a CI/runtime check that the proxy isn't reachable from outside cloud-api's network
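
If the network-ACL route is taken, one possible form of the runtime check is a guard on the peer address of each connection. The sketch below is an assumption-laden illustration: `Ipv4Cidr` and `peer_is_trusted` are hypothetical names, it assumes cloud-api's network can be expressed as a list of IPv4 CIDR blocks, and it treats IPv6 peers as untrusted. It would not help if a load balancer sits in front of the proxy, since the peer address would then be the balancer's.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// An IPv4 CIDR block such as 10.0.0.0/8. Hypothetical helper type.
#[derive(Debug, Clone, Copy)]
struct Ipv4Cidr {
    network: Ipv4Addr,
    prefix_len: u8,
}

impl Ipv4Cidr {
    /// Returns None when the prefix length is not in 0..=32.
    fn new(network: Ipv4Addr, prefix_len: u8) -> Option<Self> {
        if prefix_len <= 32 {
            Some(Self { network, prefix_len })
        } else {
            None
        }
    }

    /// True when `addr` falls inside this block.
    fn contains(&self, addr: Ipv4Addr) -> bool {
        // A /0 prefix matches everything; it is handled separately because
        // shifting a u32 by 32 bits would overflow.
        let mask = if self.prefix_len == 0 {
            0
        } else {
            u32::MAX << (32 - u32::from(self.prefix_len))
        };
        (u32::from(addr) & mask) == (u32::from(self.network) & mask)
    }
}

/// True when the connection's peer address lies in one of the allowed blocks.
/// IPv6 peers are rejected in this sketch.
fn peer_is_trusted(peer: IpAddr, allowed: &[Ipv4Cidr]) -> bool {
    match peer {
        IpAddr::V4(v4) => allowed.iter().any(|cidr| cidr.contains(v4)),
        IpAddr::V6(_) => false,
    }
}
```

A request from an untrusted peer could then be rejected outright, or served with X-Org-Id stripped before it reaches the tracing span and the upstream engine.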

Context
