What happened?
The Bedrock passthrough routes under /bedrock/model/{modelId}/... do not enforce the models allowlist configured on the API key (key.models) or on the user (user.models). A virtual key scoped to a specific set of models can freely call any Bedrock model via these routes.
This is particularly impactful for Claude Code CLI users, because Claude Code has a first-class Bedrock integration (CLAUDE_CODE_USE_BEDROCK=1 + AWS_BEARER_TOKEN_BEDROCK=<litellm_virtual_key>) that hits this exact path — so a key intended to grant access to only a limited set of models can be used to call frontier Bedrock models (e.g. claude-opus-4-7) without any warning.
Repro
Tested against a deployment running v1.83.3-stable.patch.2. The code path exists unchanged on main (HEAD as of this writing).
Setup
- Define a model named claude-opus-4-7 in config.yaml pointing at bedrock/global.anthropic.claude-opus-4-7.
- Create a virtual key whose models list does not include claude-opus-4-7 (e.g. it only contains an access group that does not cover this model).
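For reference, a minimal config sketch matching the setup above (the access group name "common-models" is taken from the 401 error message later in this report; exact key-generation details are omitted):

```yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: bedrock/global.anthropic.claude-opus-4-7
# The restricted virtual key is then created with models: ["common-models"],
# an access group that does not include claude-opus-4-7.
```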
Step 1 — Standard routes correctly block (baseline)
All three standard entrypoints enforce can_key_call_model and return 401 key_model_access_denied, as expected:
KEY="<restricted virtual key>"
# /chat/completions
curl -sS -X POST https://<proxy>/chat/completions \
-H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
-d '{"model":"claude-opus-4-7","messages":[{"role":"user","content":"hi"}],"max_tokens":5}'
{"error":{"message":"key not allowed to access model. This key can only access models=['common-models']. Tried to access claude-opus-4-7","type":"key_model_access_denied","param":"model","code":"401"}}
# /v1/messages
curl -sS -X POST https://<proxy>/v1/messages \
-H "x-api-key: $KEY" -H "Content-Type: application/json" -H "anthropic-version: 2023-06-01" \
-d '{"model":"claude-opus-4-7","messages":[{"role":"user","content":"hi"}],"max_tokens":5}'
→ same 401 key_model_access_denied.
# /anthropic/v1/messages
curl -sS -X POST https://<proxy>/anthropic/v1/messages \
-H "x-api-key: $KEY" -H "Content-Type: application/json" -H "anthropic-version: 2023-06-01" \
-d '{"model":"claude-opus-4-7","messages":[{"role":"user","content":"hi"}],"max_tokens":5}'
→ same 401 key_model_access_denied.
Step 2 — Bedrock passthrough routes bypass the ACL
All four Bedrock passthrough variants return 200 OK with a real response from the restricted model. Same key, same model — only the route is different:
# /bedrock/model/{modelId}/invoke → HTTP 200
curl -sS -X POST \
"https://<proxy>/bedrock/model/global.anthropic.claude-opus-4-7/invoke" \
-H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
-d '{"anthropic_version":"bedrock-2023-05-31","messages":[{"role":"user","content":"hi"}],"max_tokens":5}'
{"model":"claude-opus-4-7","id":"msg_bdrk_...","type":"message","role":"assistant","content":[{"type":"text","text":"Hi there!"}],"stop_reason":"max_tokens","usage":{"input_tokens":13,"output_tokens":5,...}}
# /bedrock/model/{modelId}/invoke-with-response-stream → HTTP 200, event-stream
curl -sS -X POST \
"https://<proxy>/bedrock/model/global.anthropic.claude-opus-4-7/invoke-with-response-stream" \
-H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
-d '{"anthropic_version":"bedrock-2023-05-31","messages":[{"role":"user","content":"hi"}],"max_tokens":5}'
First decoded event chunk:
{"type":"message_start","message":{"model":"claude-opus-4-7","id":"msg_bdrk_...","role":"assistant","usage":{"input_tokens":13,"output_tokens":4,...}}}
# /bedrock/model/{modelId}/converse → HTTP 200
curl -sS -X POST \
"https://<proxy>/bedrock/model/global.anthropic.claude-opus-4-7/converse" \
-H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":[{"text":"hi"}]}],"inferenceConfig":{"maxTokens":5}}'
{"output":{"message":{"content":[{"text":"Hi there!"}],"role":"assistant"}},"stopReason":"max_tokens","usage":{"inputTokens":13,"outputTokens":5,"totalTokens":18}}
# /bedrock/model/{modelId}/converse-stream → HTTP 200, event-stream
curl -sS -X POST \
"https://<proxy>/bedrock/model/global.anthropic.claude-opus-4-7/converse-stream" \
-H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":[{"text":"hi"}]}],"inferenceConfig":{"maxTokens":5}}'
→ HTTP 200, streamed response from the restricted Claude Opus model.
Expected for all four: 401 key_model_access_denied.
Actual: 200 OK, a real response from the restricted model is returned and spend is logged against the virtual key. The key.models / user.models allowlist is never consulted.
The spend log entry for a bypass call (fields trimmed) clearly shows the call succeeded as allm_passthrough_route without any ACL interception:
{
  "call_type": "allm_passthrough_route",
  "api_key": "<hashed virtual key>",
  "api_base": "https://bedrock-runtime.us-east-1.amazonaws.com/model/global.anthropic.claude-opus-4-7/invoke-with-response-stream",
  "model": "bedrock/global.anthropic.claude-opus-4-7",
  "model_group": "claude-opus-4-7"
}
Root cause
Model access enforcement lives in _enforce_key_and_fallback_model_access in litellm/proxy/auth/user_api_key_auth.py, which only invokes can_key_call_model(model=...) when get_model_from_request() returns a non-None model.
get_model_from_request() in litellm/proxy/auth/auth_utils.py extracts model from:
- request_body["model"]
- /openai/deployments/{model}/*
- /v1beta/models/{model}:* (Google)
- /vertex_ai/.../models/{model} (Vertex)
…but not from the Bedrock path pattern /bedrock/model/{modelId}/(invoke|invoke-with-response-stream|converse|converse-stream). The native Bedrock InvokeModel / Converse bodies do not contain a top-level model field either — the model lives exclusively in the URL.
Consequently get_model_from_request returns None for these requests, _enforce_key_and_fallback_model_access silently skips can_key_call_model, and the request is forwarded to AWS.
Note: there is already an _extract_model_from_bedrock_endpoint helper inside litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py used by the Bedrock proxy handler itself, but it is not consulted by the auth layer.
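To make the gap concrete, here is a minimal sketch of the kind of path parsing the auth layer would need (the helper name and regex are illustrative assumptions, not LiteLLM's actual API; the existing _extract_model_from_bedrock_endpoint helper presumably does something similar):

```python
import re
from typing import Optional

# Hypothetical helper: extract the Bedrock modelId from a passthrough path.
# The four action suffixes are the route variants demonstrated in the repro.
BEDROCK_ROUTE_RE = re.compile(
    r"/bedrock/model/(?P<model_id>.+?)/"
    r"(?:invoke|invoke-with-response-stream|converse|converse-stream)$"
)

def get_bedrock_model_from_path(path: str) -> Optional[str]:
    """Return the Bedrock modelId embedded in a passthrough route, else None."""
    match = BEDROCK_ROUTE_RE.search(path)
    return match.group("model_id") if match else None
```

Because the model lives only in the URL, a parser like this is the only way for the auth layer to recover it for these routes.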
Impact
Any deployment that relies on key.models or user.models to restrict which models a virtual key may call has a silent ACL bypass whenever the client can reach /bedrock/*. This includes:
- Multi-tenant proxies that give different customers different model allowlists.
- Teams that restrict expensive frontier models (Claude Opus, etc.) to specific keys — a determined user can still reach them via the Bedrock passthrough.
- Claude Code CLI users targeting LiteLLM via CLAUDE_CODE_USE_BEDROCK=1, which is the default path for Bedrock integration.
There is no config flag to force access control on passthrough routes.
Suggested fix
Two parts:
- Extend get_model_from_request in litellm/proxy/auth/auth_utils.py to parse the Bedrock route patterns below and return the Bedrock modelId (including the application-inference-profile variant).
- Resolve the Bedrock modelId back to the LiteLLM model_group name (via llm_router.get_model_list() — the router already knows which deployment corresponds to which model_group) so the existing can_key_call_model check works unchanged against user-configured model_names. If no mapping exists, fall back to the raw modelId and let can_key_call_model decide.
Route patterns to cover:
- /bedrock/model/{modelId}/invoke
- /bedrock/model/{modelId}/invoke-with-response-stream
- /bedrock/model/{modelId}/converse
- /bedrock/model/{modelId}/converse-stream
- /bedrock/model/application-inference-profile/{profileId}/{action}
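The reverse lookup in part 2 could look roughly like this (a sketch under assumptions: the helper name is hypothetical, and the deployment dict shape mirrors how model_list entries are configured, with model_name and litellm_params.model fields):

```python
from typing import List

def resolve_bedrock_model_group(bedrock_model_id: str, model_list: List[dict]) -> str:
    """Map a Bedrock modelId back to the configured model_group name, if known."""
    for deployment in model_list:
        litellm_model = deployment.get("litellm_params", {}).get("model", "")
        # Bedrock deployments in config.yaml are written as "bedrock/<modelId>"
        if litellm_model == f"bedrock/{bedrock_model_id}":
            return deployment.get("model_name", bedrock_model_id)
    # No mapping found: fall back to the raw modelId and let
    # can_key_call_model make the final decision.
    return bedrock_model_id
```

With this, the existing can_key_call_model check works unchanged: it receives "claude-opus-4-7" for a request to /bedrock/model/global.anthropic.claude-opus-4-7/invoke, exactly as it does for /chat/completions.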
Happy to open a PR implementing this (with unit tests covering all route variants and router reverse-lookup).
Are you a ML Ops Team?
Running a multi-tenant LiteLLM proxy in production.
What LiteLLM version are you on?
v1.83.3-stable.patch.2 — the relevant code path in auth_utils.py is unchanged on current main.
Twitter / LinkedIn details
N/A