Changes from 6 commits
Commits
16 commits
3ca6f16
Add configurable Bedrock endpoint URLs via env vars
weisser-dev Mar 4, 2026
d578baf
Merge pull request #1 from HUK-COBURG/codex/add-vic-endpoint-support
weisser-dev Mar 4, 2026
9fec98b
Add model whitelist configuration for Bedrock model exposure
weisser-dev Mar 5, 2026
5e5fc72
Merge pull request #2 from HUK-COBURG/codex/create-model-whitelist-co…
weisser-dev Mar 5, 2026
a607cc7
Fix Bedrock endpoint env defaults and cache whitelist loading
weisser-dev Mar 6, 2026
13eab9e
Merge pull request #3 from HUK-COBURG/codex/fix-service-crash-due-to-…
weisser-dev Mar 6, 2026
80c068a
Fix Sonnet 4.5/4.6 Bedrock validation errors
weisser-dev Mar 16, 2026
3929f94
Merge pull request #4 from HUK-COBURG/codex/investigate-error-400-in-…
weisser-dev Mar 16, 2026
fcf14b6
Harden whitelist loading and parameterize Bedrock endpoint URLs
weisser-dev Mar 16, 2026
bfa7203
Merge pull request #5 from HUK-COBURG/codex/address-critical-and-impo…
weisser-dev Mar 16, 2026
82b4fc7
Fix Bedrock Qwen stopSequences ValidationException
weisser-dev Mar 17, 2026
e476d05
Merge pull request #6 from HUK-COBURG/codex/fix-bedrock-validation-er…
weisser-dev Mar 17, 2026
b5a73b5
Merge pull request #7 from aws-samples/main
weisser-dev Mar 17, 2026
0f1716e
Harden Bedrock request validation and model compatibility
weisser-dev Mar 17, 2026
02f317f
Merge pull request #8 from HUK-COBURG/codex/fix-bedrock-validation-er…
weisser-dev Mar 17, 2026
1540612
Merge branch 'aws-samples:main' into main
cniweb Jun 12, 2026
3 changes: 3 additions & 0 deletions README.md
@@ -155,6 +155,9 @@ Now, you can try out the proxy APIs. Let's say you want to test Claude 3 Sonnet
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
# For older versions
# https://github.com/openai/openai-python/issues/624
export OPENAI_API_BASE=<API base url>
2 changes: 2 additions & 0 deletions deployment/BedrockProxy.template
@@ -85,6 +85,8 @@ Resources:
ENABLE_CROSS_REGION_INFERENCE: "true"
ENABLE_APPLICATION_INFERENCE_PROFILES: "true"
ENABLE_PROMPT_CACHING: !Ref EnablePromptCaching
BEDROCK_URL: ""
BEDROCK_RUNTIME_URL: ""
API_ROUTE_PREFIX: /v1
MemorySize: 1024
PackageType: Image
4 changes: 4 additions & 0 deletions deployment/BedrockProxyFargate.template
@@ -261,6 +261,10 @@ Resources:
- Name: ENABLE_PROMPT_CACHING
Value:
Ref: EnablePromptCaching
- Name: BEDROCK_URL
Value: ""
- Name: BEDROCK_RUNTIME_URL
Value: ""
Essential: true
Image:
Ref: ContainerImageUri
2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -9,6 +9,8 @@ services:
- "127.0.0.1:8000:8080"
environment:
- ENABLE_PROMPT_CACHING=true
- BEDROCK_URL=${BEDROCK_URL:-}
- BEDROCK_RUNTIME_URL=${BEDROCK_RUNTIME_URL:-}
Comment on lines +12 to +13 (Member):

Minor: `${BEDROCK_URL:-}` sets the variable to an empty string when it is unset on the host; it does not leave it unset.

This works today because `_env_or_none()` in `setting.py` converts `""` to `None`. But it is a subtle coupling: if `_env_or_none` were ever changed, the docker-compose default would break.

Consider using `${BEDROCK_URL-}` (without `:`) so the env var remains unset inside the container when not defined on the host, rather than being set to an empty string. This removes the dependency on `_env_or_none()` for correctness.
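
The coupling described in this comment can be made explicit on the Python side. Below is a minimal sketch of an empty-string-to-`None` normalizer. The name `env_or_none` and its `environ` parameter are assumptions for illustration; this is not the project's `_env_or_none()` helper, whose body is not shown in this diff. With normalization like this, an unset variable and one set to an empty string are treated the same way, whichever Compose default syntax is used.

```python
import os
from typing import Mapping, Optional


def env_or_none(name: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the variable's value, or None if it is unset, empty, or only whitespace.

    Hypothetical helper for illustration; the mapping is injectable so the
    behaviour can be checked without touching the real process environment.
    """
    value = environ.get(name)
    if value is None:
        return None
    value = value.strip()
    return value or None


# A container started with `BEDROCK_URL=` sees an empty string, not a missing key.
sample_env = {"BEDROCK_URL": "", "BEDROCK_RUNTIME_URL": "https://example.invalid"}
print(env_or_none("BEDROCK_URL", sample_env))          # None
print(env_or_none("BEDROCK_RUNTIME_URL", sample_env))  # https://example.invalid
print(env_or_none("NOT_DEFINED", sample_env))          # None
```

The `os.environ.get("BEDROCK_URL") or None` form used in `src/api/setting.py` in this diff gives the same result for empty strings; the sketch only adds whitespace stripping.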

- API_KEY=${OPENAI_API_KEY}
- AWS_PROFILE
- AWS_ACCESS_KEY_ID
22 changes: 22 additions & 0 deletions docs/Usage.md
@@ -7,6 +7,9 @@ Assuming you have set up below environment variables after deployed:
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API Example:**
@@ -23,6 +26,25 @@ You can use this API to get a list of supported model IDs.

Also, you can use this API to refresh the model list if new models are added to Amazon Bedrock.

You can optionally restrict which models are exposed by `/models` and accepted by chat requests using a whitelist JSON config:

```bash
export MODEL_WHITELIST_FILE=/app/config/model-whitelist.json
# or inline JSON
# export MODEL_WHITELIST_JSON='{"families":["anthropic.claude","amazon.nova"],"profile_regions":["us","global"],"model_ids":["arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"]}'
```

Example `model-whitelist.json`:

```json
{
  "families": ["anthropic.claude", "amazon.nova"],
  "profile_regions": ["us", "global"],
  "model_ids": [
    "arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"
  ]
}
```
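
Before deploying a whitelist, it can help to check its shape locally. The following is a rough sketch, not part of this PR: `check_whitelist_shape` and `KNOWN_KEYS` are hypothetical names, and the rule that every key must hold a list of strings is inferred from the example above. One reason to check: a bare string such as `"families": "anthropic.claude"` is still valid JSON, but a prefix match that iterates over it would compare against single characters.

```python
import json
from typing import List

# Keys taken from the example config above; anything else is reported.
KNOWN_KEYS = ("families", "profile_regions", "model_ids")


def check_whitelist_shape(raw: str) -> List[str]:
    """Return a list of problems found in the JSON text; an empty list means it looks fine."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key, value in data.items():
        if key not in KNOWN_KEYS:
            problems.append(f"unknown key: {key}")
        elif not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
    return problems


print(check_whitelist_shape('{"families": ["anthropic.claude"], "profile_regions": ["us"]}'))  # []
print(check_whitelist_shape('{"families": "anthropic.claude"}'))  # ['families must be a list of strings']
```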

**Example Request**

3 changes: 3 additions & 0 deletions docs/Usage_CN.md
@@ -7,6 +7,9 @@
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use a VPC Interface Endpoint or a custom Bedrock endpoint
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API Example:**
68 changes: 68 additions & 0 deletions src/api/models/bedrock.py
@@ -44,8 +44,12 @@
)
from api.setting import (
    AWS_REGION,
    BEDROCK_RUNTIME_URL,
    BEDROCK_URL,
    DEBUG,
    DEFAULT_MODEL,
    MODEL_WHITELIST_FILE,
    MODEL_WHITELIST_JSON,
    ENABLE_CROSS_REGION_INFERENCE,
    ENABLE_APPLICATION_INFERENCE_PROFILES,
    ENABLE_PROMPT_CACHING,
@@ -66,11 +70,13 @@
bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name=AWS_REGION,
    endpoint_url=BEDROCK_RUNTIME_URL,
    config=config,
)
bedrock_client = boto3.client(
    service_name="bedrock",
    region_name=AWS_REGION,
    endpoint_url=BEDROCK_URL,
    config=config,
)
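
Passing `endpoint_url=None` to `boto3.client` is the same as omitting the argument, so an unset variable falls back to the default regional endpoint. For a value that is set, a clear error message at startup can make misconfiguration easier to diagnose. The sketch below is one hypothetical way to do that with the standard library; `validate_endpoint_url` is an assumed name and is not part of this PR.

```python
from typing import Optional
from urllib.parse import urlsplit


def validate_endpoint_url(value: Optional[str]) -> Optional[str]:
    """Pass None through unchanged; otherwise require an absolute http(s) URL.

    Hypothetical helper: raises ValueError with a readable message instead of
    letting a malformed URL reach the client constructor.
    """
    if value is None:
        return None
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"endpoint URL must be an absolute http(s) URL, got {value!r}")
    return value


# A URL without a scheme is rejected; None means "use the default endpoint".
print(validate_endpoint_url(None))  # None
try:
    validate_endpoint_url("vpce-xxxxxxxx.bedrock.us-west-2.vpce.amazonaws.com")
except ValueError as exc:
    print(exc)
```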

@@ -107,6 +113,59 @@
}


def _load_model_whitelist() -> dict:
    """Load model whitelist config from env JSON string or JSON file."""
    if MODEL_WHITELIST_JSON:
        try:
            return json.loads(MODEL_WHITELIST_JSON)
        except json.JSONDecodeError as e:
            logger.warning("Invalid MODEL_WHITELIST_JSON. Ignoring whitelist. error=%s", e)
            return {}

    if MODEL_WHITELIST_FILE:
        try:
            with open(MODEL_WHITELIST_FILE, encoding="utf-8") as f:
                return json.load(f)
        except Exception as e:
            logger.warning("Unable to load MODEL_WHITELIST_FILE=%s. Ignoring whitelist. error=%s", MODEL_WHITELIST_FILE, e)
            return {}

    return {}
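
The loader above returns whatever `json.loads` produces, which may be a list or a string rather than an object. A possible hardening is sketched below under stated assumptions: `load_whitelist` is a hypothetical name, and it takes its two inputs as parameters instead of reading module-level settings so that it can be exercised in isolation. It keeps the same precedence (inline JSON first, then file) and adds a type check on the result.

```python
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def load_whitelist(inline_json: Optional[str], file_path: Optional[str]) -> dict:
    """Inline JSON takes precedence over the file; anything unusable yields {}."""
    if inline_json:
        source = "inline JSON"
        try:
            data = json.loads(inline_json)
        except json.JSONDecodeError as exc:
            logger.warning("Invalid inline whitelist JSON, ignoring it. error=%s", exc)
            return {}
    elif file_path:
        source = file_path
        try:
            with open(file_path, encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, ValueError) as exc:  # missing file, bad encoding, bad JSON
            logger.warning("Unable to load whitelist file %s, ignoring it. error=%s", file_path, exc)
            return {}
    else:
        return {}

    # Only a JSON object supports the .get() lookups used by the matching code.
    if not isinstance(data, dict):
        logger.warning("Whitelist from %s is not a JSON object, ignoring it.", source)
        return {}
    return data


print(load_whitelist('["anthropic.claude"]', None))              # {}
print(load_whitelist('{"families": ["amazon.nova"]}', None))     # {'families': ['amazon.nova']}
```

Note that an empty dict means "no filtering" in `_is_allowed_by_whitelist`, so both the original loader and this sketch fail open: a broken whitelist exposes every model rather than none.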


def _is_allowed_by_whitelist(model_id: str, whitelist: dict) -> bool:
    """Check if model id is allowed by whitelist rules.

    Supported keys:
    - model_ids: exact model ids/profile ids
    - families: prefix match for foundation model ids (e.g. anthropic.claude, amazon.nova)
    - profile_regions: prefix before first '.' for cross-region profiles (e.g. us, eu, apac, global)
    """
    if not whitelist:
        return True

    model_ids = set(whitelist.get("model_ids", []))
    families = whitelist.get("families", [])
    profile_regions = set(whitelist.get("profile_regions", []))

    if model_ids and model_id in model_ids:
        return True

    if families and any(model_id.startswith(family) for family in families):
        return True

    if profile_regions and "." in model_id:
        prefix = model_id.split(".", 1)[0]
        if prefix in profile_regions:
            return True

    # If any selector is configured, default deny.
    return not any((model_ids, families, profile_regions))
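
To make the interaction between the three selectors concrete, here is a small worked example. `is_allowed` is a condensed restatement of the function above, written only so the snippet runs on its own. The `us.` and `eu.` profile IDs are illustrative values chosen for this example. The selectors are independent: a region prefix admits any profile in that region regardless of family, and a family prefix does not match a profile ID that starts with a region.

```python
def is_allowed(model_id: str, whitelist: dict) -> bool:
    """Condensed restatement of the whitelist rules shown in the diff."""
    if not whitelist:
        return True
    model_ids = set(whitelist.get("model_ids", []))
    families = whitelist.get("families", [])
    profile_regions = set(whitelist.get("profile_regions", []))
    if model_id in model_ids:
        return True
    if any(model_id.startswith(family) for family in families):
        return True
    if "." in model_id and model_id.split(".", 1)[0] in profile_regions:
        return True
    # Deny by default once at least one selector is configured.
    return not any((model_ids, families, profile_regions))


whitelist = {"families": ["anthropic.claude"], "profile_regions": ["us"]}

print(is_allowed("anthropic.claude-3-sonnet-20240229-v1:0", whitelist))     # True  (family prefix)
print(is_allowed("us.amazon.nova-lite-v1:0", whitelist))                    # True  (region prefix, any family)
print(is_allowed("eu.anthropic.claude-3-sonnet-20240229-v1:0", whitelist))  # False (region not listed)
print(is_allowed("cohere.embed-multilingual-v3", whitelist))                # False (no selector matches)
print(is_allowed("cohere.embed-multilingual-v3", {}))                       # True  (no whitelist configured)
```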


_MODEL_WHITELIST: dict = _load_model_whitelist()


def list_bedrock_models() -> dict:
    """Automatically getting a list of supported models.

@@ -116,6 +175,7 @@ def list_bedrock_models() -> dict:
    - Application Inference Profiles (if enabled via Env)
    """
    model_list = {}
    whitelist = _MODEL_WHITELIST
    try:
        if ENABLE_CROSS_REGION_INFERENCE:
            # List system defined inference profile IDs and store underlying model mapping
@@ -203,6 +263,14 @@ def list_bedrock_models() -> dict:
        # In case stack not updated.
        model_list[DEFAULT_MODEL] = {"modalities": ["TEXT", "IMAGE"]}

    if whitelist:
        model_list = {
            model_id: metadata
            for model_id, metadata in model_list.items()
            if _is_allowed_by_whitelist(model_id, whitelist)
        }
        logger.info("Applied model whitelist, allowed_models=%d", len(model_list))

    return model_list


4 changes: 4 additions & 0 deletions src/api/setting.py
@@ -11,8 +11,12 @@

DEBUG = os.environ.get("DEBUG", "false").lower() != "false"
AWS_REGION = os.environ.get("AWS_REGION", "us-west-2")
BEDROCK_URL = os.environ.get("BEDROCK_URL") or None
BEDROCK_RUNTIME_URL = os.environ.get("BEDROCK_RUNTIME_URL") or None
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0")
DEFAULT_EMBEDDING_MODEL = os.environ.get("DEFAULT_EMBEDDING_MODEL", "cohere.embed-multilingual-v3")
ENABLE_CROSS_REGION_INFERENCE = os.environ.get("ENABLE_CROSS_REGION_INFERENCE", "true").lower() != "false"
ENABLE_APPLICATION_INFERENCE_PROFILES = os.environ.get("ENABLE_APPLICATION_INFERENCE_PROFILES", "true").lower() != "false"
ENABLE_PROMPT_CACHING = os.environ.get("ENABLE_PROMPT_CACHING", "false").lower() != "false"
MODEL_WHITELIST_FILE = os.environ.get("MODEL_WHITELIST_FILE")
MODEL_WHITELIST_JSON = os.environ.get("MODEL_WHITELIST_JSON")