Skip to content

litellm oss branch#26386

Open
krrish-berri-2 wants to merge 1 commit intolitellm_internal_stagingfrom
litellm_oss_branch
Open

litellm oss branch#26386
krrish-berri-2 wants to merge 1 commit intolitellm_internal_stagingfrom
litellm_oss_branch

Conversation

@krrish-berri-2
Copy link
Copy Markdown
Contributor

…5855)

Bedrock enforces non-increasing TTL ordering across cache_control blocks (tools → system → messages). The tool cache_control TTL was being unconditionally dropped to the default 5m, while system blocks preserved the user-specified TTL for Claude 4.5+ models. This mismatch caused "a ttl='1h' block must not come after a ttl='5m' block" errors when users set ttl='1h' on both tools and system.

Converse path: add_cache_point_tool_block() now accepts a model param and preserves TTL for Claude 4.5+, matching _get_cache_point_block().

Invoke path: _remove_ttl_from_cache_control() now also processes tools (was only processing system and messages).

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

…5855)

Bedrock enforces non-increasing TTL ordering across cache_control blocks
(tools → system → messages). The tool cache_control TTL was being
unconditionally dropped to the default 5m, while system blocks preserved
the user-specified TTL for Claude 4.5+ models. This mismatch caused
"a ttl='1h' block must not come after a ttl='5m' block" errors when
users set ttl='1h' on both tools and system.

Converse path: add_cache_point_tool_block() now accepts a model param
and preserves TTL for Claude 4.5+, matching _get_cache_point_block().

Invoke path: _remove_ttl_from_cache_control() now also processes tools
(was only processing system and messages).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@krrish-berri-2 krrish-berri-2 temporarily deployed to integration-postgres April 24, 2026 04:55 — with GitHub Actions Inactive
@krrish-berri-2 krrish-berri-2 temporarily deployed to integration-postgres April 24, 2026 04:55 — with GitHub Actions Inactive
@krrish-berri-2 krrish-berri-2 temporarily deployed to integration-postgres April 24, 2026 04:55 — with GitHub Actions Inactive
@krrish-berri-2 krrish-berri-2 temporarily deployed to integration-postgres April 24, 2026 04:55 — with GitHub Actions Inactive
@veria-ai
Copy link
Copy Markdown

veria-ai Bot commented Apr 24, 2026

Low: No security issues found

This PR adds TTL support for Bedrock cache control on tool blocks, extending existing cache point logic to pass TTL values for Claude 4.5+ models. TTL values are validated against a strict allowlist ("5m", "1h"), and the sanitization logic in _remove_ttl_from_cache_control is extended to also cover tools — which is a defensive improvement. No authentication, authorization, or input validation boundaries are affected.


Status: 0 open
Risk: 1/10

Posted by Veria AI · 2026-04-24T04:55:28.314Z

@krrish-berri-2 krrish-berri-2 temporarily deployed to integration-postgres April 24, 2026 04:55 — with GitHub Actions Inactive
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR fixes a Bedrock prompt-caching TTL ordering violation: tool cache_control blocks were always emitted as {"type": "default"} (5-minute TTL), while system blocks could carry a 1h TTL for Claude 4.5+ models, causing Bedrock to reject requests with "a ttl='1h' block must not come after a ttl='5m' block". The fix threads the model parameter through add_cache_point_tool_block / _bedrock_tools_pt (converse path) and extends _remove_ttl_from_cache_control to also sanitize tools entries (invoke path).

Confidence Score: 5/5

Safe to merge — the fix is correct, targeted, and well-tested; the one finding is a pre-existing style concern (P2).

All blocking logic is correct: TTL is preserved in the right order for both the converse and invoke paths, parameter propagation is consistent, and four new focused unit tests validate the behaviour. The sole finding (hardcoded model-detection function) is P2 and pre-dates this PR; no P0/P1 issues were found.

No files require special attention beyond the pre-existing is_claude_4_5_on_bedrock hardcoded pattern noted in factory.py.

Important Files Changed

Filename Overview
litellm/litellm_core_utils/prompt_templates/factory.py add_cache_point_tool_block and _bedrock_tools_pt extended to accept model param and preserve TTL for Claude 4.5+ on the converse path; relies on hardcoded is_claude_4_5_on_bedrock which violates project model-detection policy.
litellm/llms/bedrock/chat/converse_transformation.py Both _bedrock_tools_pt call sites now forward model=model, correctly threading the model name through to the TTL-preservation logic.
litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py _remove_ttl_from_cache_control now processes tools before system/messages, matching Bedrock's non-increasing TTL ordering constraint; logic is correct and consistent with system/messages handling.
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py Two new unit tests cover TTL preservation for Claude 4.5, TTL stripping for older models, no-model default, and end-to-end _bedrock_tools_pt behavior. All mock-only, correct per project test rules.
tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py Two new unit tests verify _remove_ttl_from_cache_control strips tool TTL for older models and preserves it for Claude 4.5. Well-structured, no real network calls.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User request with tools + system\ncache_control ttl='1h'] --> B{Path?}

    B -->|Converse| C["_bedrock_tools_pt(tools, model=model)"]
    C --> D["add_cache_point_tool_block(tool, model=model)"]
    D --> E{is_claude_4_5_on_bedrock?}
    E -->|Yes| F["cachePoint: {type: default, ttl: '1h'}"]
    E -->|No| G["cachePoint: {type: default}"]

    B -->|Invoke| H["_remove_ttl_from_cache_control(request, model)"]
    H --> I[Process tools FIRST]
    I --> J[Process system]
    J --> K[Process messages]
    I --> L{is_claude_4_5?}
    L -->|Yes + valid ttl| M[Preserve ttl in cache_control]
    L -->|No| N[Strip ttl from cache_control]

    F --> O[Non-increasing TTL order satisfied\ntools >= system >= messages]
    M --> O
Loading

Reviews (1): Last reviewed commit: "fix(bedrock): preserve cache_control TTL..." | Re-trigger Greptile

Comment on lines +5073 to +5078
if (
ttl in ["5m", "1h"]
and model is not None
and is_claude_4_5_on_bedrock(model)
):
cache_point_block["ttl"] = ttl
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Hardcoded model-specific flag — violates project rule

is_claude_4_5_on_bedrock is a hardcoded list of model-name patterns. The project policy requires model capability flags to live in model_prices_and_context_window.json and be read via get_model_info, so that adding a new Claude 4.x model that supports TTL does not require a code change or a LiteLLM upgrade. The analogous _get_cache_point_block path already calls the same hardcoded function, so this PR extends the pattern rather than introducing it — but each new call site makes the problem harder to fix later.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 95.00000% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...llm/litellm_core_utils/prompt_templates/factory.py 92.85% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants