Skip to content

fix: Bedrock tool name encoding#3706

Merged
joeyorlando merged 3 commits intomainfrom
codex/bedrock-tool-name-limit-clean
Apr 11, 2026
Merged

fix: Bedrock tool name encoding#3706
joeyorlando merged 3 commits intomainfrom
codex/bedrock-tool-name-limit-clean

Conversation

@joeyorlando
Copy link
Copy Markdown
Contributor

@joeyorlando joeyorlando commented Apr 10, 2026

Summary

Fixes Bedrock Converse requests that fail when MCP tool names exceed Bedrock's provider-side tool name limit.

  • Adds provider-facing Bedrock tool name encoding that shortens names over 64 characters with a stable hash suffix.
  • Applies the encoded name consistently to tool definitions, explicit tool choices, and assistant tool-use history sent back to Bedrock.
  • Decodes Bedrock tool call responses back to the original MCP tool names so policy evaluation, logging, and execution continue to use the real tool names.
  • Preserves the existing Nova hyphen normalization behavior and adds collision handling for names that normalize to the same provider value.

Root Cause

Bedrock validates toolConfig.tools[*].toolSpec.name before model execution and rejects names longer than its limit. Some fully qualified MCP tool names can exceed that provider limit even though the same tool works through other providers.

@joeyorlando joeyorlando changed the title [codex] fix Bedrock tool name encoding fix: Bedrock tool name encoding Apr 10, 2026
@joeyorlando joeyorlando marked this pull request as ready for review April 10, 2026 18:37
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 10, 2026

Claude finished @joeyorlando's task in 3m 19s —— View job


PR Review: Bedrock tool name encoding

Reviewed the changes to bedrock.ts, bedrock.test.ts, llm-proxy-handler.ts, and bedrock/index.ts.

  • Read CLAUDE.md for repo-specific guidelines
  • Read changed files
  • Analyze tool name encoding logic
  • Check edge cases and collision handling
  • Verify test coverage
  • Post review feedback

Summary

Solid fix. The PR correctly addresses Bedrock's 64-char tool name limit via deterministic hash-suffix truncation, applies the encoding consistently across tool definitions, toolChoice, and assistant tool-use history, and decodes names back in responses/streams so downstream policy/log/exec code sees the original MCP tool names. The Nova hyphen normalization is preserved, and a collision path for normalized names is added. Tests cover the main happy paths.

Nothing blocking — a few observations and minor nits below.


Observations

1. Hash-suffix collisions are not retried (bedrock.ts:208-223, bedrock.ts:233-244)

getUniqueProviderToolName handles collisions on the pre-hash encoded name by calling appendToolNameHash(encodedName, originalName). However, the result of appendToolNameHash itself is not re-checked against usedNames. In the extremely unlikely case that two different original names produce the same 8-hex-char sha256 prefix and truncate to the same 55-char prefix, a silent collision would occur (last writer wins in toOriginal.set, and decoding would map Bedrock's response to the wrong original name).

With 32 bits of hash entropy, the birthday bound is ~65K tools before ~50% chance of any collision, so this is largely theoretical. Still, a defensive loop (or while (usedNames.has(candidate)) bump-salt) would make this robust without changing the stable-name property for the common case. Example sketch:

let candidate = encodedName;
if (usedNames.get(candidate) && usedNames.get(candidate) !== originalName) {
  candidate = appendToolNameHash(encodedName, originalName);
  // Optional: guard against hash-suffix collisions too
  let salt = 0;
  while (usedNames.has(candidate) && usedNames.get(candidate) !== originalName) {
    candidate = appendToolNameHash(encodedName, `${originalName}#${++salt}`);
  }
}

2. bedrockCommandContextCache provides no practical dedup (bedrock.ts:90-93, bedrock.ts:1376-1432)

The WeakMap is keyed on the request object identity. In practice:

  • llm-proxy-handler.ts:203 calls createStreamAdapter(body) with the original body.
  • Later, llm-proxy-handler.ts:557 creates finalRequest = requestAdapter.toProviderRequest() (a new object via spread).
  • executeStream(client, finalRequest) / execute(client, finalRequest) then call buildBedrockCommandContext(finalRequest) — a different object reference → cache miss.

So for streaming requests the context is built twice (once for the mapping in the stream adapter, once for the actual send), and for non-streaming it's built once (the createStreamAdapter(body) mapping is never used since non-streaming doesn't stream). The build is cheap and deterministic, so this is just a minor inefficiency — not a correctness bug — but the caching intent isn't actually realized. Consider either:

  • Building the mapping directly and caching on the BedrockRequestAdapter instance, then passing it to the stream adapter and execute paths via the adapter, or
  • Dropping the WeakMap and noting that the cost is negligible.

3. BedrockRequestAdapter now always builds a mapping (bedrock.ts:379-382)

Previously the mapping was only built for Nova models. Now it's built for every Bedrock request. This is correct (since non-Nova models can also exceed the 64-char limit), and the cost is O(tools) with small constants, so fine. Worth noting in case there's a very hot path — a quick early-exit when no tool name needs encoding could skip map allocation entirely, but probably not worth it.

4. Exporting getCommandInput purely for tests (bedrock.ts:1439)

CLAUDE.md has a "No Premature Exports" rule. This export is only used in bedrock.test.ts. That's a common test-only export pattern, but worth calling out to confirm intent — alternatively, the test could exercise the public API via bedrockAdapterFactory.execute with a fake client (as it already does in one test) to avoid the test-only export.

5. Decoding path for assistant history in BedrockRequestAdapter (bedrock.ts:525-542, bedrock.ts:623-642)

findToolName / findToolNameInMessages both run decodeToolName on the toolUse name pulled from this.request.messages. These messages come from the client and should contain original tool names (not encoded), because encoding only happens on the wire to Bedrock in buildBedrockCommandContext. decodeToolName falls back to returning the input when no mapping exists, so this is benign. However, on retry turns where the client echoes back a name that Bedrock previously returned (already decoded), everything still matches. Worth a comment to make this invariant explicit; the current code works but is non-obvious.

6. Test coverage gaps

The new tests are focused and well-written. A few cases that would increase confidence:

  • Non-Nova truncation for a name whose normalized length equals exactly 64 (off-by-one sanity check).
  • Non-Nova collision: two distinct long names that share a common 55-char prefix, to exercise appendToolNameHash on non-Nova.
  • A toolChoice with a name that gets truncated (the new encodeProviderToolChoiceName path for the truncation-only branch is only incidentally covered).
  • A streaming test where toolNameMapping is populated via createStreamAdapter(request) for a non-Nova model (the existing stream test happens to use the default claude model so it works, but asserting that the mapping is actually populated for a non-Nova long-name case would pin the behavior change).

7. isToolUseBlock and isToolResultBlock type guards in encodeProviderMessageToolNames

The spread of contentBlock.toolUse works, but be aware that if contentBlock has extra fields beyond toolUse, they are preserved via ...contentBlock. This is correct; just noting that the as BedrockMessages cast at the end bypasses type checking for any field shape drift — consider typing the returned array more strictly if/when convenient.


Minor nits

  • Doc comment stale (bedrock.ts:148-152): The JSDoc still says "Encode only the provider-facing names and keep mappings to restore originals", which is accurate, but the function itself now has two responsibilities (hyphen normalization + truncation). A one-line note about truncation would help future readers.
  • TOOL_NAME_HASH_LENGTH = 8: Consider a brief comment explaining the chosen hash length and the implied collision bound, since this is the main tradeoff.
  • createEmptyToolNameMapping() (bedrock.ts:201-206): Only used once as a field initializer. Could be inlined, but keeping it as a named helper is fine for readability.

Correctness & security

  • No SSRF/injection concerns introduced — all encoding is purely name normalization.
  • Hash is sha256 (appropriate; collision resistance is inherited from the full hash even though only 8 chars are retained).
  • No secrets logged; debug logs use the decoded/original names as before.
  • The WeakMap cannot leak memory (weak keys).
  • Tool name encoding is deterministic per-request, so multi-turn conversations will produce consistent encoded names.

Verdict

LGTM with optional hardening around hash-suffix collision retry and a suggestion to either realize or drop the bedrockCommandContextCache. Nothing in this PR should block merge.

Fix collision retry →
• Branch: codex/bedrock-tool-name-limit-clean

@joeyorlando joeyorlando merged commit 0e2c2d1 into main Apr 11, 2026
38 checks passed
@joeyorlando joeyorlando deleted the codex/bedrock-tool-name-limit-clean branch April 11, 2026 02:44
joeyorlando added a commit that referenced this pull request Apr 11, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.2.10](platform-v1.2.9...platform-v1.2.10)
(2026-04-11)


### Features

* make posthog analytics configurable
([#3707](#3707))
([7ae9101](7ae9101))


### Bug Fixes

* `/llm/costs` table scrolling
([#3722](#3722))
([6a42ba8](6a42ba8))
* apply MCP OAuth lifetime for gateway slugs
([#3711](#3711))
([362aaec](362aaec))
* Bedrock tool name encoding
([#3706](#3706))
([0e2c2d1](0e2c2d1))
* costs timeframes and surface limit reset settings
([#3709](#3709))
([6e4154b](6e4154b))
* jira oauth discovery overrides
([#3721](#3721))
([2c4cf8f](2c4cf8f))
* OIDC discovery trusted origins for IdP registration
([#3714](#3714))
([adb5f5e](adb5f5e))
* preserve shared chat agents on fork
([#3715](#3715))
([252edfc](252edfc))
* reranker model dropdown labels
([#3704](#3704))
([ebd1c8a](ebd1c8a))
* session logs loading state
([#3712](#3712))
([ffba126](ffba126))


### Miscellaneous Chores

* **ci:** add ID-JAG MCP e2e test
([#3702](#3702))
([1a5078a](1a5078a))
* **deps:** bump next from 16.1.7 to 16.2.3 in /platform/frontend
([#3708](#3708))
([d47967c](d47967c))
* use neutral token prefixes with legacy support
([#3719](#3719))
([db5929c](db5929c))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: archestra-ci[bot] <222894074+archestra-ci[bot]@users.noreply.github.com>
Co-authored-by: Joey Orlando <joey@archestra.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant