feat(reasoning): bucket max_tokens to effort on adaptive Opus by steebchen · Pull Request #2753 · theopenco/llmgateway

steebchen · 2026-06-19T15:07:53Z

Summary

This PR originates from a support question: a user reported that reasoning "no longer activates" for claude-opus-4-6 and that the docs don't list it as a reasoning model. Reasoning works correctly for Opus 4.6, as verified live against both Anthropic and AWS Bedrock. The confusion stems from adaptive thinking behavior plus stale docs. This PR improves reasoning.max_tokens handling for adaptive models and rewrites the docs.

Behavior change: bucket `reasoning.max_tokens` → effort on adaptive models

Adaptive Claude Opus models (4.6+) use Anthropic's adaptive thinking and reject an explicit budget_tokens. Previously a bare reasoning.max_tokens was accepted but the budget was silently dropped (the model just ran adaptive at default depth).

We considered returning a 4xx, but that would break budget-based clients — notably Claude Code, whose /v1/messages thinking:{type:"enabled",budget_tokens:N} is mapped to reasoning.max_tokens precisely so it survives the Anthropic→OpenAI translation. Instead, we now bucket the requested budget into an adaptive effort level so it still influences depth:

| `reasoning.max_tokens` | adaptive effort |
| --- | --- |
| `< 2000` | `low` |
| `< 8000` | `medium` |
| `< 24000` | `high` |
| `≥ 24000` | `xhigh` |

Precedence is unchanged: explicit effort → reasoning_effort/reasoning.effort → bucketed max_tokens. The shared resolution logic is factored into one resolveAdaptiveEffort helper, replacing the two duplicated mapEffort closures in the Anthropic and Bedrock branches.
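The precedence and bucket thresholds above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the three-argument call shape follows the `resolveAdaptiveEffort(undefined, reasoning_effort, reasoning_max_tokens)` call quoted in the review, but the `ReasoningEffort` union, the pass-through mapping of `reasoning_effort` values, and the `bucketBudget` helper name are assumptions.

```typescript
// Effort levels named in the PR walkthrough.
type AdaptiveEffort = "low" | "medium" | "high" | "xhigh" | "max";
// Assumption: reasoning_effort accepts the same levels plus "none".
type ReasoningEffort = "none" | AdaptiveEffort;

// Hypothetical helper: bucket a token budget using the thresholds in the table.
function bucketBudget(maxTokens: number): AdaptiveEffort {
  if (maxTokens < 2000) return "low";
  if (maxTokens < 8000) return "medium";
  if (maxTokens < 24000) return "high";
  return "xhigh";
}

// Sketch of the precedence: explicit effort, then reasoning_effort,
// then the bucketed reasoning.max_tokens, otherwise undefined.
function resolveAdaptiveEffort(
  effort: AdaptiveEffort | undefined,
  reasoningEffort: ReasoningEffort | undefined,
  reasoningMaxTokens: number | undefined,
): AdaptiveEffort | undefined {
  if (effort !== undefined) return effort;
  if (reasoningEffort !== undefined) {
    // Assumption: "none" yields no effort and does not fall back to the budget.
    return reasoningEffort === "none" ? undefined : reasoningEffort;
  }
  if (reasoningMaxTokens !== undefined) return bucketBudget(reasoningMaxTokens);
  return undefined;
}

// A budget-only request (for example from a budget-based client) still
// influences depth, while an explicit reasoning_effort wins over the budget.
const fromBudget = resolveAdaptiveEffort(undefined, undefined, 10000);
const fromEffort = resolveAdaptiveEffort(undefined, "low", 30000);
```

Under these assumptions, `fromBudget` resolves to `high` and `fromEffort` to `low`.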

Docs (`features/reasoning.mdx`)

- Stop enumerating a hardcoded "supported models" list (it goes stale); point to the models page reasoning filter instead.
- Add a dedicated Adaptive Thinking (Claude Opus) section explaining that Opus 4.6+ models do not honor an exact reasoning.max_tokens budget (it is bucketed to effort) and that effort is only a hint: the model may reason briefly, or start answering immediately, even at high effort. This is expected behavior, not a disabled-reasoning bug.

Verification

- Unit prepare-request-body.adaptive.spec.ts: 20/20 pass (new cases cover each budget→effort bucket for Opus 4.6/4.7/4.8, and explicit effort winning over a budget). Full prepare-request-body suite: 118/118.
- Live e2e chat-reasoning.e2e.ts with TEST_MODELS="anthropic/claude-opus-4-6,aws-bedrock/claude-opus-4-6": 4/4 pass. Anthropic-direct returns reasoning (reasoning_tokens: 69), Bedrock returns the reasoning summary; all upstream calls 200.
- pnpm format + docs build pass.

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Expanded the reasoning docs to cover additional Anthropic Claude reasoning-enabled models.
- Updated reasoning.max_tokens guidance with a new “Adaptive Thinking (Claude Opus)” section, clarifying effort-based depth control.
Improvements
- Enhanced adaptive-thinking request building to derive depth/effort from reasoning_effort or, when absent, from bucketed reasoning_max_tokens.
- Ensures reasoning_effort overrides reasoning_max_tokens, and reasoning_effort: "none" disables thinking.
Tests
- Added coverage for adaptive models where only reasoning_max_tokens is set, plus regression tests for override and disable behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-19T15:08:09Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4c87160a-df3d-4017-895a-ac73a7d232a8

📥 Commits

Reviewing files that changed from the base of the PR and between 9f9e017 and bc81872.

📒 Files selected for processing (2)

packages/actions/src/prepare-request-body.adaptive.spec.ts
packages/actions/src/prepare-request-body.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/actions/src/prepare-request-body.adaptive.spec.ts
packages/actions/src/prepare-request-body.ts

Walkthrough

Adds a resolveAdaptiveEffort() helper function that converts reasoning_effort and reasoning_max_tokens into Anthropic adaptive-thinking effort levels, applies it to both Anthropic and AWS Bedrock request builders, and expands reasoning documentation with updated model support and adaptive thinking guidance.

Changes

Adaptive thinking effort resolution

| Layer / File(s) | Summary |
| --- | --- |
| Reasoning documentation expansion `apps/docs/content/features/reasoning.mdx` | Broadens "Reasoning-Enabled Models" list from single Claude 3.7 Sonnet to general Claude Sonnet and Claude Opus. Updates `reasoning.max_tokens` supported models section with guidance that newer Claude Opus models use adaptive thinking and map the budget to effort levels instead. Adds new "Adaptive Thinking (Claude Opus)" section describing adaptive behavior, effort control via `reasoning_effort` / `reasoning.effort`, and backward-compatibility mapping of `reasoning.max_tokens`. |
| Adaptive effort resolution helper `packages/actions/src/prepare-request-body.ts` | Introduces `AdaptiveEffort` type and `resolveAdaptiveEffort()` helper that derives adaptive-thinking effort (`low`, `medium`, `high`, `xhigh`, `max`) by checking explicit `effort` parameter first, then `reasoning_effort` mapping, then `reasoning_max_tokens` bucket thresholds, returning `undefined` if none apply. |
| Adaptive thinking request construction and tests `packages/actions/src/prepare-request-body.adaptive.spec.ts`, `packages/actions/src/prepare-request-body.ts` | Adds test cases validating that `reasoning_max_tokens` is converted to `output_config.effort` levels and that `reasoning_effort` takes precedence. Extends `reasoning_effort` type to include `"none"` for disabling thinking. Updates Anthropic adaptive-thinking path to use `resolveAdaptiveEffort()` when effort is not explicitly set. Updates AWS Bedrock adaptive-thinking path to use `resolveAdaptiveEffort()` instead of local mapping logic. |

Sequence Diagram

sequenceDiagram
  participant RequestBuilder
  participant resolveAdaptiveEffort
  participant Anthropic as Anthropic Provider
  participant Bedrock as AWS Bedrock Provider
  RequestBuilder->>resolveAdaptiveEffort: reasoning_effort, reasoning_max_tokens, effort
  resolveAdaptiveEffort->>resolveAdaptiveEffort: Check effort > reasoning_effort > reasoning_max_tokens
  resolveAdaptiveEffort->>RequestBuilder: AdaptiveEffort (low|medium|high|xhigh|max|undefined)
  RequestBuilder->>Anthropic: output_config.effort
  RequestBuilder->>Bedrock: output_config.effort

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

theopenco/llmgateway#2034: Both PRs modify packages/actions/src/prepare-request-body.ts to change Bedrock Anthropic "adaptive thinking" request assembly—specifically how reasoning_effort/reasoning_max_tokens are mapped into output_config.effort for thinking.type = "adaptive".
theopenco/llmgateway#2550: Both PRs modify packages/actions/src/prepare-request-body.ts and its adaptive-thinking request construction (plus prepare-request-body.adaptive.spec.ts) to map Anthropic Opus adaptive inputs—especially reasoning_max_tokens/reasoning_effort into output_config.effort and thinking: { type: "adaptive" }.
theopenco/llmgateway#2558: Main PR extends Anthropic adaptive request building by deriving thinking.output_config.effort from reasoning_max_tokens/reasoning_effort, while the retrieved PR adds the inverse gateway mapping that translates Anthropic thinking/output_config.effort into unified reasoning.effort/reasoning.max_tokens—both directly cover the same Anthropic reasoning-effort fields, just on opposite request-building/parsing paths.

Suggested reviewers

smakosh
proxysoul

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly identifies the main change: introducing bucketing of `reasoning.max_tokens` to adaptive effort levels for Claude Opus models, which is the core behavioral addition. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b0d3e89e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T15:09:32Z

 The `reasoning.max_tokens` parameter is supported by:

- **Anthropic Claude**: Claude 3.7 Sonnet, Claude Sonnet 4, Claude Opus 4, Claude Opus 4.5
+- **Anthropic Claude**: Claude 3.7 Sonnet, Claude Sonnet 4, Claude Opus 4, Claude Opus 4.5, Claude Opus 4.6, Claude Opus 4.7, Claude Opus 4.8


Remove adaptive Opus from max_tokens support

For Claude Opus 4.6/4.7/4.8 requests that include reasoning.max_tokens, the gateway still rejects named/root models unless some provider mapping has reasoningMaxTokens === true (apps/gateway/src/chat/tools/validate-model-capabilities.ts), and the Opus 4.6/4.7/4.8 Anthropic mappings only set reasoningMode: "adaptive" without that flag. In practice anthropic/claude-opus-4-6 with reasoning.max_tokens returns the documented 400 rather than being accepted, so listing these models here (and saying below that the value is accepted) sends users toward a failing request path.


Adaptive Claude Opus models (4.6+) reject an explicit thinking budget, so a bare reasoning.max_tokens was previously accepted but dropped. Map it onto an adaptive effort level (<2k low, <8k medium, <24k high, else xhigh) so the requested budget still influences depth while keeping budget-based clients (e.g. Claude Code) working. Factor the shared effort resolution into one helper and document adaptive thinking.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9f9e017202


chatgpt-codex-connector · 2026-06-19T15:45:15Z

+						const adaptiveEffort = resolveAdaptiveEffort(
+							undefined,
+							reasoning_effort,
+							reasoning_max_tokens,
+						);


Route adaptive max-token requests past validation

Fresh evidence in this patch is the new reasoning_max_tokens bucketing call here, but gateway traffic still cannot reach it for claude-opus-4-6/4-7/4-8: validateModelCapabilities() only accepts reasoning.max_tokens when a mapping has reasoningMaxTokens === true, and the adaptive Opus mappings only declare reasoningMode: "adaptive"; auto-routing has the same provider.reasoningMaxTokens !== true filter. As a result requests like model: "anthropic/claude-opus-4-8" with reasoning.max_tokens are rejected/filtered before prepareRequestBody can translate the budget into output_config.effort.
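The gap described in this comment can be illustrated with a small sketch. Only the field names `reasoningMaxTokens` and `reasoningMode` come from the review; the `ProviderMapping` shape, the `acceptsReasoningMaxTokens` helper, and the relaxed rule itself are hypothetical, shown as one possible way to let adaptive mappings through rather than as the gateway's actual validation code.

```typescript
// Hypothetical shape of a provider mapping; only the two capability
// field names are taken from the review comment.
interface ProviderMapping {
  providerId: string;
  reasoningMaxTokens?: boolean;
  reasoningMode?: string;
}

// Hypothetical relaxed check: accept reasoning.max_tokens when a mapping
// either honors an exact budget or is adaptive (where it would be bucketed).
function acceptsReasoningMaxTokens(mappings: ProviderMapping[]): boolean {
  return mappings.some(
    (m) => m.reasoningMaxTokens === true || m.reasoningMode === "adaptive",
  );
}

// An adaptive Opus mapping as the review describes it: adaptive mode only.
const adaptiveOpus: ProviderMapping[] = [
  { providerId: "anthropic", reasoningMode: "adaptive" },
];

// The strict rule described in the review filters this mapping out...
const strictCheck = adaptiveOpus.some((m) => m.reasoningMaxTokens === true);
// ...while the relaxed rule would let the request reach the bucketing step.
const relaxedCheck = acceptsReasoningMaxTokens(adaptiveOpus);
```

In this sketch `strictCheck` is `false` and `relaxedCheck` is `true`, which is the difference between the request being rejected up front and being bucketed into `output_config.effort`.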


The shared effort switch fell through 'none' to a default of "high", which would force strong reasoning if 'none' ever reached it (it's normalized away upstream for Anthropic/Bedrock today). Handle 'none' explicitly as no effort and drop the misleading catch-all; the switch is now exhaustive.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
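A switch of the kind this commit message describes might look like the sketch below. It is an illustration under assumptions, not the PR's code: the `ReasoningEffort` union and the `mapReasoningEffort` name are hypothetical, and the `never` check in the default branch is a common TypeScript idiom for compile-time exhaustiveness that the PR may or may not use.

```typescript
// Assumption: the accepted reasoning_effort values for this illustration.
type ReasoningEffort = "none" | "low" | "medium" | "high" | "xhigh" | "max";
type AdaptiveEffort = Exclude<ReasoningEffort, "none">;

// Hypothetical mapping: "none" means no effort at all, never a silent "high".
function mapReasoningEffort(value: ReasoningEffort): AdaptiveEffort | undefined {
  switch (value) {
    case "none":
      return undefined;
    case "low":
    case "medium":
    case "high":
    case "xhigh":
    case "max":
      return value;
    default: {
      // Compile-time exhaustiveness check: adding a new union member
      // without a case makes this assignment a type error.
      const unhandled: never = value;
      throw new Error(`Unhandled reasoning effort: ${String(unhandled)}`);
    }
  }
}
```

Unlike a catch-all that returns `"high"`, the default branch here yields no effort value, so an unexpected input cannot quietly force strong reasoning.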

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc81872b46


chatgpt-codex-connector · 2026-06-19T15:52:32Z

+						if (adaptiveEffort !== undefined) {
+							requestBody.output_config ??= {};
+							requestBody.output_config.effort = adaptiveEffort;


Preserve bucketed effort with JSON schema output

When an adaptive Anthropic request includes both reasoning.max_tokens and response_format: { type: "json_schema" }, this new bucketed output_config.effort is assigned here but is later replaced wholesale by the JSON-schema output_config block, so the requested budget no longer influences adaptive depth even though the PR’s main behavior change is to avoid silently dropping it. This affects structured-output calls to Opus 4.6+ that rely on reasoning.max_tokens; the later merge should keep the existing effort while adding format.
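The merge this comment asks for could be sketched as below. The `OutputConfig` and `JsonSchemaFormat` types and the `mergeJsonSchemaFormat` helper are hypothetical names introduced for illustration, and the exact shape of the JSON-schema `format` block is an assumption; the point is only that spreading the existing object keeps `effort` while adding `format`.

```typescript
type AdaptiveEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Assumed shape of the structured-output block, for illustration only.
interface JsonSchemaFormat {
  type: "json_schema";
  schema: Record<string, unknown>;
}

interface OutputConfig {
  effort?: AdaptiveEffort;
  format?: JsonSchemaFormat;
}

// Hypothetical helper: add the format while keeping any effort already set,
// instead of replacing output_config wholesale.
function mergeJsonSchemaFormat(
  existing: OutputConfig | undefined,
  schema: Record<string, unknown>,
): OutputConfig {
  return { ...existing, format: { type: "json_schema", schema } };
}

// Mirrors the order in the diff: effort is assigned first, format later.
const requestBody: { output_config?: OutputConfig } = {};
requestBody.output_config ??= {};
requestBody.output_config.effort = "medium";
requestBody.output_config = mergeJsonSchemaFormat(requestBody.output_config, {
  type: "object",
});
```

After the merge, `requestBody.output_config` carries both the bucketed `effort` and the `format` block, so the requested budget still influences adaptive depth on structured-output calls.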


docs(reasoning): document Opus 4.6/4.7/4.8 adaptive reasoning

2b0d3e8

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


steebchen changed the title ~~docs(reasoning): document Opus 4.6/4.7/4.8 adaptive reasoning~~ feat(reasoning): bucket max_tokens to effort on adaptive Opus Jun 19, 2026


steebchen wants to merge 3 commits into main from opus-4-6-reasoning-regression
