fix(llm): map max_completion_tokens to max_output_tokens for Responses API by truffle-dev · Pull Request #438 · HKUDS/DeepTutor

truffle-dev · 2026-05-03T02:20:00Z

What

When the Responses API path is selected for newer OpenAI models (gpt-5.x, o1, o3, o4) and an extra kwarg of `max_completion_tokens` flows in, the OpenAI SDK raises `TypeError` from `responses.create()` before any HTTP request leaves the client. Closes #437.

Why

`get_token_limit_kwargs(model, n)` returns `{"max_completion_tokens": n}` for newer models (the Chat Completions name). Both `OpenAICompatProvider` and `AzureOpenAIProvider` route those kwargs through `client.responses.create()` when `_should_use_responses_api()` matches, but the Responses API only accepts `max_output_tokens`. The SDK rejects the unknown kwarg with `TypeError` inside the client. `_should_fallback_from_responses_error()` only catches HTTP errors with a `status_code`, which a `TypeError` does not carry, so the call is never retried via Chat Completions.
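The failure mode can be reproduced without the SDK. The sketch below is illustrative only: `fake_responses_create` and `should_fallback` are hypothetical stand-ins that mimic the behaviour described above (a keyword-only signature, and a fallback check keyed on `status_code`); they are not the project's or the SDK's actual code.

```python
from typing import Any, Optional


def fake_responses_create(
    *, model: str, input: str, max_output_tokens: Optional[int] = None
) -> dict[str, Any]:
    """Hypothetical stand-in for client.responses.create(): it accepts only
    the keyword names listed in its signature."""
    return {"model": model, "max_output_tokens": max_output_tokens}


def should_fallback(exc: Exception) -> bool:
    """Hypothetical stand-in for the fallback check: only errors that carry
    a status_code qualify for a Chat Completions retry."""
    return getattr(exc, "status_code", None) is not None


def call_with(kwargs: dict[str, Any]) -> str:
    """Call the stand-in and report what the caller would observe."""
    try:
        fake_responses_create(model="gpt-5.5", input="hi", **kwargs)
        return "ok"
    except Exception as exc:
        # Python raises TypeError for the unknown keyword before the function
        # body runs; it has no status_code, so no fallback is attempted.
        return "fallback" if should_fallback(exc) else f"raised {type(exc).__name__}"


print(call_with({"max_completion_tokens": 256}))  # raised TypeError
print(call_with({"max_output_tokens": 256}))      # ok
```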

The reporter in #437 hit this with `LLM_BINDING = openai` and `LLM_MODEL = gpt-5.5` on v1.3.5, which routes through `OpenAICompatProvider`.

How

Add `adapt_chat_kwargs_to_responses()` to `provider_core/openai_responses/converters.py` and use it at all four merge sites: `chat` and `chat_stream` on both `OpenAICompatProvider` (lines 698, 781) and `AzureOpenAIProvider` (lines 126, 154).

The helper:

- preserves the existing `None`-drop semantics of the previous dict-comprehension merge,
- maps `max_completion_tokens` → `max_output_tokens`,
- only applies the alias when `max_output_tokens` is not explicitly set by the caller,
- does not mutate the input.
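A minimal sketch of a helper with those four properties follows. It is reconstructed from the description only and is not the merged implementation: the signature and body are assumptions, as is the choice to treat an explicit `max_output_tokens=None` as "not set" (it is dropped before the alias is applied).

```python
from typing import Any, Mapping


def adapt_chat_kwargs_to_responses(kwargs: Mapping[str, Any]) -> dict[str, Any]:
    """Illustrative sketch: translate Chat Completions-style kwargs into
    names accepted by the Responses API."""
    # Build a new dict, dropping None values; the input is never modified.
    adapted = {key: value for key, value in kwargs.items() if value is not None}

    # Always remove the Chat Completions name so it cannot reach the SDK.
    alias_value = adapted.pop("max_completion_tokens", None)

    # Apply the alias only when the caller did not set the Responses name.
    if alias_value is not None and "max_output_tokens" not in adapted:
        adapted["max_output_tokens"] = alias_value
    return adapted


original = {"temperature": 0.2, "max_completion_tokens": 512, "top_p": None}
print(adapt_chat_kwargs_to_responses(original))
# {'temperature': 0.2, 'max_output_tokens': 512}
```

When both names are present, the explicit `max_output_tokens` wins and the alias is discarded, so `{"max_completion_tokens": 512, "max_output_tokens": 128}` becomes `{"max_output_tokens": 128}`.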

Tests

`tests/services/llm/test_openai_responses_converters.py` covers seven cases: passthrough, None-drop, the rename, None for the alias, explicit-name precedence, empty input, and input non-mutation. `pytest tests/services/llm/test_openai_responses_converters.py -v` is green; the rest of `tests/services/llm/` and `tests/services/config/test_llm_probe_config.py` pass with no new failures from this branch (75/76; the one unrelated failure is a missing `data/user/settings/agents.yaml` fixture).

Note on #390

There is an existing open PR (#390, "[codex] Normalize Azure OpenAI max token aliases") that addresses a related problem at the factory layer for Azure only by translating to `max_tokens`. The reproduction in #437 uses `LLM_BINDING = openai`, which goes through `OpenAICompatProvider` and is outside that PR's scope. Happy to defer if a different shape is preferred.

…s API

When the Responses API path is selected for newer OpenAI models (gpt-5.x, o1, o3, o4) and an extra kwarg of `max_completion_tokens` flows in (typically from `get_token_limit_kwargs(model, n)`), the OpenAI SDK raises `TypeError` from `responses.create()` before any HTTP request leaves the client. `_should_fallback_from_responses_error()` only catches HTTP errors that carry a `status_code`, so the call is never retried via Chat Completions; the user just sees a stack trace.

The bug exists on both `OpenAICompatProvider` and `AzureOpenAIProvider`, in the `chat` and `chat_stream` paths (4 sites total).

Add `adapt_chat_kwargs_to_responses()` that translates the alias to `max_output_tokens` and is used at all four merge sites. The helper:

- preserves the existing `None`-drop semantics of the previous dict-comprehension merge,
- only applies the alias when `max_output_tokens` is not explicitly set by the caller,
- does not mutate the input.

Closes HKUDS#437.

pancacake · 2026-05-03T03:58:01Z

Thanks for your contribution!

pancacake merged commit abe8020 into HKUDS:dev May 3, 2026
7 checks passed

truffle-dev mentioned this pull request May 3, 2026

[Bug]: AsyncResponses.create() got an unexpected keyword argument 'max_completion_tokens' when using OpenAI gpt-5.x models #437

Closed

2 tasks

pancacake merged 1 commit into HKUDS:dev from truffle-dev:fix/responses-api-max-completion-tokens-mapping
