Conversation
…ization
Demonstrates the vulnerability: when streamed tool call arguments form
valid JSON before all chunks have arrived (e.g., {"query": "test"} when
the full args are {"query": "test", "limit": 10}), the tool call is
finalized prematurely with incomplete arguments.
This test is expected to FAIL until the fix is applied.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…le partial JSON Streaming tool call arguments were finalized using isParsableJson() as a heuristic for completion. If partial accumulated JSON happened to be valid JSON before all chunks arrived, the tool call would be executed with incomplete arguments (e.g., missing safety-relevant parameters). Move tool call finalization from inline isParsableJson checks to the flush() handler, which only runs after the stream is fully consumed. This ensures tool calls always receive complete arguments. Affected providers: openai, openai-compatible, groq, deepseek, alibaba. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator
Author
|
This should be good to go now. I also wondered if we can get rid of the code duplication, I created a draft PR of how that could look like: |
felixarntz
approved these changes
Mar 5, 2026
9 tasks
vercel-ai-sdk bot
pushed a commit
that referenced
this pull request
Mar 5, 2026
…le partial JSON (#13137) ## Background In 5 OpenAI-compatible providers (`openai`, `openai-compatible`, `groq`, `deepseek`, `alibaba`), streaming tool call arguments were finalized using `isParsableJson()` as a heuristic for completion. If partial accumulated JSON happened to be valid JSON before all chunks arrived, the tool call would be executed with incomplete arguments, potentially dropping safety-relevant parameters. For example, a model streaming `{"query": "test", "limit": 10}` might produce `{"query": "test"}` as an intermediate value — which is valid JSON. The tool call would be finalized early with only `{"query": "test"}`, and the `"limit": 10` parameter would be silently lost. Providers not affected: `anthropic` (uses explicit `content_block_stop` events), OpenAI Responses API (different protocol). ## Summary - Removed all inline `isParsableJson()` checks from the streaming tool call delta handlers across 5 providers - Moved tool call finalization to the `flush()` handler, which only runs after the stream is fully consumed, ensuring tool calls always receive complete arguments - Added a regression test demonstrating the vulnerability (first commit fails against unfixed code, second commit fixes it) - Updated existing test snapshots to reflect the new ordering (tool calls finalize in `flush()` after text-end, rather than inline during streaming) ## Manual Verification The first commit (`573a744`) adds the regression test only — it fails against the unfixed code, proving the vulnerability exists: ``` - Expected: input: '{"query": "test"}, "limit": 10}' + Received: input: '{"query": "test"}' ``` The second commit (`7ae73dd`) applies the fix, making all tests pass. ## Checklist - [x] Tests have been added / updated (for bug fixes / features) - [ ] Documentation has been added / updated (for bug fixes / features) - [x] A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root) - [x] I have reviewed this pull request (self-review) ## Related Issues VULN-6861 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
|
✅ Backport PR created: #13145 |
lgrammel
reviewed
Mar 6, 2026
| // Regression test: if streamed tool call arguments form valid JSON before | ||
| // all chunks have arrived, the tool call must NOT be finalized early. | ||
| // For example, {"query": "test"} is valid JSON but the full args are | ||
| // {"query": "test", "limit": 10}. Finalizing early would lose "limit". |
Collaborator
There was a problem hiding this comment.
this argument is not correct
Contributor
|
🚀 Published in:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
In 5 OpenAI-compatible providers (
openai,openai-compatible,groq,deepseek,alibaba), streaming tool call arguments were finalized usingisParsableJson()as a heuristic for completion. If partial accumulated JSON happened to be valid JSON before all chunks arrived, the tool call would be executed with incomplete arguments, potentially dropping safety-relevant parameters.For example, a model streaming
{"query": "test", "limit": 10}might produce{"query": "test"}as an intermediate value — which is valid JSON. The tool call would be finalized early with only{"query": "test"}, and the"limit": 10parameter would be silently lost.Providers not affected:
anthropic(uses explicitcontent_block_stopevents), OpenAI Responses API (different protocol).Summary
isParsableJson()checks from the streaming tool call delta handlers across 5 providersflush()handler, which only runs after the stream is fully consumed, ensuring tool calls always receive complete argumentsflush()after text-end, rather than inline during streaming)Manual Verification
The first commit (
573a744) adds the regression test only — it fails against the unfixed code, proving the vulnerability exists:The second commit (
7ae73dd) applies the fix, making all tests pass.Checklist
pnpm changesetin the project root)Related Issues
VULN-6861