
Fix: Claude Code VS Code extension fails to parse responses when every chunk includes usage data#3670

Open
zhaohuiweixiao wants to merge 3 commits into higress-group:main from zhaohuiweixiao:fix_chunkUsage

Conversation

@zhaohuiweixiao

@zhaohuiweixiao zhaohuiweixiao commented Apr 1, 2026

Ⅰ. Describe what this PR did

Fix bug: Claude Code VS Code extension fails to parse responses when every chunk includes usage data.
(screenshot)

In the ai-proxy plugin, the logic that converts the OpenAI protocol to the Claude protocol treats any chunk containing usage as the end of the stream. When the model attaches usage to every chunk, the plugin therefore emits a message_stop event prematurely, causing parsing failures in the Claude Code VS Code extension.

Model output example:

```
data: {"id":"019d3d99245a5dd32971f17e72e2e4e3","object":"chat.completion.chunk","created":1774854940,"model":"Minimax-M2.5","choices":[{"index":0,"delta":{"role":"assistant","content":""}}],"system_fingerprint":"","usage":{"prompt_tokens":40,"completion_tokens":0,"total_tokens":40,"prompt_tokens_details":{},"completion_tokens_details":{}}}

data: {"id":"019d3d99245a5dd32971f17e72e2e4e3","object":"chat.completion.chunk","created":1774854940,"model":"Minimax-M2.5","choices":[{"index":0,"delta":{"role":"assistant","content":"","reasoning_content":"用户"}}],"system_fingerprint":"","usage":{"prompt_tokens":40,"completion_tokens":1,"total_tokens":41,"prompt_tokens_details":{},"completion_tokens_details":{"reasoning_tokens":1}}}

data: {"id":"019d3d99245a5dd32971f17e72e2e4e3","object":"chat.completion.chunk","created":1774854940,"model":"Minimax-M2.5","choices":[{"index":0,"delta":{"role":"assistant","content":"","reasoning_content":"用"}}],"system_fingerprint":"","usage":{"prompt_tokens":40,"completion_tokens":2,"total_tokens":42,"prompt_tokens_details":{},"completion_tokens_details":{"reasoning_tokens":2}}}
```

ai-proxy output:
(screenshot)

Ⅱ. Does this pull request fix one issue?

Yes.

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Use whether the chunk contains a finish_reason as the end-of-stream indicator.

Ⅴ. Special notes for reviews

Ⅵ. AI Coding Tool Usage Checklist (if applicable)

Please check all applicable items:

  • For new standalone features (e.g., new wasm plugin or golang-filter plugin):

    • I have created a design/ directory in the plugin folder
    • I have added the design document to the design/ directory
    • I have included the AI Coding summary below
  • For regular updates/changes (not new plugins):

    • I have provided the prompts/instructions I gave to the AI Coding tool below
    • I have included the AI Coding summary below

AI Coding Prompts (for regular updates)

AI Coding Summary

@zhaohuiweixiao
Author

zhaohuiweixiao commented Apr 1, 2026

@CH3CHO



Comment thread plugins/wasm-go/extensions/ai-proxy/provider/claude_to_openai.go Outdated
@CH3CHO
Collaborator

CH3CHO commented Apr 1, 2026

This change is not quite right. When handling streaming requests that contain "stream_options": { "include_usage": true }, some LLM services send an additional usage chunk after the chunk that carries "finish_reason":"stop". The current change may therefore still have a logic problem.

Streaming response example (https://api.moonshot.cn/v1/chat/completions + moonshot-v1-8k):

data: {"id":"chatcmpl-69ccb4efd68e87fdb659f8c9","object":"chat.completion.chunk","created":1775023343,"model":"moonshot-v1-8k","choices":[{"index":0,"delta":{"content":"。"},"finish_reason":null}],"system_fingerprint":"fpv0_3f4f71b6"}

data: {"id":"chatcmpl-69ccb4efd68e87fdb659f8c9","object":"chat.completion.chunk","created":1775023343,"model":"moonshot-v1-8k","choices":[{"index":0,"delta":{},"finish_reason":"stop","usage":{"prompt_tokens":12,"completion_tokens":7,"total_tokens":19}}],"system_fingerprint":"fpv0_3f4f71b6"}

data: {"id":"chatcmpl-69ccb4efd68e87fdb659f8c9","object":"chat.completion.chunk","created":1775023343,"model":"moonshot-v1-8k","choices":[],"usage":{"prompt_tokens":12,"completion_tokens":7,"total_tokens":19}}

data: [DONE]


…y chunk includes usage data

Signed-off-by: zhaohuihui <zhaohuihui_yewu@cmss.chinamobile.com>
@zhaohuiweixiao
Author

Changed to send the message_stop event when [DONE] is received.

@CH3CHO
Collaborator

CH3CHO commented Apr 2, 2026

Changed to send the message_stop event when [DONE] is received.

Could there be scenarios where the server does not return [DONE]? The code needs further analysis and test verification.

https://chat.deepseek.com/share/5zr9crsqyr04fyjj4q

@zhaohuiweixiao
Author

Changed to send the message_stop event when [DONE] is received.

Could there be scenarios where the server does not return [DONE]? The code needs further analysis and test verification.

https://chat.deepseek.com/share/5zr9crsqyr04fyjj4q

Do you mean testing whether the mainstream implementations mentioned in that document all return [DONE]?
vLLM, LocalAI, Llama.cpp: follow the OpenAI spec and return [DONE] at the end of the stream.

HuggingFace TGI: sends [DONE]\n (with a trailing newline), which could cause parsing problems, but this has been fixed.

FastChat: follows the OpenAI spec and supports streaming output.

Ollama: does not send [DONE]; instead it marks the final JSON chunk of the stream with "done": true.

Collaborator

@CH3CHO CH3CHO left a comment


LGTM

@codecov-commenter

codecov-commenter commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...o/extensions/ai-proxy/provider/claude_to_openai.go 0.00% 5 Missing ⚠️


