App Version: 3.3.1
Platform: Both (provider-independent; affects any text-only model not in the pi-ai registry)
Bug Description
When using a text-only model that isn't in pi-ai's registry (e.g. deepseek-v4-pro via Ollama Cloud), any request that carries image content (screenshots from the GUI/computer-use tools, or pasted images) fails with a provider 400. The UI surfaces this as an opaque "请求被上游拒绝(400)… invalid message format" ("Request rejected by upstream (400)… invalid message format") or simply "invalid message format".
Steps to Reproduce
- Configure provider Ollama Cloud, model deepseek-v4-pro, base URL https://ollama.com/v1.
- Run a cowork task where the agent captures a screenshot (or otherwise include an image in the turn).
- The request fails with a 400.
Expected Behaviour
Images are dropped for a text-only model and the request succeeds as a text-only request.
Actual Behaviour
Hard 400 from the provider. Direct reproduction against the endpoint:
POST https://ollama.com/v1/chat/completions (image_url content, model deepseek-v4-pro)
HTTP 400 {"error":"this model does not support image input"}
The same model/endpoint returns 200 for text-only requests (verified with system + user, tool round-trips, streaming, reasoning_effort, etc.).
Root Cause
deepseek-v4-pro isn't in pi-ai's registry, so the app builds a synthetic model (log: [ClaudeAgentRunner] Model not in pi-ai registry, using synthetic model: deepseek-v4-pro → openai-completions). buildSyntheticPiModel hard-codes:
// src/main/claude/pi-model-resolution.ts
input: ['text', 'image'],
Because the synthetic model advertises image support, the openai-completions provider's convertMessages does not filter image content (it only filters when !model.input.includes("image")). So image blocks are sent to a text-only endpoint → 400. The app then shows its generic "invalid message format".
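The filtering condition described above can be illustrated with a small sketch. This is not the pi-ai source: the types and the `filterContentForModel` helper are hypothetical, simplified stand-ins, and only the `!model.input.includes("image")` check is taken from this report.

```typescript
// Hypothetical, simplified shapes; the real pi-ai types are richer.
type Modality = "text" | "image";

type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

interface SketchModel {
  id: string;
  input: Modality[];
}

// Mirrors the condition quoted in this report: image blocks are dropped
// only when the model does not list "image" among its inputs.
function filterContentForModel(
  model: SketchModel,
  blocks: ContentBlock[],
): ContentBlock[] {
  if (!model.input.includes("image")) {
    return blocks.filter((block) => block.type !== "image");
  }
  return blocks;
}

const turn: ContentBlock[] = [
  { type: "text", text: "What is on screen?" },
  { type: "image", data: "<base64>", mimeType: "image/png" },
];

// The synthetic model as currently built: it claims image support,
// so the image block passes through to the text-only endpoint.
const claimsVision: SketchModel = {
  id: "deepseek-v4-pro",
  input: ["text", "image"],
};
const sentWithVisionClaim = filterContentForModel(claimsVision, turn);

// A model declared text-only: the image block is filtered out.
const textOnly: SketchModel = { id: "some-text-only-model", input: ["text"] };
const sentAsTextOnly = filterContentForModel(textOnly, turn);
```

With the vision claim, both blocks are forwarded and the provider rejects the request. With a text-only declaration, only the text block remains.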
Suggested Fix
Default synthetic models to input: ['text']. We can't know whether an arbitrary model supports vision, and a false vision claim hard-fails the whole request, whereas text-only merely drops images gracefully. Vision-capable models resolved from the pi-ai registry keep their real modalities; only synthetic fallbacks change. A PR implementing this follows.
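A minimal sketch of the proposed default, under stated assumptions: the real signature of buildSyntheticPiModel is not shown in this report, so `buildSyntheticModelSketch`, `resolveModelSketch`, the `synthetic` flag, and the registry entry below are all hypothetical names used for illustration only.

```typescript
type Modality = "text" | "image";

interface SketchModel {
  id: string;
  api: string;
  input: Modality[];
  synthetic: boolean; // hypothetical flag, for illustration only
}

// Hypothetical stand-in for the pi-ai registry, with a made-up entry.
const registry = new Map<string, SketchModel>([
  [
    "example-vision-model",
    {
      id: "example-vision-model",
      api: "openai-completions",
      input: ["text", "image"],
      synthetic: false,
    },
  ],
]);

// Synthetic fallbacks default to text-only: a false vision claim
// hard-fails the request, while text-only only drops images.
function buildSyntheticModelSketch(id: string, api: string): SketchModel {
  return { id, api, input: ["text"], synthetic: true };
}

// Registry models keep their declared modalities; only unknown
// models go through the synthetic fallback.
function resolveModelSketch(id: string, api: string): SketchModel {
  return registry.get(id) ?? buildSyntheticModelSketch(id, api);
}

const fallback = resolveModelSketch("deepseek-v4-pro", "openai-completions");
const known = resolveModelSketch("example-vision-model", "openai-completions");
```

Here `fallback.input` is `["text"]`, so image blocks would be filtered before the request is sent, while `known` still advertises image input from its registry entry.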
Found alongside #248 / #249.