Hello, I have been trying to evaluate modern multimodal models using your framework. I have tested gemini-1.5-flash (from Google AI) and mistralai/mistral-small-3.1-24b-instruct:free (from OpenRouter).
In both cases, the API call is successful, but the evaluation fails with the error: Error parsing tool call: 'image'.
This suggests that the project's agent logic cannot parse the tool-call output these models produce for image-related tasks. The parsing seems to be tightly coupled to the output format of the original Qwen model.
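To illustrate what I suspect is happening, here is a minimal sketch. I have not traced this in the project's source, so everything in it is an assumption: the function names, the JSON layout of the tool call, and the alternative key names (image_url, img) are all hypothetical. The only thing taken from the actual run is the error text, which looks like what Python prints for a KeyError on a key named image.

```python
import json

# Hypothetical strict parser: assumes every tool call carries an "image"
# argument, as a parser written against a single model's format might.
def parse_tool_call_strict(raw: str) -> dict:
    call = json.loads(raw)
    args = call["arguments"]
    return {"name": call["name"], "image": args["image"]}  # KeyError if absent


# Hypothetical tolerant parser: accepts a few alternative key names and
# arguments that arrive as a JSON-encoded string instead of an object.
def parse_tool_call_tolerant(raw: str, image_keys=("image", "image_url", "img")) -> dict:
    call = json.loads(raw)
    args = call.get("arguments", {})
    if isinstance(args, str):
        args = json.loads(args)
    image = next((args[k] for k in image_keys if k in args), None)
    return {"name": call.get("name"), "image": image}


# A made-up tool call that names the image argument differently.
raw = '{"name": "zoom", "arguments": {"image_url": "img_0"}}'

try:
    parse_tool_call_strict(raw)
except KeyError as exc:
    print(f"Error parsing tool call: {exc}")  # Error parsing tool call: 'image'

print(parse_tool_call_tolerant(raw))  # {'name': 'zoom', 'image': 'img_0'}
```

If the real parser fails in a similar way, a fallback along these lines (or a per-model adapter) might let other models run without changing the behavior for Qwen.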