[Bug] Can't train Qwen3.5 or Gemma4 on multimodal datasets in Unsloth Studio #4859

@virusapex

Description

Note: Please do not remove the questions. Answer beside them.

  1. Did you update? pip install --upgrade unsloth unsloth_zoo
    I am running a Docker image with tag 2026.4.2-pt2.9.0-vllm-0.16.0-cu12.8-studio-release-v0.1.35-beta
  2. Colab or Kaggle or local / cloud
    Local
  3. Number GPUs used, use nvidia-smi
    A single H100 GPU
  4. Which notebook? Please link!
    Unsloth Studio
  5. Which Unsloth version, TRL version, transformers version, PyTorch version?
    Unsloth 2026.4.2, TRL 0.23.1, Transformers 4.57.1, PyTorch 2.9.1+cu128
  6. Which trainer? SFTTrainer, GRPOTrainer etc
    I assume SFT by default, but the trainer type is not shown anywhere in Studio.

I tried my custom .parquet dataset with image and text columns, as well as the unsloth/LaTeX_OCR example, but both produce the same error: "Text model is not compatible with a multimodal dataset. Switch to a vision model or choose a text-only dataset." This happens on both unsloth/Qwen3.5-4B and unsloth/gemma-4-E2B-it, even though both models have vision capabilities. However, unsloth/Qwen3-VL-4B-Instruct worked fine and trained successfully on the same dataset.
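For anyone triaging: the symptom looks like the dataset is correctly detected as multimodal, but the vision capability of these two models is not. Below is a minimal sketch of the kind of gating that could produce this error; everything here (`IMAGE_COLUMN_NAMES`, `is_multimodal_dataset`, `check_compatibility`) is hypothetical and not Unsloth's actual code, it only reproduces the reported behavior.

```python
# Hypothetical sketch of a dataset/model compatibility gate.
# NOT Unsloth's actual implementation -- names and logic are assumptions,
# written only to illustrate where the reported check could go wrong.

IMAGE_COLUMN_NAMES = {"image", "images", "pixel_values"}  # assumed convention


def is_multimodal_dataset(column_names):
    """Return True if any column name looks like it holds image data."""
    return any(name.lower() in IMAGE_COLUMN_NAMES for name in column_names)


def check_compatibility(model_is_vision, column_names):
    """Mimic the reported error: text model + multimodal dataset -> reject."""
    if is_multimodal_dataset(column_names) and not model_is_vision:
        raise ValueError(
            "Text model is not compatible with a multimodal dataset. "
            "Switch to a vision model or choose a text-only dataset."
        )
    return True


# unsloth/LaTeX_OCR has "image" and "text" columns, so the dataset side of
# the check passes; the failure suggests model_is_vision comes back False
# for Qwen3.5-4B and gemma-4-E2B-it even though both support vision.
```

If the capability probe relies on a model-class or config-field allowlist, newly released architectures would be misclassified as text-only exactly like this, while an explicitly named VL model (Qwen3-VL) passes.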
