[Bug] Can't train Qwen3.5 or Gemma4 on multimodal datasets in Unsloth Studio #4859

@virusapex

Description

Note: Please do not remove the questions. Answer beside them.

  1. Did you update? pip install --upgrade unsloth unsloth_zoo
    I am running a Docker image with tag 2026.4.2-pt2.9.0-vllm-0.16.0-cu12.8-studio-release-v0.1.35-beta
  2. Colab or Kaggle or local / cloud
    Local
  3. Number GPUs used, use nvidia-smi
    A single H100 GPU
  4. Which notebook? Please link!
    Unsloth Studio
  5. Which Unsloth version, TRL version, transformers version, PyTorch version?
    Unsloth 2026.4.2, TRL 0.23.1, Transformers 4.57.1, PyTorch 2.9.1+cu128
  6. Which trainer? SFTTrainer, GRPOTrainer etc
    I assume SFT by default, but the trainer type is not shown anywhere in Studio.

I tried my custom .parquet dataset with image and text columns, as well as the unsloth/LaTeX_OCR example, but both produce the same error: "Text model is not compatible with a multimodal dataset. Switch to a vision model or choose a text-only dataset." This happens on both unsloth/Qwen3.5-4B and unsloth/gemma-4-E2B-it, even though both models have vision capabilities. However, unsloth/Qwen3-VL-4B-Instruct worked fine and trained successfully on the same dataset.
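For anyone triaging: the symptom looks like the dataset is correctly detected as multimodal, but the vision capability of these two models is not. Below is a minimal sketch of the kind of gating that could produce this error; everything here (`IMAGE_COLUMN_NAMES`, `is_multimodal_dataset`, `check_compatibility`) is hypothetical and not Unsloth's actual code, it only reproduces the reported behavior.

```python
# Hypothetical sketch of a dataset/model compatibility gate.
# NOT Unsloth's actual implementation -- names and logic are assumptions,
# written only to illustrate where the reported check could go wrong.

IMAGE_COLUMN_NAMES = {"image", "images", "pixel_values"}  # assumed convention


def is_multimodal_dataset(column_names):
    """Return True if any column name looks like it holds image data."""
    return any(name.lower() in IMAGE_COLUMN_NAMES for name in column_names)


def check_compatibility(model_is_vision, column_names):
    """Mimic the reported error: text model + multimodal dataset -> reject."""
    if is_multimodal_dataset(column_names) and not model_is_vision:
        raise ValueError(
            "Text model is not compatible with a multimodal dataset. "
            "Switch to a vision model or choose a text-only dataset."
        )
    return True


# unsloth/LaTeX_OCR has "image" and "text" columns, so the dataset side of
# the check passes; the failure suggests model_is_vision comes back False
# for Qwen3.5-4B and gemma-4-E2B-it even though both support vision.
```

If the capability probe relies on a model-class or config-field allowlist, newly released architectures would be misclassified as text-only exactly like this, while an explicitly named VL model (Qwen3-VL) passes.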
