fix: load embed_tokens and norm for ColQwen2/ColQwen2_5 on transformers 5.x #414

Merged
QuentinJGMace merged 2 commits into illuin-tech:main from whybe-choi:fix-colqwen2-key-mapping
Jun 10, 2026

Conversation

@whybe-choi

Contributor

Summary

Fixes #413.

On transformers 5.x, Qwen2VLModel.__init__ nests the text backbone under language_model, so the expected keys are language_model.embed_tokens, language_model.layers.*, language_model.norm. Released Qwen2-VL checkpoints store these as model.embed_tokens, model.layers.*, model.norm.

ColQwen2._checkpoint_conversion_mapping only remapped model.layers, so embed_tokens and norm were silently dropped and randomly re-initialized when loading from a Qwen2-VL checkpoint. ColQwen2_5 had the same incomplete mapping and was affected identically.
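The failure mode described above can be reproduced in isolation. The sketch below is only an approximation: it assumes the mapping is applied as "regex pattern to replacement" on each checkpoint key, which simplifies whatever transformers does internally. The helper `rename_key` and the sample key names are hypothetical and chosen for illustration.

```python
import re

# Incomplete mapping as described in the summary (state before this PR).
OLD_MAPPING = {
    r"^base_model\.model\.custom_text_proj": "custom_text_proj",
    r"^model\.layers": "language_model.layers",
}


def rename_key(key: str, mapping: dict[str, str]) -> str:
    """Hypothetical helper: apply the first matching rule, else return the key unchanged."""
    for pattern, replacement in mapping.items():
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key


# Illustrative checkpoint keys in the released Qwen2-VL layout (model.*).
checkpoint_keys = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.norm.weight",
]

# Keys the model expects once the text backbone is nested under language_model.
expected_keys = {
    "language_model.embed_tokens.weight",
    "language_model.layers.0.self_attn.q_proj.weight",
    "language_model.norm.weight",
}

loaded = {rename_key(k, OLD_MAPPING) for k in checkpoint_keys}
missing = sorted(expected_keys - loaded)      # expected by the model, not provided
unexpected = sorted(loaded - expected_keys)   # provided, but matching no parameter

print("MISSING:", missing)
print("UNEXPECTED:", unexpected)
```

Only the `model.layers` key is renamed here, so the embedding and final norm tensors end up on the unexpected side while their targets are missing.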

Changes

Add the missing rules to both modeling_colqwen2.py and modeling_colqwen2_5.py:

_checkpoint_conversion_mapping = {
    r"^base_model\.model\.custom_text_proj": "custom_text_proj",
    r"^model\.layers": "language_model.layers",
    r"^model\.embed_tokens": "language_model.embed_tokens",
    r"^model\.norm": "language_model.norm",
}

This is backward-compatible: the rules only match keys starting with model., so checkpoints already saved in the language_model.* layout are unaffected.
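The backward-compatibility claim can be checked with a small sketch. It again assumes the rules behave like `re.sub` applied to each key, which may not match the library's exact implementation; `rename_key` and the sample keys (including the `visual.` one) are hypothetical.

```python
import re

# Mapping after this PR, copied from the change above.
NEW_MAPPING = {
    r"^base_model\.model\.custom_text_proj": "custom_text_proj",
    r"^model\.layers": "language_model.layers",
    r"^model\.embed_tokens": "language_model.embed_tokens",
    r"^model\.norm": "language_model.norm",
}


def rename_key(key: str, mapping: dict[str, str]) -> str:
    """Hypothetical helper: apply the first matching rule, else return the key unchanged."""
    for pattern, replacement in mapping.items():
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key


# Keys in the old model.* layout are rewritten to the nested layout.
old_layout = ["model.embed_tokens.weight", "model.norm.weight"]
converted = [rename_key(k, NEW_MAPPING) for k in old_layout]

# Keys already in the language_model.* layout (or unrelated keys) pass through,
# because every rule is anchored with ^ on a prefix they do not start with.
new_layout = [
    "language_model.embed_tokens.weight",
    "language_model.norm.weight",
    "visual.blocks.0.attn.qkv.weight",  # illustrative non-text key
]
untouched = [rename_key(k, NEW_MAPPING) for k in new_layout]

print(converted)
print(untouched == new_layout)
```

Because each pattern is anchored at the start of the key, applying the mapping twice gives the same result as applying it once.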

Result

Before the fix, language_model.embed_tokens.weight and language_model.norm.weight were reported MISSING while their source tensors were UNEXPECTED. After the fix, only the expected custom_text_proj.{weight,bias} (the ColBERT projection head, trained from scratch) remain MISSING.
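A regression check for this result could look like the sketch below. It assumes the list of missing keys has already been obtained from a loading report (for instance, the `missing_keys` entry that `from_pretrained(..., output_loading_info=True)` returns in transformers; that wiring is an assumption and is not shown). The function name and the key lists are hypothetical.

```python
def unexpectedly_missing(missing_keys, allowed_prefixes=("custom_text_proj.",)):
    """Return missing keys that are NOT part of the freshly initialized projection head."""
    return [k for k in missing_keys if not k.startswith(allowed_prefixes)]


# Illustrative reports matching the before/after description above.
missing_before = [
    "language_model.embed_tokens.weight",
    "language_model.norm.weight",
    "custom_text_proj.weight",
    "custom_text_proj.bias",
]
missing_after = ["custom_text_proj.weight", "custom_text_proj.bias"]

print(unexpectedly_missing(missing_before))  # pretrained tensors that were dropped
print(unexpectedly_missing(missing_after))   # empty list when loading is healthy
```

An empty result means every missing key belongs to the projection head, which is the only part expected to start from random initialization.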

…s 5.x

The text backbone is nested under language_model in transformers 5.x, but
released Qwen2-VL checkpoints store embed_tokens/norm under model.*. The
checkpoint conversion mapping only remapped model.layers, so embed_tokens
and norm were silently dropped and randomly re-initialized.

Add the missing rules so these pretrained tensors load correctly. The rules
only match keys starting with model., so checkpoints already saved in the
language_model.* layout are unaffected.

Fixes illuin-tech#413
@QuentinJGMace QuentinJGMace self-requested a review June 10, 2026 08:25

@QuentinJGMace QuentinJGMace left a comment

Collaborator

Thanks for finding this and for the fix! Could you also update the changelog?

Otherwise it looks good to me.

@whybe-choi

Contributor Author

Done — added the changelog entry. Thanks!

@QuentinJGMace QuentinJGMace merged commit c23838d into illuin-tech:main Jun 10, 2026
6 checks passed
@whybe-choi whybe-choi deleted the fix-colqwen2-key-mapping branch June 16, 2026 01:55

Development

Successfully merging this pull request may close these issues.

ColQwen2: embed_tokens and norm randomly initialized on transformers 5.x
