Upgrade transformers to 5.5.4 and huggingface-hub to 1.16#1524

Merged
dxqb merged 7 commits into Nerogar:master from dxqb:transformers-5.5.4 on Jun 18, 2026

dxqb (Collaborator) commented Jun 14, 2026

Summary

Reopens #1472, but only upgrades to 5.5.4, the last release before huggingface/transformers#44431 was merged, to avoid the issues described in #1506.

Test plan

  • pre-commit run --all-files passes
  • Launched the affected UI or script and exercised the change
  • Tested with at least one real preset / config when relevant (note which: Lens, Ideogram, SDXL!)

AI assistance

  • AI-assisted — I have read every line in this diff and can defend each change

dxqb and others added 2 commits June 5, 2026 21:00
5.5.4 is the last release before CLIP flattening in 5.6, which avoids
the full CLIP-compat migration while still picking up the general v5
fixes from Nerogar#1472 (Trie removal, thread-safety, hub 1.16/xet cleanup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dxqb added the "preview" label (merged in the preview branch) on Jun 16, 2026
@dxqb

This comment was marked as resolved.

HFModelLoaderMixin checked sub_module._checkpoint_conversion_mapping for
legacy checkpoint key renaming, but that attribute is just an empty {}
declared on the base PreTrainedModel class in this transformers version.
The actual renaming rules now live in transformers' central conversion
registry. Use get_checkpoint_conversion_mapping()/rename_source_key()
instead, which correctly remaps Ernie's Mistral3 text encoder
(language_model.model.* -> language_model.*) and Qwen's Qwen2_5_VL text
encoder (model.* -> model.language_model.*, visual.* -> model.visual.*).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
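
The prefix renames described in this commit can be illustrated with a minimal standalone sketch. It does not call transformers' get_checkpoint_conversion_mapping() or rename_source_key(); the rule tables and the rename_key helper below are hypothetical and only mirror the key mappings named in the message. The negative lookahead is an added assumption so that already-renamed keys are left alone.

```python
import re

# Hypothetical rule tables: (regex on the checkpoint key, replacement prefix).
# They mirror the renames named in the commit message, not the real registry.
ERNIE_RULES = [
    (r"^language_model\.model\.", "language_model."),
]
QWEN_RULES = [
    (r"^visual\.", "model.visual."),
    # Assumption: skip keys that already use the new layout.
    (r"^model\.(?!language_model\.|visual\.)", "model.language_model."),
]


def rename_key(key, rules):
    """Apply the first matching rule to a checkpoint key, else return it unchanged."""
    for pattern, replacement in rules:
        new_key, count = re.subn(pattern, replacement, key, count=1)
        if count:
            return new_key
    return key


def rename_state_dict(state_dict, rules):
    """Return a new dict with every key passed through rename_key."""
    return {rename_key(k, rules): v for k, v in state_dict.items()}
```

Applying only the first matching rule matters here: a key renamed from visual.* to model.visual.* must not then be picked up by the model.* rule.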
@dxqb

This comment was marked as resolved.

dxqb and others added 2 commits June 17, 2026 21:28
T5EncoderModel's encoder.embed_tokens.weight is tied to shared.weight and is
saved only once in the checkpoint. The manual loading path in
HFModelLoaderMixin.__load_sub_module only assigns tensors for keys present in
the state dict, so the tied key was left as an empty meta tensor, crashing
Chroma's text_encoder, Flux's text_encoder_2 and SD3's text_encoder_3 with
"Cannot copy out of meta tensor; no data!".

Generalize the fix in __load_sub_module: for every _tied_weights_keys entry
still on the meta device, clone the already-loaded, already dtype-converted
source parameter into it. Cloning (rather than aliasing via the real
tie_weights()) keeps the two parameters as separate objects, so a later
in-place quantization of one side (e.g. quantize_layers() quantizing a
quantized lm_head) can't silently corrupt the other (e.g. the embedding
table) through a shared Parameter object -- verified empirically.

This subsumes the per-loader manual workaround already used for Qwen3-based
causal LM text encoders (Flux2, ZImage), which is removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

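
The clone-versus-alias distinction in this commit can be shown with a torch-free toy model. Everything here is an assumption made for illustration: Param stands in for a parameter (data=None models a tensor still on the meta device), and tied_keys is assumed to map each tied target name to its source name. It is a sketch of the idea, not the code in __load_sub_module.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Param:
    """Toy stand-in for a parameter; data=None models an unloaded meta tensor."""
    data: Optional[list] = None

    @property
    def is_meta(self):
        return self.data is None


def materialize_tied(params, tied_keys):
    """Fill each still-unloaded tied target with a copy of its loaded source.

    tied_keys maps target name -> source name (hypothetical layout).
    """
    for target, source in tied_keys.items():
        if params[target].is_meta and not params[source].is_meta:
            # Copy the values instead of sharing the object, so a later
            # in-place change to one side cannot reach the other.
            params[target] = Param(list(params[source].data))
    return params


params = {
    "shared.weight": Param([1.0, 2.0]),
    "encoder.embed_tokens.weight": Param(),  # present in the model, absent from the checkpoint
}
materialize_tied(params, {"encoder.embed_tokens.weight": "shared.weight"})

# Simulate an in-place modification (e.g. quantization) of one side only.
params["encoder.embed_tokens.weight"].data[0] = 99.0
```

Had the target been assigned the same Param object as the source, the last line would also have changed shared.weight; with a copy, the source keeps its original values.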
get_checkpoint_conversion_mapping(sub_module.config.model_type), added to
fix Ernie/Qwen text encoder loading, ran unconditionally in
__load_sub_module. Diffusers sub-modules (loaded via
_load_diffusers_sub_module, e.g. Anima's VAE) have a plain FrozenDict
config with no model_type attribute, crashing with AttributeError.
Diffusers has no such checkpoint-conversion registry and never needs this
renaming, so skip it when the config lacks model_type.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
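
The guard described in this commit amounts to a tolerant attribute lookup. A minimal sketch, assuming a hypothetical registry dict and helper name (the real lookup goes through transformers' conversion registry, which is not reproduced here):

```python
from types import SimpleNamespace

# Hypothetical registry keyed by model_type; the entry is a placeholder.
REGISTRY = {"example_model": ["example rule"]}


def conversion_rules_for(config, registry):
    """Return renaming rules for a config, or None when it has no model_type."""
    # Dict-like configs without a model_type attribute fall through to None
    # instead of raising AttributeError.
    model_type = getattr(config, "model_type", None)
    if model_type is None:
        return None
    return registry.get(model_type)


transformers_like_config = SimpleNamespace(model_type="example_model")
diffusers_like_config = {"sample_size": 32}  # plain mapping, no model_type attribute
```

Here conversion_rules_for(diffusers_like_config, REGISTRY) yields None, so the caller can skip key renaming entirely for such sub-modules.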
Comment thread modules/modelLoader/mixin/HFModelLoaderMixin.py Outdated
dxqb (Collaborator, Author) commented Jun 18, 2026

Smoke test passes on embeddings:

OK: #sd 1.5 embedding
OK: #sd 2.1 embedding
OK: #sdxl 1.0 embedding
OK: #wuerstchen 2.0 embedding

@dxqb dxqb marked this pull request as ready for review June 18, 2026 19:18
dxqb (Collaborator, Author) commented Jun 18, 2026

LoRA smoke test passed via preview branch

@dxqb dxqb merged commit 3e3b3e8 into Nerogar:master Jun 18, 2026
1 check passed
dxqb added a commit to dxqb/OneTrainer that referenced this pull request Jun 27, 2026
The ctk view refactor moved ConceptWindow.__download_dataset into the new
ConceptWindowController.download_dataset from a base predating PR Nerogar#1524, which
silently reverted that PR's fix: the method again called
huggingface_hub.login(token=..., new_session=False) with no empty-token guard.

new_session was removed from login() in huggingface-hub 1.16, so every call
raised TypeError. The surrounding except swallowed that error, so
snapshot_download never ran and dataset download was fully broken. Re-apply
Nerogar#1524 at the new location: only log in when a token is configured, and
drop new_session.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
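
The re-applied fix can be sketched as follows. This is not the code in ConceptWindowController.download_dataset: the function signature is hypothetical, and login and snapshot_download are passed in as callables so the sketch runs without network access. In real use they would be huggingface_hub.login and huggingface_hub.snapshot_download; repo_type="dataset" and local_dir are assumptions about how the download is invoked.

```python
def download_dataset(repo_id, local_dir, token, login, snapshot_download):
    """Log in only when a token is configured, then download the dataset."""
    if token:
        # No new_session argument: per the commit message it was removed
        # from login() in huggingface-hub 1.16.
        login(token=token)
    return snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=local_dir)


# Stand-ins that record calls instead of contacting the Hub.
calls = []


def fake_login(token):
    calls.append(("login", token))


def fake_snapshot_download(repo_id, repo_type, local_dir):
    calls.append(("download", repo_id))
    return local_dir


# With an empty token, login is skipped and the download still runs.
download_dataset("user/data", "/tmp/data", "", fake_login, fake_snapshot_download)
```

Keeping the login call outside any broad except that also wraps the download would make a future signature change fail loudly rather than silently disabling downloads.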