fix: smoother progress for xet downloads #4059

Open
tobocop2 wants to merge 3 commits into huggingface:main from tobocop2:fix/xet-tqdm-granularity

@tobocop2 commented Apr 6, 2026

Fixes #4058.

Problem

Xet downloads barely report progress. For a 2.5 GB file the bar updates only about 9 times in total, e.g. 0% → 1.7% → 5.6% → 11% → 25% → 84% → 100%. The jump from 25% to 84% is ~1.5 GB with no update in between, so the download looks completely stuck.

The cause: xet_get() and download_bucket_files() use a 1-arg callback. xet-core supports a 2-arg signature that gives fine-grained network-level updates instead.

Fix

make_xet_progress_callback(progress_bar, file_size) returns a 2-arg callback so xet-core gives us useful progress. Transfer bytes are scaled to file size so the bar tracks 0-100% correctly (xet dedup can make transfer size differ from file size).

Wired into both xet_get() and download_bucket_files(). No public API changes.
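
To illustrate the scaling idea described above, here is a minimal sketch. It is not the code from this PR: `sketch_scaled_callback` and `RecordingBar` are hypothetical names, and the meaning of the two callback arguments (cumulative bytes transferred so far, then total bytes expected on the wire) is an assumption made for illustration, not a confirmed xet-core signature.

```python
from typing import Callable, Optional


class RecordingBar:
    """Minimal stand-in for a tqdm bar; it only accumulates update() calls."""

    def __init__(self) -> None:
        self.n = 0.0

    def update(self, amount: float) -> None:
        self.n += amount


def sketch_scaled_callback(bar, file_size: Optional[int]) -> Callable[[int, int], None]:
    """Hypothetical helper, not the PR's implementation.

    ASSUMPTION: the callback receives (transferred, transfer_total), i.e. the
    cumulative bytes transferred so far and the total bytes expected on the wire.
    """
    reported = 0.0  # how far this callback has advanced the bar so far

    def callback(transferred: int, transfer_total: int) -> None:
        nonlocal reported
        if not file_size or transfer_total <= 0:
            # Unknown sizes: fall back to reporting raw transfer bytes.
            target = float(transferred)
        else:
            # Scale wire bytes to file bytes and cap at the file size.
            target = min(float(file_size), transferred / transfer_total * file_size)
        delta = target - reported
        if delta > 0:  # never move the bar backwards
            bar.update(delta)
            reported = target

    return callback


bar = RecordingBar()
cb = sketch_scaled_callback(bar, file_size=1000)
cb(50, 200)   # a quarter of the wire bytes have arrived
cb(200, 200)  # transfer complete
cb(300, 200)  # an overshoot is capped at file_size
```

Because each callback in this sketch tracks its own `reported` offset and only pushes deltas, several callbacks could feed one shared bar without double counting.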

Demo

Qwen3 4B (2.5 GB, xet-stored) with a custom tqdm_class callback. Builds on #4056, which fixes custom tqdm_class in non-TTY environments.

Before (bar appears stuck)

Old coarse progress

After (smooth progress)

New fine-grained progress

Note

Both downloads finish in roughly the same time (~80s). The difference is purely in how progress is reported.

Multi-file: snapshot_download Qwen3-8B (5 xet shards, 16.4 GB)

Before (bar barely moves)

Old snapshot progress

After (smooth per-file progress)

New snapshot progress

Related


Note

Low Risk
Low risk: changes are limited to progress callback wiring/scaling for Xet downloads and are covered by new unit tests; core download/auth flows are otherwise unchanged.

Overview
Xet downloads now report smooth, accurate tqdm progress. A new make_xet_progress_callback helper generates the 2-argument xet-core callback and scales network transfer bytes to the expected file size (including capping and handling unknown sizes).

This callback replaces the previous 1-arg updater in both xet_get() and HfApi.download_bucket_files(). The PR also adds targeted tests that validate the callback signature and scaling behavior (including multi-file shared progress bars).

Reviewed by Cursor Bugbot for commit d29edc7.

@tobocop2 force-pushed the fix/xet-tqdm-granularity branch from e124bb4 to dcc4347 on April 6, 2026 17:06
@tobocop2 marked this pull request as draft on April 6, 2026 17:10
@tobocop2 force-pushed the fix/xet-tqdm-granularity branch 2 times, most recently from 84bde27 to 205c369, on April 6, 2026 17:15
@tobocop2 marked this pull request as ready for review on April 6, 2026 17:17
@tobocop2 force-pushed the fix/xet-tqdm-granularity branch from 205c369 to a58216f on April 6, 2026 17:20
@tobocop2 changed the title from "fix: use fine-grained xet-core callback for smoother tqdm progress" to "fix: smoother progress bars for xet downloads" on Apr 6, 2026
@tobocop2 changed the title from "fix: smoother progress bars for xet downloads" to "fix: smoother progress for xet downloads" on Apr 6, 2026
@tobocop2 force-pushed the fix/xet-tqdm-granularity branch 4 times, most recently from 26d9fca to 9a69c7a, on April 7, 2026 02:01
Switch xet_get() from a 1-arg to 2-arg callback so xet-core reports
progress frequently instead of barely at all.

Fixes huggingface#4058
@tobocop2 force-pushed the fix/xet-tqdm-granularity branch from 9a69c7a to 6a8f8a3 on April 7, 2026 04:54

Development

Successfully merging this pull request may close these issues.

bug: xet downloads barely report progress (bar appears stuck on large files)
