Playground v2.5 + SDXL Lightning: add PCC-gated nightly + demo scripts#5480

Open
kamalrajkannan78 wants to merge 1 commit into main from kkannan/jun1_add_demo_and_pcc_check
@kamalrajkannan78 kamalrajkannan78 commented Jul 1, 2026

Ticket

Problem description

  • E2E pipeline tests for Playground v2.5 and SDXL-Lightning had no correctness gate: they ran the full pipeline on TT but never compared outputs against a CPU reference, so a PCC drop in any component (text encoders, UNet, VAE) went undetected.
  • No standalone Python demo scripts existed for users to run these models end-to-end.
  • Nightly tests included post-processing (saving the image to PNG) that belongs in a demo, not a correctness gate.
  • Redundant per-component tests were the only source of numerical checks, kept alive purely because the e2e nightly lacked PCC comparisons.

What's changed

Split responsibilities into three: correctness gate, demo, and cleanup.

  • PCC-gated nightlies (tests/torch/models/{sdxl_lightning,playground_v2_5}/test_*_pipeline.py): after every TT component forward pass, the same input tensors are fed to a lazy-loaded CPU twin and the PCC is compared inline. If any component's PCC falls below 0.99, the test fails immediately with a clear message. The UNet is asserted per denoising step (fail-fast on the first bad step). The test no longer saves a PNG; correctness is measured against tensors, not files.
  • Demo scripts (examples/pytorch/sdxl_lightning.py, examples/pytorch/playground_v2_5.py): copies of the pipeline classes without pytest, PCC checks, or assertions, with a __main__ block that saves an output PNG. Intended for users to run python examples/pytorch/<model>.py and see the pipeline produce an image.
  • Component test removal: dropped 8 redundant per-component tests (test_text_encoder.py, test_text_encoder_2.py, test_unet.py, test_vae_decoder.py in each of the two model directories). Their coverage is now subsumed by the per-component PCC assertions in the nightly pipeline test.
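
The inline PCC gate from the first bullet can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in this PR: compute_pcc and assert_pcc are hypothetical helper names, values are flattened into plain Python lists, and the real tests presumably operate on torch tensors. Only the 0.99 threshold is taken from the description above.

```python
import math

PCC_THRESHOLD = 0.99  # threshold quoted in the PR description


def compute_pcc(expected, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(expected) != len(actual) or len(expected) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    denom = math.sqrt(var_e * var_a)
    if denom == 0.0:
        # Constant inputs: correlation is undefined, so treat identical
        # data as a match and anything else as a mismatch (an assumption).
        return 1.0 if list(expected) == list(actual) else 0.0
    return cov / denom


def assert_pcc(name, expected, actual, threshold=PCC_THRESHOLD):
    """Log the PCC and fail immediately, naming the component, if it is too low."""
    pcc = compute_pcc(expected, actual)
    print(f"[{name}] PCC = {pcc:.6f}")
    assert pcc >= threshold, f"{name}: PCC {pcc:.6f} is below {threshold}"
    return pcc
```

Calling assert_pcc("vae_decoder", cpu_output, tt_output) right after each component forward pass would give one log line per component and stop the test at the first component that falls below the threshold.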

Impact: any component-level numerical regression on TT is now caught in the nightly run, with a clear per-component PCC log and an assertion. There are fewer test files to maintain and one canonical demo path per model.
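
For illustration, the lazy-loaded CPU twin and the per-step fail-fast UNet check might be structured along these lines. Everything here is an assumption made for the sketch: LazyCpuTwin, run_denoising_with_gate, the factory callable, and the check callback are hypothetical names, and the real pipeline classes in this PR may be organized differently.

```python
from functools import cached_property


class LazyCpuTwin:
    """Builds the CPU reference model on first use, then reuses it."""

    def __init__(self, factory):
        self._factory = factory  # hypothetical: callable returning the CPU model
        self.load_count = 0      # only here to show the model is built once

    @cached_property
    def model(self):
        self.load_count += 1
        return self._factory()

    def __call__(self, *inputs):
        return self.model(*inputs)


def run_denoising_with_gate(tt_step, cpu_twin, latents, num_steps, check):
    """Run each step on TT, replay the same input on CPU, stop at the first mismatch."""
    for step in range(num_steps):
        tt_out = tt_step(latents)
        cpu_out = cpu_twin(latents)  # same input as the TT forward pass
        check(f"unet_step_{step}", cpu_out, tt_out)  # expected to raise on failure
        latents = tt_out  # the next step continues from the TT output
    return latents
```

Because check raises on the first bad step, later steps never run, and the step index in the component name shows where the divergence started.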

Checklist

  • Verify the changes with local testing: run the single test on both n150 and p150/p100

Logs

Note: E2E demo outputs are included in the demo zip files.

@kamalrajkannan78 kamalrajkannan78 force-pushed the kkannan/jun1_add_demo_and_pcc_check branch 2 times, most recently from 3f26754 to ead7d0b Compare July 1, 2026 10:17

codecov-commenter commented Jul 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 33.85%. Comparing base (827ca57) to head (796e0ee).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5480   +/-   ##
=======================================
  Coverage   33.84%   33.85%           
=======================================
  Files          37       37           
  Lines        4990     4989    -1     
=======================================
  Hits         1689     1689           
+ Misses       3301     3300    -1     


@kamalrajkannan78 kamalrajkannan78 force-pushed the kkannan/jun1_add_demo_and_pcc_check branch from 9ca1129 to 796e0ee Compare July 2, 2026 03:17
Development

Successfully merging this pull request may close these issues.

  • Model Bringup: Playground v2.5
  • Model Bringup: SDXL Lightning