[Test] Mark kokoro/Kokoro-82M single-device inference KNOWN_FAILURE_X…#5348
Draft
saiarthiraguram wants to merge 1 commit into
…FAIL Compiles + runs E2E on n150 with trained weights and produces a finite, sane-magnitude waveform, but waveform PCC tops out at ~0.14 (need 0.99): the iSTFTNet vocoder (sine-phase cumsum + STFT/iSTFT over a long 1-D waveform) is sensitive to bf16 tensor storage. Decoder output PCC is already 0.948 and the vocoder collapses it to 0.138; fp32_dest_acc_en+hifi4 is bit-identical (storage-bound, not accumulation-bound). Tracked in #5332. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ticket
Github Issue
Problem description
Bring up hexgrad/Kokoro-82M (StyleTTS2 TTS / iSTFTNet vocoder) on the tt-xla
PyTorch runner. The model now has a tt-forge-models loader (companion PR:
tenstorrent/tt-forge-models#); this PR wires it into the tt-xla single-device
inference suite and records its status.
The model compiles and runs end-to-end on n150 with trained weights and produces
a finite, sane-magnitude waveform, but waveform PCC tops out at ~0.14 (need
0.99): the iSTFTNet vocoder (sine-phase cumsum + STFT/iSTFT over a long 1-D
waveform) is sensitive to bf16 tensor storage. Decoder output PCC is already
0.948 and the vocoder collapses it to 0.138; fp32_dest_acc_en + hifi4 is
bit-identical (storage-bound, not accumulation-bound). Root cause and the full
device-vs-CPU bisect are in #5332.
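To illustrate why a sine-phase cumsum can be storage-bound, here is a minimal pure-Python sketch. It is not Kokoro's source module: the 220 Hz tone, the 24 kHz sample rate, the sample count, and the helper names `to_bf16`, `sine_from_cumsum`, and `pcc` are all assumptions made for illustration. The running phase is rounded to bf16 after every step, mimicking a tensor that is stored in bf16 no matter how precisely each add is computed.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even); finite inputs only."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lsb = (bits >> 16) & 1  # lowest mantissa bit that survives truncation
    bits = (bits + 0x7FFF + lsb) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sine_from_cumsum(n, step, store):
    """sin(cumsum(step)), with the running phase passed through `store` each step."""
    phase = 0.0
    out = []
    for _ in range(n):
        phase = store(phase + step)  # the add is exact-ish; the storage is not
        out.append(math.sin(phase))
    return out


def pcc(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Hypothetical tone: 220 Hz at a 24 kHz sample rate (illustrative values only).
STEP = 2 * math.pi * 220.0 / 24000.0
N = 8000

ref = sine_from_cumsum(N, STEP, store=float)    # phase kept in double precision
low = sine_from_cumsum(N, STEP, store=to_bf16)  # phase stored as bf16

print("PCC(ref, bf16-stored phase) =", pcc(ref, low))
```

In this toy setup the bf16 phase stops advancing once its spacing exceeds twice the per-sample step, so the stored-bf16 waveform goes flat while the reference keeps oscillating. Raising the precision of the add alone would not help here, which matches the storage-bound reading above.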
What's changed
third_party/tt_forge_models: submodule pointer bumped to include the Kokoro loader (companion tt-forge-models PR). (Apply after that PR merges.)
tests/runner/test_config/torch/test_config_inference_single_device.yaml: new kokoro/pytorch-hexgrad/Kokoro-82M-single_device-inference entry, KNOWN_FAILURE_XFAIL, reason documenting the bf16 waveform-PCC wall and linking #5332 ("Kokoro-82M: whole-model PCC collapses (-0.0019, best 0.24) - bf16 tile storage quantizes large iSTFTNet sine-phase accumulation, causing catastrophic cancellation").
python_package/tt_torch/utils.py: Dynamo guard-repr patch fix: self.get(guard) → self.get(guard.name) (a prerequisite that unblocked compilation during bringup). (Drop this hunk if it has already landed on main from another change.)
tests/torch/ops/kokoro/: two self-contained single-device op sanities reproducing the underlying device numerical-robustness gap (bf16 InstanceNorm1d variance catastrophic cancellation on a near-constant input → rsqrt → inf/FLT_MAX/nan): test_adain_sanity.py (minimal bare op) and test_adain_chain_sanity.py (Conv1d → AdaIN1d, the Kokoro structure). Both xfail(strict=False), #5332. (Currently staged on branch sai_arthi_raguram/kokoro-instancenorm-bf16-repro.)
Checklist
n150 (xfail on PCC); op sanities reproduce the root-cause gap in isolation.
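The InstanceNorm1d variance cancellation that the op sanities target can be sketched in isolation. This is a pure-Python toy, not the device kernel: the one-pass formula var = E[x²] − E[x]², the points at which values are rounded to bf16, the input values, and the helper names `to_bf16`, `rsqrt`, and `norm_scale` are all assumptions chosen to make the effect visible.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even); finite inputs only."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lsb = (bits >> 16) & 1
    bits = (bits + 0x7FFF + lsb) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rsqrt(v: float) -> float:
    """1/sqrt(v), returning inf at 0 and nan for negative input instead of raising."""
    if v < 0:
        return math.nan
    if v == 0:
        return math.inf
    return 1.0 / math.sqrt(v)


def norm_scale(xs, store, eps=1e-5):
    """Return (variance, rsqrt(variance + eps)) with every intermediate passed through `store`.

    Uses the one-pass form var = E[x^2] - E[x]^2 (an assumption; a real kernel may differ).
    """
    xs = [store(x) for x in xs]
    n = len(xs)
    mean = store(sum(xs) / n)
    mean_sq = store(sum(store(x * x) for x in xs) / n)
    var = store(mean_sq - store(mean * mean))
    return var, rsqrt(store(var + eps))


# Near-constant input: a large offset with a small ripple (illustrative values only).
xs = [100.5, 101.0] * 4

var_ref, scale_ref = norm_scale(xs, store=float)    # double precision throughout
var_low, scale_low = norm_scale(xs, store=to_bf16)  # every intermediate stored as bf16

print("reference:", var_ref, scale_ref)
print("bf16     :", var_low, scale_low)
```

With these inputs the two large, nearly equal terms round to different bf16 values, so the stored variance comes out negative and the rsqrt is no longer finite, while the double-precision path gives a variance of 0.0625.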
Logs
debug_report.md
iter_n150_13_fp32acc_realw_run.log