docs(parakeet): document int8 quantization variant (EN+FR)
- CLI-Reference: add DICTEE_PARAKEET_QUANT to Runtime env vars table
- Parakeet-TDT-Deep-Dive: new "Quantization choice — FP32 vs int8" section
with measured comparison table (latency, VRAM, WER) and hardware
recommendations.
Measured on i7-13700H + RTX 4070 (16 s audio clip, daemon preloaded, median of 7 runs):
- CPU int8 is 34% faster than CPU FP32 (AVX-VNNI exploited)
- GPU int8 is 6× slower than GPU FP32 (current ONNX Runtime CUDA EP limitation)
- int8 uses 87% less GPU VRAM than FP32 (useful when VRAM is constrained)
docs(cli): document DICTEE_INTRA_THREADS env var (EN+FR)
wiki: v1.3.4 changelog (user-facing) + fix obsolete VRAM-cap claims (EN+FR)
Changelog
- Add v1.3.4 (stable) section in user-facing prose: long files now work
on every host (CPU and GPU), PTT recording duration guard rail with
clear values per backend (Canary 2:30 / Parakeet 4:30), 5-site
target-tab UI hardening, translate skip messages in 6 languages,
diarize falls back to standalone Parakeet+Sortformer when Canary is
the live-dictation backend.
- Mark v1.3.3 as historical (drop the "(stable)" qualifier from v1.3.3;
v1.3.4 now carries it).
Obsolete VRAM-cap claims fixed across pages
- The real cap on the raw `transcribe` CLI is the ~5:20 min Parakeet-TDT
v3 ONNX attention-mask bug, not VRAM. The same error occurs on 8 GB
and 24 GB GPUs.
- Home, ASR-Backends, CLI-Reference, Troubleshooting, dev-Diarization
updated accordingly (user-facing pages: minimal jargon; Deep-Dive
pages: full technical detail).
- Parakeet-TDT-Deep-Dive: theoretical VRAM table kept; new sections
"The real cap: ~5:20 min ONNX attention-mask bug" and
"Theoretical VRAM caps (only relevant on ≤ 4 GB GPUs)".
Other v1.3.4 doc updates
- Home: chunked diarize marked as shipped in v1.3.4 (was "(v1.4)").
- CLI-Reference: `transcribe-diarize-batch` section bumped (was
"v1.3 final, not wired into UI yet" → "wired in dictee-transcribe
since v1.3.4").
- Keyboard-Shortcuts: default cheatsheet shortcut Ctrl+Alt+F9 →
Shift+F9 (v1.3.4 default) + historical context.
- Troubleshooting: "use dictee-transcribe (chunked auto)" placed
first in the OOM workaround, with `ffmpeg -f segment` as the
manual-split alternative.
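For reference, the manual-split alternative can be sketched as a small
helper that builds the `ffmpeg -f segment` command line. This is an
illustration only: the helper name, the 240 s chunk length and the
output pattern are assumptions, not values taken from the wiki.

```python
import subprocess


def build_split_command(source, chunk_seconds=240, pattern="chunk_%03d.wav"):
    """Build an ffmpeg command that cuts `source` into fixed-length chunks.

    chunk_seconds=240 is an assumed value, chosen only to stay well under
    the ~5:20 min limit mentioned above; adjust as needed.
    """
    return [
        "ffmpeg",
        "-i", source,                          # input audio file
        "-f", "segment",                       # ffmpeg's segment muxer
        "-segment_time", str(chunk_seconds),   # target chunk length in seconds
        "-c", "copy",                          # copy the stream, no re-encode
        pattern,                               # numbered output files
    ]


def split_audio(source, **kwargs):
    """Run the split; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_split_command(source, **kwargs), check=True)
```

Calling `split_audio("long-recording.wav")` would write `chunk_000.wav`,
`chunk_001.wav`, and so on, which can then be transcribed one by one.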
User-facing pages reformulated to drop dev jargon (cf.
feedback-wiki-user-not-developer.md): ONNX attention-mask /
`_has_cuda_build()` / `pgrep -x` / `/proc/<pid>/comm` / `/dev/shm` /
`kpackagetool6` reduced to user-impact prose. Deep-Dive and dev-*
pages keep full technical depth.
docs(wiki): rename FR pages with fr- prefix for auto-sidebar grouping
Rename all 20 French pages from <Name>-fr.md to fr-<Name>.md so they
cluster together in GitHub Wiki's alphabetical Pages list instead of
being interspersed with English pages by topic.
Updated all internal links across 41 files:
- Cross-language switchers at top of pages
- 'Étapes suivantes' / 'Next steps' link lists
- Inline links in page bodies
- _Sidebar.md FR link
Python script replaces '<Name>-fr' with 'fr-<Name>' in order of
decreasing length to avoid partial matches. No manual edits.
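A minimal sketch of the longest-first replacement described above, not
the actual script: the function name is hypothetical, and the page name
"Diarization" is assumed here only to show why ordering matters.

```python
def rewrite_links(text, page_names):
    """Rewrite '<Name>-fr' links to 'fr-<Name>' for every known page name.

    Names are processed longest first, so a name that is a suffix of a
    longer one (e.g. 'Diarization' inside 'dev-Diarization') cannot match
    in the middle of the longer link and corrupt it.
    """
    for name in sorted(page_names, key=len, reverse=True):
        text = text.replace(f"{name}-fr", f"fr-{name}")
    return text


# 'Diarization' is a hypothetical page name used for illustration.
names = ["Diarization", "dev-Diarization", "CLI-Reference"]
sample = "[FR](dev-Diarization-fr) [FR](Diarization-fr) [FR](CLI-Reference-fr)"
print(rewrite_links(sample, names))
```

Processing "Diarization" first would turn `dev-Diarization-fr` into
`dev-fr-Diarization`; once the longer name has been rewritten, its
`-fr` suffix is gone and the shorter name no longer matches it.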