History

Revisions

  • Diarization: note the 4-speaker limit should be lifted in v1.4 (EN+FR)

    @rcspam rcspam committed Jun 2, 2026
    fc81e0e
  • docs(troubleshooting): document Wayland clipboard / Electron typing issue (EN+FR)
    New section in both Troubleshooting pages explaining why enabling the "Copy transcription to clipboard" option breaks F9 typing into Electron apps on Wayland with wl-clipboard < 2.3.0, how to fix it (upgrade or keep the toggle off), and the GNOME caveat (mutter#524, ext-data-control still missing in stable Mutter).

    @rcspam rcspam committed May 29, 2026
    9810199
  • docs(shortcuts): document DICTEE_PTT_EXTRA_DEVICES (#10) — EN+FR
    New section under the dictee-ptt daemon: 'Triggering dictation with a mouse button or a custom key'. Covers the use case (logiops, keyd, kanata, xremap, input-remapper), both the GUI path (dictee-setup Shortcuts tab → Detect…) and the manual path (editing dictee.conf). User-facing tone, no jargon, available since v1.3.5.

    @rcspam rcspam committed May 21, 2026
    64f0a7f
  • docs(wiki): new "How-It-Works" page — accessible overview of the dictee → ORT → ONNX stack (EN+FR)
    Target audience: non-developers who want to understand what happens under the hood when they press F9. No programming background required.
    Page structure:
    - "What happens when you press F9?" — 1-sentence summary
    - Visual stack diagram (9 layers, ASCII)
    - Each layer explained with a clear role + restaurant metaphor
      1. User pressing F9
      2. dictee shell script (the conductor)
      3. transcribe-client (the courier)
      4. transcribe-daemon (the resident scribe)
      5. parakeet-rs (the classroom)
      6. ort (the interpreter)
      7. libonnxruntime.so (the engine room)
      8. Execution Provider (CPU/GPU hardware driver)
      9. ONNX model (the trained brain)
    - Why this layered architecture
    - Table of where each layer lives on disk
    - Restaurant metaphor recap
    - Links to deeper technical pages
    Also link to the new page from Home.md and fr-Home.md under "I want to understand how it works".

    @rcspam rcspam committed May 15, 2026
    cc9bb0d
  • docs(parakeet): document int8 quantization variant (EN+FR)
    - CLI-Reference: add DICTEE_PARAKEET_QUANT to Runtime env vars table
    - Parakeet-TDT-Deep-Dive: new "Quantization choice — FP32 vs int8" section with measured comparison table (latency, VRAM, WER) and hardware recommendations.
    Measured on i7-13700H + RTX 4070 (audio 16s, daemon preloaded, median 7 runs):
    - CPU int8 is 34% faster than CPU FP32 (AVX-VNNI exploited)
    - GPU int8 is 6× slower than GPU FP32 (current ONNX Runtime CUDA EP limitation)
    - GPU VRAM int8 is 87% smaller than FP32 (useful when VRAM is constrained)

    @rcspam rcspam committed May 15, 2026
    a65a303
  • docs(cli): document DICTEE_INTRA_THREADS env var (EN+FR)

    @rcspam rcspam committed May 15, 2026
    6ba5653
  • wiki: v1.3.4 changelog (user-facing) + fix obsolete VRAM-cap claims (EN+FR)
    Changelog
    - Add v1.3.4 (stable) section in user-facing prose: long files now work on every host (CPU and GPU), PTT recording duration guard rail with clear values per backend (Canary 2:30 / Parakeet 4:30), 5-site target-tab UI hardening, translate skip messages in 6 languages, diarize falls back to standalone Parakeet+Sortformer when Canary is the live-dictation backend.
    - Mark v1.3.3 as historical (drop "(stable)" qualifier on v1.3.3, bump to v1.3.4).
    Obsolete VRAM-cap claims fixed across pages
    - The real cap on the raw `transcribe` CLI is the ~5:20 min Parakeet-TDT v3 ONNX attention-mask bug, not VRAM. Same error on 8 GB or 24 GB.
    - Home, ASR-Backends, CLI-Reference, Troubleshooting, dev-Diarization updated accordingly (user-facing pages: minimal jargon; Deep-Dive pages: full technical detail).
    - Parakeet-TDT-Deep-Dive: theoretical VRAM table kept, new section "The real cap: ~5:20 min ONNX attention-mask bug" + section "Theoretical VRAM caps (only relevant on ≤ 4 GB GPUs)".
    Other v1.3.4 doc updates
    - Home: chunked diarize marked as shipped in v1.3.4 (was "(v1.4)").
    - CLI-Reference: `transcribe-diarize-batch` section bumped (was "v1.3 final, not wired into UI yet" → "wired in dictee-transcribe since v1.3.4").
    - Keyboard-Shortcuts: default cheatsheet shortcut Ctrl+Alt+F9 → Shift+F9 (v1.3.4 default) + historical context.
    - Troubleshooting: "use dictee-transcribe (chunked auto)" placed first in the OOM workaround, with `ffmpeg -f segment` as the manual-split alternative.
    User-facing pages reformulated to drop dev jargon (cf. feedback-wiki-user-not-developer.md): ONNX attention-mask / `_has_cuda_build()` / `pgrep -x` / `/proc/<pid>/comm` / `/dev/shm` / `kpackagetool6` reduced to user-impact prose. Deep-Dive and dev-* pages keep full technical depth.

    @rcspam rcspam committed May 12, 2026
    3f62c41
  • v1.3.3 — bump install commands + add changelog entries (EN+FR)

    @rcspam rcspam committed May 7, 2026
    fd16d62
  • wiki: bump v1.3.1 → v1.3.2, add v1.3.2 changelog entry (EN+FR), update Installation pages

    @rcspam rcspam committed May 5, 2026
    28b65ab
  • wiki: fix v1.3.1 lede — phrasing implied parallel runs (we actually block them)

    @rcspam rcspam committed May 4, 2026
    72712a2
  • wiki: v1.3.1 — Changelog, Troubleshooting (CUDA fallback + uinput Fedora), bumped install URLs

    @rcspam rcspam committed May 4, 2026
    d9741b6
  • v1.3.0 stable: refresh Installation/Troubleshooting/Changelog, drop obsolete v1.3 roadmap

    @rcspam rcspam committed May 3, 2026
    077878d
  • ASR-Backends: note that Nemotron is no longer in setup since v1.3

    @rcspam rcspam committed May 3, 2026
    0329638
  • Translation: add 'Translating an audio file (dictee-transcribe)' section

    @rcspam rcspam committed May 3, 2026
    b243578
  • Diarization: expand 'Going further with LLM' section with screenshots

    @rcspam rcspam committed May 3, 2026
    0085a17
  • Diarization: refresh export section for single-tab dialog (1.3.0)

    @rcspam rcspam committed May 3, 2026
    986acf2
  • Diarization: illustrate the Threshold slider in the explanation box

    @rcspam rcspam committed May 3, 2026
    385181d
  • Diarization: illustrate the audio/text sync toggles section

    @rcspam rcspam committed May 3, 2026
    a4c1826
  • Diarization: explain the 3 audio/text sync toggles
    Adds a 'Sync audio and text' subsection between the timeline tips and the Rename speakers section, documenting the three toggles that sit under the Rename accordion header:
    - Follow playback in text (text cursor follows audio)
    - Auto-play on text click (clicking a segment seeks audio)
    - Highlight current segment (underline the segment being played)
    A 3-row table shows what each does and when to turn it on. Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    f8f122c
  • _Sidebar: move Diarization + LLM-Diarization under Getting started
    The dedicated 'Diarization / Diarisation' section had only two entries — folding them into 'Getting started / Premiers pas' makes the sidebar shorter and surfaces both pages directly to new users following the standard onboarding path.

    @rcspam rcspam committed May 3, 2026
    41fddec
  • Diarization: drop 'braille spinner' jargon, keep prose lighter
    The braille pattern reference (⠋⠙⠹⠸…) was technical noise for end users. Replaced with a parenthetical 'with a spinner turning in the tab title' that flows better. Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    94529f8
  • Diarization: explain the Threshold slider with onset/offset semantics
    Replaces the one-line aside with a dedicated box that maps the slider position to the actual model thresholds (onset/offset), gives a 3-row recommendation table (low / 50% / high) with real-world use cases, and reminds readers that re-running with a new value spawns a comparable tab (title contains the value). Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    df780c5
  • Diarization: correct prev/next segment glyphs to match the actual buttons
    The wiki had mistakenly used ◀ / ▶ (single triangles, the seek-to-start / end buttons in the player toolbar) for the prev/next segment buttons. The actual glyphs in the code (dictee-transcribe.py:2092, 2098) are ⏮ / ⏭ — double-arrow with vertical bar.

    @rcspam rcspam committed May 3, 2026
    cadc89f
  • Diarization: illustrate Export section + mention multi-tab export

    @rcspam rcspam committed May 3, 2026
    125a81c
  • Diarization: illustrate timeline + rename sections with screenshots
    Adds two PNG screenshots inline to break up the wall of text and make the visual cues (coloured bands, red triangle, Rename accordion layout) immediately recognisable:
    - diarization-timeline_1.3.png: zoom on the audio player timeline with per-speaker coloured bands and the red playback triangle.
    - diarization-rename_1.3.png: the Rename speakers accordion with the colour swatches, name fields and Apply / Reset buttons.
    EN and FR pages updated symmetrically.

    @rcspam rcspam committed May 3, 2026
    cba39c7
  • Diarization: replace static screenshot with animated demo, refresh prose
    - Static diarization-1_1.3.png replaced by an animated GIF (diarization-demo_1.3.gif, 38 frames @ 1 fps, 1569×1273) so readers see the full Transcribe → diarize → rename → playback flow at a glance.
    - Prose rewritten to be more user-friendly and reflect recent improvements:
      * mention the violet Transcribe button (new layout)
      * mention the per-tab braille spinner during work
      * mention the red triangle playback cursor and prev/next segment buttons on the timeline
      * dedicated 'Export' section listing the three formats and confirming rename propagation
      * dedicated 'LLM analysis' teaser pointing to the LLM page, with the available backends listed
      * remove the dated 'used to cap at ~10 min' wording — chunked pipeline is the current normal

    @rcspam rcspam committed May 3, 2026
    0cf7bba
  • LLM-Diarization: document Disable thinking + cancellation
    Two recently shipped features missing from the wiki:
    - "Disable thinking" checkbox (commit 480000a): for reasoning models (qwen3, deepseek-r1, gpt-oss). Visible for Ollama only, next to Context window. Strips the <think>...</think> preamble.
    - Cooperative cancellation (commit af1d9d3): closing the LLM result tab while the model is generating now aborts the live HTTP stream — saves cloud tokens, frees the GPU on Ollama.
    Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    ac4e10a
  • LLM-Diarization: keep VRAM tip generic (no Gemma 3 4B / 8 GB hardcoded)
    The "run diarization first, then LLM" tip was specific to Gemma 3 4B on an 8 GB GPU. The same advice applies to any local LLM on any size GPU shared with the transcription pipeline — generalised the wording so users with different setups don't think it doesn't apply to them. Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    7478cbb
  • LLM-Diarization: document context window per backend
    The 'Context window' field added to the provider editor in dictee is Ollama-only — for LM Studio / Jan / vLLM / cloud backends, the limit must be raised in the server's own UI/CLI or is fixed by the model. Adds a per-backend table after 'Edit / delete' explaining where to set Context Length so users with non-Ollama backends don't think the dictee field has any effect for them. Both EN and FR pages updated.

    @rcspam rcspam committed May 3, 2026
    b6fd42b
  • docs(wiki): split Diarization and CLI into separate sidebar sections

    @rcspam rcspam committed May 2, 2026
    d67f0c4