CLI Reference
🌐 Language: English | Français
Complete reference for every command shipped by dictee. All commands are installed under /usr/bin/ (system) or ~/.local/bin/ (user install).
Three categories: shell entrypoints (dictee, dictee-setup, dictee-switch-backend), Python helpers (dictee-tray, dictee-postprocess, dictee-test-rules), and Rust binaries (transcribe, transcribe-daemon, transcribe-client, transcribe-diarize, transcribe-stream-diarize, transcribe-diarize-batch).
dictee is the main entrypoint. It toggles dictation on/off, or runs one-shot modes via flags.
Synopsis:
dictee [--translate] [--meeting] [--cancel] [--setup] [--reset] [--help]
Flags:
| Flag | Description |
|---|---|
| (none) | Toggle dictation: press once to start, press again to stop |
| `--translate` | Transcribe + translate to `$DICTEE_LANG_TARGET` |
| `--meeting` | Transcribe with diarization (speaker labels) |
| `--cancel` | Cancel the current recording |
| `--setup` | Launch dictee-setup (equivalent to running the setup wizard) |
| `--reset` | Clear state files in `/dev/shm`; use if dictee is stuck |
| `--help` | Print help |

Examples:
# Plain dictation
dictee
# Translate FR → EN
DICTEE_LANG_SOURCE=fr DICTEE_LANG_TARGET=en dictee --translate
# Meeting mode (diarization)
dictee --meeting
# Cancel ongoing recording
dictee --cancel
dictee-setup is the PyQt6 GUI for full configuration. It launches the setup wizard on first run and opens the main configuration panel on subsequent runs.
Synopsis:
dictee-setup [--wizard] [--tab=<name>] [--lang=<xx>]
Flags:
| Flag | Description |
|---|---|
| (none) | Open main config panel |
| `--wizard` | Force wizard mode (even if already configured) |
| `--tab=<name>` | Jump to tab: `asr`, `translation`, `postprocess`, `rules`, `shortcuts`, `plasmoid`, `advanced` |
| `--lang=<xx>` | Start with UI locale `xx` (overrides `$LANG`) |

dictee-switch-backend switches the ASR or translation backend without a restart.
Synopsis:
dictee-switch-backend <subcommand> [args]
Subcommands:
| Command | Description |
|---|---|
| `status` | Show current ASR + translation backends |
| `list` | List all available backends |
| `asr <name>` | Switch ASR to `parakeet` / `canary` / `whisper` / `vosk` |
| `translate <name>` | Switch translation to `canary` / `libretranslate` / `ollama` / `google` / `bing` |

Examples:
dictee-switch-backend status
# → ASR: parakeet (dictee.service, active)
# → Translate: google (trans)
dictee-switch-backend asr canary
dictee-switch-backend translate ollama
dictee-ptt is a push-to-talk daemon that listens on /dev/input/event* for keycodes and triggers dictation. It is an alternative to native KDE/GNOME shortcuts.
Synopsis:
dictee-ptt [--mode=toggle|hold|double-tap] [--key=<code>] [--key-translate=<code>] [--mod-translate=alt|ctrl|shift|super] [--double-tap-ms=<ms>]
See Keyboard-Shortcuts#dictee-ptt-daemon for detailed usage. Normally managed by the systemd user service dictee-ptt.service.
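Because the daemon reads /dev/input/event*, and the exit-code table on this page mentions the `input` group for permission errors, a quick pre-flight check can save a debugging round. The following is a minimal sketch, not part of dictee; `in_group` is a hypothetical helper name.

```shell
# in_group <name>: succeed if the current user belongs to group <name>.
# Hypothetical helper, not shipped with dictee.
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

if in_group input; then
    echo "user is in the input group"
else
    echo "user is NOT in the input group; reading /dev/input may be denied"
fi
```

Group membership changes usually take effect only after logging in again, so a fresh session may be needed before the check passes.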
dictee-tray provides the system tray icon. See Tray-Icon.
Synopsis:
dictee-tray [--debug] [--no-autostart]
Normally started via the dictee-tray.service systemd user unit.
dictee-postprocess runs the 12-step post-processing pipeline on stdin or a given text and writes the result to stdout.
Synopsis:
dictee-postprocess --lang <xx> [--llm-position=first|last|hybrid|off]
[--no-translate] [--rules=<path>] [--dict=<path>]
Examples:
echo "hello comma how are you" | dictee-postprocess --lang en
# → "hello, how are you"
# Disable LLM for a test
echo "rough asr output" | dictee-postprocess --lang en --llm-position=off
# Custom rules file
echo "text" | dictee-postprocess --lang en --rules ~/my-rules.conf
dictee-test-rules is an interactive test harness for regex rules. See Rules-and-Dictionary#testing-rules-live.
Synopsis:
dictee-test-rules [--loop] [--wav <file>] [--rules <path>] [--lang <xx>]
Examples:
# Interactive — type text, see output
dictee-test-rules --lang fr
# Continuous loop for rapid iteration
dictee-test-rules --loop
# Test a real audio file
dictee-test-rules --wav recording.wav
transcribe performs one-shot CLI transcription of a WAV file. The dictation flow does not use it (it goes through the daemon), but it is handy for batch transcription.
Synopsis:
transcribe <audio.wav> [--model <path>] [--lang <xx>]
transcribe-daemon holds the model in memory and listens on a Unix socket. transcribe-client connects to the daemon and sends audio/requests.
Daemon synopsis:
transcribe-daemon [--parakeet|--canary|--nemotron] [--model <path>] [--lang <xx>]
Client synopsis:
transcribe-client <audio.wav> # transcribe a file
transcribe-client --record # record from mic and transcribe
Socket path: $XDG_RUNTIME_DIR/transcribe.sock (defaults to /run/user/$UID/transcribe.sock).
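The socket path rule above can be sketched in a few lines of shell, for example to check whether the daemon's socket exists before calling transcribe-client. This is an illustration only; the function names are hypothetical, and the presence of a socket file does not prove the daemon is healthy.

```shell
# Resolve the daemon socket path: $XDG_RUNTIME_DIR if set,
# otherwise /run/user/<uid>. Hypothetical helper names.
transcribe_socket_path() {
    printf '%s/transcribe.sock\n' "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
}

# Succeed if a Unix socket exists at that path.
daemon_socket_present() {
    [ -S "$(transcribe_socket_path)" ]
}

if daemon_socket_present; then
    echo "daemon socket found: $(transcribe_socket_path)"
else
    echo "no daemon socket at $(transcribe_socket_path)"
fi
```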
transcribe-diarize runs Sortformer + Parakeet-TDT in one call: it transcribes the audio and assigns speaker labels.
Synopsis:
transcribe-diarize <audio.wav> \
--sortformer-model <path> \
--parakeet-model <path> \
[--lang <xx>] [--format=plain|rttm|json|srt]
Caveats: the single-shot binary hits a ~5:20 min hard cap on every GPU due to a known bug in the Parakeet-TDT v3 model. For long files, use transcribe-diarize-batch (see below) or dictee-transcribe (the GUI), which auto-routes through the chunked binary since v1.3.4. More context: Diarization#combined-transcription-mode.
transcribe-stream-diarize runs Sortformer + Nemotron (EN only) in streaming mode, with no duration cap.
Synopsis:
transcribe-stream-diarize <audio.wav> \
--sortformer-model <path> \
--nemotron-model <path> \
[--format=plain|rttm|json|srt]
transcribe-diarize-batch is the chunked pipeline for long audio with diarization. It runs Sortformer once globally, splits the audio, transcribes each chunk with Parakeet, then merges speaker labels via timestamps. Each chunk stays under the ~5:20 min Parakeet limit, so audio of any length works (a 54-min keynote was diarized in 122 s on 8 GB).
Synopsis:
transcribe-diarize-batch <audio.wav> \
--sortformer-model <path> \
--parakeet-model <path> \
--chunk-sec=600 --overlap-sec=10 \
[--lang <xx>] [--format=plain|rttm|json|srt]
Since v1.3.4, dictee-transcribe (the GUI) wires this binary automatically for any file long enough to benefit from chunking — no manual invocation needed.
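To get a feel for how `--chunk-sec` and `--overlap-sec` interact, the number of chunks can be estimated with integer arithmetic. This sketch assumes that consecutive chunks start every `chunk - overlap` seconds, which is a common scheme but is not confirmed by this page; `chunk_count` is a hypothetical helper, not a dictee command.

```shell
# chunk_count <duration_sec> <chunk_sec> <overlap_sec>
# Assumption: each chunk starts (chunk - overlap) seconds after the
# previous one. Requires overlap < chunk.
chunk_count() {
    duration=$1 chunk=$2 overlap=$3
    if [ "$duration" -le "$chunk" ]; then
        echo 1
        return
    fi
    stride=$((chunk - overlap))
    # 1 for the first chunk, plus ceil((duration - chunk) / stride) more.
    echo $(( 1 + (duration - chunk + stride - 1) / stride ))
}

# A 54-minute file (3240 s) with the values from the synopsis
chunk_count 3240 600 10   # prints 6
```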
Environment variables control runtime configuration. Set them in dictee.conf or export them in your shell:
| Var | Values | Default | Description |
|---|---|---|---|
| `DICTEE_ASR_BACKEND` | `parakeet`, `canary`, `whisper`, `vosk` | `parakeet` | Active ASR backend |
| `DICTEE_LANG_SOURCE` | ISO 639-1 | `$LANG` | Source language |
| `DICTEE_LANG_TARGET` | ISO 639-1 | `en` | Translation target |
| `DICTEE_TRANSLATE` | `0` / `1` | `0` | Never persisted, ephemeral only |
| `DICTEE_MEETING` | `0` / `1` | `0` | Meeting mode (diarization) |

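The defaults in the table above can be mirrored in a wrapper script with ordinary parameter expansion. This is a minimal sketch under assumptions: the function names are hypothetical, and the way `$LANG` is reduced to a two-letter code (cutting at the first `_` or `.`, falling back to `en` when `$LANG` is unset) is a guess rather than dictee's documented behavior.

```shell
# Effective values with the documented defaults. Hypothetical helpers.
effective_asr_backend() { echo "${DICTEE_ASR_BACKEND:-parakeet}"; }
effective_lang_target() { echo "${DICTEE_LANG_TARGET:-en}"; }

# Assumption: a locale such as fr_FR.UTF-8 is reduced to "fr".
effective_lang_source() {
    if [ -n "${DICTEE_LANG_SOURCE:-}" ]; then
        echo "$DICTEE_LANG_SOURCE"
    else
        lang=${LANG:-en}
        echo "${lang%%[_.]*}"
    fi
}

echo "ASR=$(effective_asr_backend) source=$(effective_lang_source) target=$(effective_lang_target)"
```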
| Var | Values | Description |
|---|---|---|
| `DICTEE_PP_LLM` | `0` / `1` | Disable LLM correction for this dictation |
| `DICTEE_PP_NUMBERS` | `0` / `1` | Toggle number conversion |
| `DICTEE_LLM_MIN_WORDS` | integer | Skip LLM for dictations shorter than N words |

| Var | Default | Description |
|---|---|---|
| `DICTEE_OLLAMA_HOST` | `http://localhost:11434` | Ollama endpoint |
| `DICTEE_OLLAMA_MODEL` | `gemma3:4b` | Ollama model tag |
| `DICTEE_LIBRETRANSLATE_HOST` | `http://localhost:5000` | LibreTranslate endpoint |
| `DICTEE_WHISPER_MODEL` | `large-v3-turbo` | faster-whisper model |
| `DICTEE_VOSK_MODEL` | (auto) | Vosk model path |

| Var | Default | Description |
|---|---|---|
| `DICTEE_FORCE_CPU` | `0` | Force CPU even if GPU is available |
| `DICTEE_INTRA_THREADS` | (auto) | ONNX Runtime intra-op thread count (default: `min(8, nproc)`). Lower (2-4) to save battery on laptops or share CPU with other workloads. Do not exceed your CPU thread count. |
| `DICTEE_PARAKEET_QUANT` | (auto) | Parakeet model variant: `int8` (~670 MB, ~34 % faster on CPU with AVX-VNNI) or `fp32` (~2.4 GB, fastest on GPU). Both variants can coexist on disk; this var selects which one is loaded. Default: auto-detected from VRAM (`int8` if < 4 GB or CPU-only, else `fp32`). |
| `DICTEE_DEBUG` | `0` | Verbose logging |
| `DICTEE_STATE_DIR` | `/dev/shm` | Override state file location |

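The two auto-detected defaults described above (`min(8, nproc)` threads, and `int8` below 4 GB of VRAM or on CPU-only systems) can be expressed as follows. This is a sketch of the documented rules, not dictee's source; the function names are hypothetical, and treating "4 GB" as 4096 MiB is an assumption.

```shell
# Thread count: DICTEE_INTRA_THREADS if set, else min(8, nproc).
effective_intra_threads() {
    if [ -n "${DICTEE_INTRA_THREADS:-}" ]; then
        echo "$DICTEE_INTRA_THREADS"
        return
    fi
    cores=$(nproc)
    if [ "$cores" -lt 8 ]; then echo "$cores"; else echo 8; fi
}

# effective_parakeet_quant <vram_mib>   (pass 0 for CPU-only)
# DICTEE_PARAKEET_QUANT wins if set; otherwise int8 below 4096 MiB
# (assumed threshold), fp32 at or above it.
effective_parakeet_quant() {
    vram_mib=${1:-0}
    if [ -n "${DICTEE_PARAKEET_QUANT:-}" ]; then
        echo "$DICTEE_PARAKEET_QUANT"
    elif [ "$vram_mib" -lt 4096 ]; then
        echo int8
    else
        echo fp32
    fi
}

echo "threads=$(effective_intra_threads) quant(8 GiB)=$(effective_parakeet_quant 8192)"
```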
| Code | Meaning |
|---|---|
| `0` | Success |
| `1` | Generic error |
| `2` | Invalid arguments |
| `3` | Daemon not running / connection refused |
| `4` | Model file missing |
| `5` | Audio capture failed |
| `6` | Transcription failed |
| `7` | Translation failed |
| `8` | Post-processing error |
| `126` | Permission denied (e.g., `/dev/input` without `input` group) |

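In scripts, these codes can be turned back into readable messages with a `case` statement. The helper below is a hypothetical convenience, not part of dictee; the messages are copied from the table above.

```shell
# dictee_exit_reason <code>: print the meaning of a documented exit code.
dictee_exit_reason() {
    case "$1" in
        0)   echo "Success" ;;
        1)   echo "Generic error" ;;
        2)   echo "Invalid arguments" ;;
        3)   echo "Daemon not running / connection refused" ;;
        4)   echo "Model file missing" ;;
        5)   echo "Audio capture failed" ;;
        6)   echo "Transcription failed" ;;
        7)   echo "Translation failed" ;;
        8)   echo "Post-processing error" ;;
        126) echo "Permission denied" ;;
        *)   echo "Unknown exit code: $1" ;;
    esac
}

# Typical use after running a command (shown as a comment so the
# sketch runs without dictee installed):
#   transcribe-client recording.wav; dictee_exit_reason "$?"
dictee_exit_reason 3
```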
Runtime state lives in /dev/shm (tmpfs, cleared on reboot). Multi-user safe via UID suffix.
| File | Purpose |
|---|---|
| `/dev/shm/.dictee_state_<UID>` | Current status: `idle` / `recording` / `transcribing` / `offline` |
| `/dev/shm/.dictee_state_<UID>.lock` | flock coordination for concurrent writes |
| `/dev/shm/.dictee_toggles_<UID>` | LLM / Short / Meeting toggle state |
| `/dev/shm/.dictee_continuation_<UID>` | Last-word buffer for continuation feature |
| `$XDG_RUNTIME_DIR/transcribe.sock` | Daemon Unix socket |
| `~/.config/dictee.conf` | Persistent user configuration |
| `~/.config/dictee/rules.conf` | User regex rules (merged on top of system defaults) |
| `~/.config/dictee/dictionary.conf` | User dictionary |
| `~/.config/dictee/short_text_keepcaps.conf` | User keepcaps exceptions |

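A status bar or script can read the current state straight from the state file. The sketch below assumes that the file holds the status word on its first line and that `DICTEE_STATE_DIR` only replaces the `/dev/shm` prefix; neither detail is spelled out on this page, and the function names are hypothetical.

```shell
# Path of the per-user state file (UID suffix as in the table above).
dictee_state_file() {
    printf '%s/.dictee_state_%s\n' "${DICTEE_STATE_DIR:-/dev/shm}" "$(id -u)"
}

# Print the current status, or "unknown" if the file is not readable.
# Assumption: the status word is the first line of the file.
dictee_status() {
    state_file=$(dictee_state_file)
    if [ -r "$state_file" ]; then
        head -n 1 "$state_file"
    else
        echo unknown
    fi
}

echo "dictee status: $(dictee_status)"
```

Reading is safe without the lock for a quick display; anything that writes to the file should coordinate through the `.lock` file listed in the table.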
- Troubleshooting: diagnose CLI errors
- Developer-Guide: binary source architecture
- Keyboard-Shortcuts: dictee-ptt in depth
- Post-Processing-Overview: what dictee-postprocess does