CLI Reference

rcspam edited this page May 15, 2026 · 8 revisions

Complete reference for every command shipped by dictee. All commands are installed under /usr/bin/ (system) or ~/.local/bin/ (user install).

Commands fall into three categories: shell entrypoints (dictee, dictee-setup, dictee-switch-backend, dictee-ptt), Python helpers (dictee-tray, dictee-postprocess, dictee-test-rules), and Rust binaries (transcribe, transcribe-daemon, transcribe-client, transcribe-diarize, transcribe-stream-diarize, transcribe-diarize-batch).

Shell entrypoints

dictee

Main entrypoint. Toggles dictation on/off, or runs one-shot modes via flags.

Synopsis:

dictee [--translate] [--meeting] [--cancel] [--setup] [--reset] [--help]

Flags:

| Flag | Description |
| --- | --- |
| (none) | Toggle dictation: press once to start, press again to stop |
| --translate | Transcribe + translate to $DICTEE_LANG_TARGET |
| --meeting | Transcribe with diarization (speaker labels) |
| --cancel | Cancel the current recording |
| --setup | Launch dictee-setup (equivalent to running setup wizard) |
| --reset | Clear state files in /dev/shm; use if dictee is stuck |
| --help | Print help |

Examples:

# Plain dictation
dictee

# Translate FR → EN
DICTEE_LANG_SOURCE=fr DICTEE_LANG_TARGET=en dictee --translate

# Meeting mode (diarization)
dictee --meeting

# Cancel ongoing recording
dictee --cancel
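
When dictee is driven from a script rather than a keyboard shortcut, the flags and variables above can be assembled programmatically. The following is a minimal sketch, not part of dictee: `dictee_invocation` is a hypothetical helper that only builds the argument list and environment from the flags and variables documented on this page, and leaves execution to the caller.

```python
import os


def dictee_invocation(mode="toggle", source=None, target=None, base_env=None):
    """Return (argv, env) for one dictee call.

    Hypothetical helper for illustration; only flags and variables
    documented on this page are used.
    """
    flags = {
        "toggle": [],                  # no flag: start/stop dictation
        "translate": ["--translate"],
        "meeting": ["--meeting"],
        "cancel": ["--cancel"],
        "reset": ["--reset"],
    }
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode!r}")

    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if source:
        env["DICTEE_LANG_SOURCE"] = source
    if target:
        env["DICTEE_LANG_TARGET"] = target
    return ["dictee", *flags[mode]], env


argv, env = dictee_invocation("translate", source="fr", target="en", base_env={})
print(argv)  # ['dictee', '--translate']
```

The pair can then be passed to `subprocess.run(argv, env=env, check=True)`; omit `base_env` so the call inherits the current environment.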

dictee-setup

PyQt6 GUI for full configuration. Launches the setup wizard on first run, or opens the main configuration panel on subsequent runs.

Synopsis:

dictee-setup [--wizard] [--tab=<name>] [--lang=<xx>]

Flags:

| Flag | Description |
| --- | --- |
| (none) | Open main config panel |
| --wizard | Force wizard mode (even if already configured) |
| --tab=<name> | Jump to tab: asr, translation, postprocess, rules, shortcuts, plasmoid, advanced |
| --lang=<xx> | Start with UI locale xx (overrides $LANG) |

dictee-switch-backend

Switch ASR or translation backend without restart.

Synopsis:

dictee-switch-backend <subcommand> [args]

Subcommands:

| Command | Description |
| --- | --- |
| status | Show current ASR + translation backends |
| list | List all available backends |
| asr <name> | Switch ASR to parakeet / canary / whisper / vosk |
| translate <name> | Switch translation to canary / libretranslate / ollama / google / bing |

Examples:

dictee-switch-backend status
# → ASR: parakeet (dictee.service, active)
# → Translate: google (trans)

dictee-switch-backend asr canary
dictee-switch-backend translate ollama
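
A script that switches backends may want to reject a misspelled name before calling the command. The sketch below assumes the backend names in the table above are the complete set; `switch_backend_argv` is a hypothetical helper, and `dictee-switch-backend list` remains the authoritative source on a given install.

```python
# Backend names copied from the subcommand table; assumed to be exhaustive.
ASR_BACKENDS = ("parakeet", "canary", "whisper", "vosk")
TRANSLATE_BACKENDS = ("canary", "libretranslate", "ollama", "google", "bing")


def switch_backend_argv(kind, name):
    """Build the argv for `dictee-switch-backend <kind> <name>`.

    Hypothetical helper: validates against the documented names and
    raises ValueError instead of letting the command fail later.
    """
    allowed = {"asr": ASR_BACKENDS, "translate": TRANSLATE_BACKENDS}
    if kind not in allowed:
        raise ValueError(f"kind must be 'asr' or 'translate', got {kind!r}")
    if name not in allowed[kind]:
        raise ValueError(f"{name!r} is not a documented {kind} backend")
    return ["dictee-switch-backend", kind, name]


print(switch_backend_argv("asr", "canary"))
# ['dictee-switch-backend', 'asr', 'canary']
```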

dictee-ptt

Push-to-talk daemon that listens to /dev/input/event* for keycodes and triggers dictation. Alternative to native KDE/GNOME shortcuts.

Synopsis:

dictee-ptt [--mode=toggle|hold|double-tap] [--key=<code>] [--key-translate=<code>] [--mod-translate=alt|ctrl|shift|super] [--double-tap-ms=<ms>]

See Keyboard-Shortcuts#dictee-ptt-daemon for detailed usage. Normally managed by the systemd user service dictee-ptt.service.


Python helpers

dictee-tray

System tray icon. See Tray-Icon.

Synopsis:

dictee-tray [--debug] [--no-autostart]

Normally started via the dictee-tray.service systemd user unit.


dictee-postprocess

Runs the 12-step post-processing pipeline on text read from stdin and writes the result to stdout.

Synopsis:

dictee-postprocess --lang <xx> [--llm-position=first|last|hybrid|off]
                   [--no-translate] [--rules=<path>] [--dict=<path>]

Examples:

echo "hello comma how are you" | dictee-postprocess --lang en
# → "hello, how are you"

# Disable LLM for a test
echo "rough asr output" | dictee-postprocess --lang en --llm-position=off

# Custom rules file
echo "text" | dictee-postprocess --lang en --rules ~/my-rules.conf

dictee-test-rules

Interactive test harness for regex rules. See Rules-and-Dictionary#testing-rules-live.

Synopsis:

dictee-test-rules [--loop] [--wav <file>] [--rules <path>] [--lang <xx>]

Examples:

# Interactive — type text, see output
dictee-test-rules --lang fr

# Continuous loop for rapid iteration
dictee-test-rules --loop

# Test a real audio file
dictee-test-rules --wav recording.wav

Rust binaries

transcribe

One-shot CLI transcription of a WAV file. Not used by the dictation flow (which uses the daemon) but handy for batch transcription.

Synopsis:

transcribe <audio.wav> [--model <path>] [--lang <xx>]

Daemon / client

transcribe-daemon holds the model in memory and listens on a Unix socket. transcribe-client connects to the daemon and sends audio/requests.

Daemon synopsis:

transcribe-daemon [--parakeet|--canary|--nemotron] [--model <path>] [--lang <xx>]

Client synopsis:

transcribe-client <audio.wav>            # transcribe a file
transcribe-client --record               # record from mic and transcribe

Socket path: $XDG_RUNTIME_DIR/transcribe.sock (defaults to /run/user/$UID/transcribe.sock).
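
The socket location rule can be reproduced in a health-check script. This is a sketch under two assumptions that this page does not confirm: that the daemon uses a stream-type Unix socket, and that a successful connect is a sufficient liveness signal. Both function names are hypothetical, and no request is sent because the wire protocol is not described here.

```python
import os
import socket
from pathlib import Path


def transcribe_socket_path(env=None, uid=None):
    """Resolve $XDG_RUNTIME_DIR/transcribe.sock with the documented fallback."""
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    runtime_dir = env.get("XDG_RUNTIME_DIR") or f"/run/user/{uid}"
    return Path(runtime_dir) / "transcribe.sock"


def daemon_is_listening(path, timeout=1.0):
    """Return True if something accepts connections on the socket.

    Assumption: SOCK_STREAM. Only connects; sends nothing.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(str(path))
        except OSError:  # missing file, refused connection, timeout
            return False
    return True


print(transcribe_socket_path({}, uid=1000))  # /run/user/1000/transcribe.sock
```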


transcribe-diarize

Runs Sortformer and Parakeet-TDT in one call: transcribes the audio and assigns speaker labels.

Synopsis:

transcribe-diarize <audio.wav> \
  --sortformer-model <path> \
  --parakeet-model <path> \
  [--lang <xx>] [--format=plain|rttm|json|srt]

Caveats: the single-shot binary has a hard cap of roughly 5 min 20 s of audio on every GPU, caused by a known bug in the Parakeet-TDT v3 model. For long files, use transcribe-diarize-batch (see below) or dictee-transcribe (the GUI), which has routed long files through the chunked binary automatically since v1.3.4. More context: Diarization#combined-transcription-mode.


transcribe-stream-diarize

Sortformer + Nemotron (EN only) in streaming mode. No duration cap.

Synopsis:

transcribe-stream-diarize <audio.wav> \
  --sortformer-model <path> \
  --nemotron-model <path> \
  [--format=plain|rttm|json|srt]

transcribe-diarize-batch

Chunked pipeline for long audio with diarization. Runs Sortformer once over the whole file, splits the audio into chunks, transcribes each chunk with Parakeet, then merges speaker labels into the transcript by timestamp. Each chunk stays below the ~5:20 min Parakeet limit, so audio of any length works (a 54-min keynote was diarized in 122 s on 8 GB).

Synopsis:

transcribe-diarize-batch <audio.wav> \
  --sortformer-model <path> \
  --parakeet-model <path> \
  --chunk-sec=600 --overlap-sec=10 \
  [--lang <xx>] [--format=plain|rttm|json|srt]

Since v1.3.4, dictee-transcribe (the GUI) wires this binary automatically for any file long enough to benefit from chunking — no manual invocation needed.
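
The chunking idea can be illustrated independently of the binary. The sketch below is not the actual implementation: `plan_chunks` and `speaker_for` are hypothetical names, the largest-overlap rule is one plausible way to merge labels by timestamp, and the 300 s / 10 s values are illustrative rather than dictee defaults.

```python
def plan_chunks(duration_sec, chunk_sec, overlap_sec):
    """Return (start, end) windows that cover [0, duration_sec].

    Consecutive windows share `overlap_sec` seconds so that words cut at a
    boundary appear whole in at least one chunk.
    """
    if duration_sec <= 0:
        return []
    if chunk_sec <= overlap_sec:
        raise ValueError("chunk_sec must be larger than overlap_sec")
    step = chunk_sec - overlap_sec
    windows, start = [], 0
    while True:
        end = min(start + chunk_sec, duration_sec)
        windows.append((start, end))
        if end >= duration_sec:
            return windows
        start += step


def speaker_for(start, end, segments):
    """Pick the speaker whose diarization segment overlaps [start, end] most.

    `segments` is a list of (seg_start, seg_end, label) on the global
    timeline. Returns None when nothing overlaps.
    """
    best_label, best_overlap = None, 0.0
    for seg_start, seg_end, label in segments:
        overlap = min(end, seg_end) - max(start, seg_start)
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label


print(plan_chunks(700, 300, 10))  # [(0, 300), (290, 590), (580, 700)]
print(speaker_for(4, 7, [(0, 5, "A"), (5, 12, "B")]))  # B
```

Because the diarization segments come from a single global pass, chunk-local timestamps need the chunk's start offset added before the lookup.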


Environment variables

Runtime configuration. Set via dictee.conf or exported in your shell:

Dictation mode

| Var | Values | Default | Description |
| --- | --- | --- | --- |
| DICTEE_ASR_BACKEND | parakeet, canary, whisper, vosk | parakeet | Active ASR backend |
| DICTEE_LANG_SOURCE | ISO 639-1 | $LANG | Source language |
| DICTEE_LANG_TARGET | ISO 639-1 | en | Translation target |
| DICTEE_TRANSLATE | 0 / 1 | 0 | Never persisted; ephemeral only |
| DICTEE_MEETING | 0 / 1 | 0 | Meeting mode (diarization) |

Post-processing

| Var | Values | Description |
| --- | --- | --- |
| DICTEE_PP_LLM | 0 / 1 | Disable LLM correction for this dictation |
| DICTEE_PP_NUMBERS | 0 / 1 | Toggle number conversion |
| DICTEE_LLM_MIN_WORDS | integer | Skip LLM for dictations shorter than N words |

Backends

| Var | Default | Description |
| --- | --- | --- |
| DICTEE_OLLAMA_HOST | http://localhost:11434 | Ollama endpoint |
| DICTEE_OLLAMA_MODEL | gemma3:4b | Ollama model tag |
| DICTEE_LIBRETRANSLATE_HOST | http://localhost:5000 | LibreTranslate endpoint |
| DICTEE_WHISPER_MODEL | large-v3-turbo | faster-whisper model |
| DICTEE_VOSK_MODEL | (auto) | Vosk model path |

Runtime

| Var | Default | Description |
| --- | --- | --- |
| DICTEE_FORCE_CPU | 0 | Force CPU even if GPU is available |
| DICTEE_INTRA_THREADS | (auto) | ONNX Runtime intra-op thread count (default: min(8, nproc)). Lower (2-4) to save battery on laptops or share CPU with other workloads. Do not exceed your CPU thread count. |
| DICTEE_PARAKEET_QUANT | (auto) | Parakeet model variant: int8 (~670 MB, ~34 % faster on CPU with AVX-VNNI) or fp32 (~2.4 GB, fastest on GPU). Both variants can coexist on disk; this var selects which one is loaded. Default: auto-detected from VRAM (int8 if < 4 GB or CPU-only, else fp32). |
| DICTEE_DEBUG | 0 | Verbose logging |
| DICTEE_STATE_DIR | /dev/shm | Override state file location |
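
The two "(auto)" defaults follow rules simple enough to write down. The sketch below restates them in Python as described in the table; it is not dictee's detection code. The function names are hypothetical, and treating an unset or empty variable as "auto" is an assumption.

```python
def intra_threads(env, nproc):
    """Thread count: explicit DICTEE_INTRA_THREADS, else min(8, nproc)."""
    raw = env.get("DICTEE_INTRA_THREADS")
    if raw:  # assumption: unset or empty means "auto"
        return int(raw)
    return min(8, nproc)


def parakeet_quant(env, vram_gb):
    """Model variant: explicit DICTEE_PARAKEET_QUANT, else chosen from VRAM.

    `vram_gb` is the GPU memory in GB, or None on a CPU-only machine.
    """
    choice = env.get("DICTEE_PARAKEET_QUANT")
    if choice in ("int8", "fp32"):
        return choice
    # Documented rule: int8 if < 4 GB or CPU-only, else fp32.
    if vram_gb is None or vram_gb < 4:
        return "int8"
    return "fp32"


print(intra_threads({}, nproc=16))      # 8
print(parakeet_quant({}, vram_gb=8))    # fp32
print(parakeet_quant({}, vram_gb=None)) # int8
```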

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Generic error |
| 2 | Invalid arguments |
| 3 | Daemon not running / connection refused |
| 4 | Model file missing |
| 5 | Audio capture failed |
| 6 | Transcription failed |
| 7 | Translation failed |
| 8 | Post-processing error |
| 126 | Permission denied (e.g., /dev/input without input group) |
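
Wrapper scripts can turn these codes into readable messages. The enum below mirrors the table; the member names and the `describe_exit` helper are hypothetical, and whether every command uses the full set of codes is not stated on this page.

```python
from enum import IntEnum


class DicteeExit(IntEnum):
    """Exit codes from the table above; member names are illustrative."""
    SUCCESS = 0
    GENERIC_ERROR = 1
    INVALID_ARGUMENTS = 2
    DAEMON_NOT_RUNNING = 3
    MODEL_FILE_MISSING = 4
    AUDIO_CAPTURE_FAILED = 5
    TRANSCRIPTION_FAILED = 6
    TRANSLATION_FAILED = 7
    POSTPROCESSING_ERROR = 8
    PERMISSION_DENIED = 126


def describe_exit(code):
    """Map a return code to a short human-readable label."""
    try:
        return DicteeExit(code).name.lower().replace("_", " ")
    except ValueError:  # code not in the documented table
        return f"unknown exit code {code}"


print(describe_exit(3))   # daemon not running
print(describe_exit(42))  # unknown exit code 42
```

A typical use is `describe_exit(subprocess.run(["dictee"]).returncode)` in a launcher that shows a notification on failure.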

State files

Runtime state lives in /dev/shm (tmpfs, cleared on reboot). Multi-user safe via UID suffix.

| File | Purpose |
| --- | --- |
| /dev/shm/.dictee_state_<UID> | Current status: idle / recording / transcribing / offline |
| /dev/shm/.dictee_state_<UID>.lock | flock coordination for concurrent writes |
| /dev/shm/.dictee_toggles_<UID> | LLM / Short / Meeting toggle state |
| /dev/shm/.dictee_continuation_<UID> | Last-word buffer for continuation feature |
| $XDG_RUNTIME_DIR/transcribe.sock | Daemon Unix socket |
| ~/.config/dictee.conf | Persistent user configuration |
| ~/.config/dictee/rules.conf | User regex rules (merged on top of system defaults) |
| ~/.config/dictee/dictionary.conf | User dictionary |
| ~/.config/dictee/short_text_keepcaps.conf | User keepcaps exceptions |
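
A status-bar widget or script can poll the state file directly. The sketch below assumes the file holds the status word as plain text, which this page implies but does not specify; `state_file` and `read_state` are hypothetical helpers, and the read is done without taking the flock, so a value caught mid-write is treated as unknown.

```python
import os
from pathlib import Path

# Status values listed in the table above.
STATES = ("idle", "recording", "transcribing", "offline")


def state_file(uid=None, env=None):
    """Path of the per-user state file, honouring DICTEE_STATE_DIR."""
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    base = Path(env.get("DICTEE_STATE_DIR") or "/dev/shm")
    return base / f".dictee_state_{uid}"


def read_state(path):
    """Return the current status, or None if missing or unrecognised.

    Assumption: the file contains just the status word.
    """
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return None
    return text if text in STATES else None


print(state_file(uid=1000, env={}))  # /dev/shm/.dictee_state_1000
```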
