WayWhisper

Zero-Cloud Voice-to-Text for Wayland

WayWhisper records your voice with a keyboard shortcut, transcribes it locally using whisper.cpp, and types the result into your active window. No cloud, no network round-trips, no privacy trade-offs.

Works on any modern Wayland desktop: COSMIC, Hyprland, GNOME, Sway, and more.

Features

Universal Wayland support — uses wtype for direct typing, with wl-clipboard as fallback for long text
NVIDIA CUDA acceleration — GPU-accelerated transcription for near-instant results
Multi-language — pass any Whisper language code as an argument, or set a default at install time
Translation mode — speak in one language, receive output in English (e.g. waywhisper de:en)
Smart notifications — animated spinner while recording; notification closes instantly via DBus when typing starts
PipeWire concurrent safety — records cleanly alongside other active audio streams (Zoom, Discord, etc.)

Prerequisites

Debian, Ubuntu, or Pop!_OS (or any apt-based distro). An NVIDIA GPU is optional but recommended for fast transcription.

Interactive Installer

The included setup.sh handles everything from scratch:

git clone https://github.com/dcwigk/WayWhisper.git
cd WayWhisper
bash setup.sh

It will:

1. Install runtime dependencies via apt (wtype, wl-clipboard, libnotify-bin, libglib2.0-bin, pipewire-audio-client-libraries, build tools)
2. Detect your GPU: if nvidia-smi is found, CUDA is suggested automatically
3. Ask your hardware preference: [1] NVIDIA GPU (CUDA) or [2] CPU only
4. Ask your default language: enter any Whisper language code (e.g. en, de, fr); defaults to en
5. Clone and build whisper.cpp into ~/.local/opt/whisper.cpp (compilation takes ~3–5 minutes)
6. Download the model: medium for CUDA, small for CPU (best fit for each)
7. Deploy the waywhisper script to ~/.local/bin/waywhisper with your chosen language and model baked in
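
The detection and model-selection steps above could look roughly like the sketch below. It is not copied from setup.sh: the function names are hypothetical, and the only facts taken from this README are that nvidia-smi signals an NVIDIA GPU, that medium/small are paired with CUDA/CPU, and that CUDA builds use -DGGML_CUDA=ON.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; setup.sh may be structured differently.

# Print "cuda" if nvidia-smi is on PATH, otherwise "cpu".
detect_backend() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo cuda
    else
        echo cpu
    fi
}

# Map a backend to the model size this README pairs it with.
model_for_backend() {
    case "$1" in
        cuda) echo medium ;;
        cpu)  echo small ;;
        *)    echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Map a backend to the cmake configure flag used in Manual Setup.
cmake_flags_for_backend() {
    case "$1" in
        cuda) echo "-DGGML_CUDA=ON" ;;
        *)    echo "" ;;
    esac
}
```

For example, `model_for_backend "$(detect_backend)"` prints the model size that would be suggested on the current machine.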

Note

The installer is idempotent. Running it again skips the clone step if whisper.cpp is already present, then rebuilds from scratch so the binary matches the CUDA/CPU configuration you selected.

Manual Setup

If you prefer to set things up yourself:

1. Install dependencies

sudo apt update
sudo apt install -y cmake build-essential git \
  wtype wl-clipboard libnotify-bin libglib2.0-bin \
  pipewire-audio-client-libraries
# NVIDIA GPU only:
sudo apt install -y nvidia-cuda-toolkit

2. Build whisper.cpp

mkdir -p ~/.local/opt && cd ~/.local/opt
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp

Pick one — CUDA (NVIDIA GPU) or CPU:

# Option A: CUDA
cmake -B build -DGGML_CUDA=ON
bash ./models/download-ggml-model.sh medium

# Option B: CPU only
cmake -B build
bash ./models/download-ggml-model.sh small

Then build (takes ~3–5 minutes):

cmake --build build -j"$(nproc)" --config Release
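
Before moving on, it can help to confirm that the build produced a usable binary. The check below is a small sketch, assuming the default whisper.cpp layout where the CLI lands in build/bin/whisper-cli; the function name check_whisper_build is hypothetical.

```bash
# Return 0 if the whisper.cpp checkout at $1 contains an executable CLI binary.
check_whisper_build() {
    local root="$1"
    [ -x "$root/build/bin/whisper-cli" ]
}

# Manual smoke test (needs the model downloaded above; use ggml-medium.bin for CUDA):
#   cd ~/.local/opt/whisper.cpp
#   check_whisper_build "$PWD" && \
#     ./build/bin/whisper-cli -m models/ggml-small.bin -f samples/jfk.wav
```

The smoke test transcribes the short sample recording that ships in the whisper.cpp repository, so a transcript on stdout suggests that both the binary and the model load correctly.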

3. Clone this repository and install the script

git clone https://github.com/dcwigk/WayWhisper.git
mkdir -p ~/.local/bin
cp WayWhisper/waywhisper ~/.local/bin/waywhisper
chmod +x ~/.local/bin/waywhisper

Important

If you chose CPU only (Option B), update the model path in the installed script:

sed -i 's|ggml-medium\.bin|ggml-small.bin|' ~/.local/bin/waywhisper

Make sure ~/.local/bin is in your PATH.
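
One way to check this, and to add the directory only when it is missing, is sketched below. The helper names are hypothetical, and ~/.profile is an assumption; use whichever startup file your shell actually reads.

```bash
# Return 0 if directory $1 is one of the colon-separated entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Append ~/.local/bin to PATH for future shells, but only if it is missing.
# $1 = startup file to edit (defaults to ~/.profile, which is an assumption).
ensure_local_bin_on_path() {
    local rc="${1:-$HOME/.profile}"
    if ! path_contains "$HOME/.local/bin"; then
        # Single quotes keep $HOME and $PATH literal so they expand at login time.
        printf '\nexport PATH="$HOME/.local/bin:$PATH"\n' >> "$rc"
    fi
}
```

Open a new terminal (or log out and back in) after running ensure_local_bin_on_path so the change takes effect.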

4. Create the config file

mkdir -p ~/.config/waywhisper
cat > ~/.config/waywhisper/config <<EOF
# Full path to the whisper model
WHISPER_MODEL="$HOME/.local/opt/whisper.cpp/models/ggml-small.bin"

# Character limit before falling back to clipboard paste (default: 800)
# WTYPE_CHAR_LIMIT=800
EOF

Important

Use ggml-medium.bin here if you built with CUDA, ggml-small.bin for CPU.
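
A quick way to catch a mismatched model name is to load the config and confirm the file exists. This is a sketch that assumes the config is plain shell assignments, as the example above suggests; check_config_model is a hypothetical helper, not part of WayWhisper.

```bash
# Load a WayWhisper-style config and report whether WHISPER_MODEL points at a file.
# $1 = config path (defaults to the location used in this README).
check_config_model() {
    local config="${1:-$HOME/.config/waywhisper/config}"
    (
        # The subshell keeps sourced variables out of the caller's environment.
        WHISPER_MODEL=""
        [ -r "$config" ] || { echo "config not found: $config" >&2; exit 2; }
        . "$config"
        if [ -f "$WHISPER_MODEL" ]; then
            echo "ok: $WHISPER_MODEL"
        else
            echo "missing model: ${WHISPER_MODEL:-<unset>}" >&2
            exit 1
        fi
    )
}
```

Running check_config_model with no arguments inspects ~/.config/waywhisper/config; a non-zero exit status means the config is absent or the model path needs correcting.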

Usage

1. Press your shortcut: a spinner notification appears and recording begins
2. Speak
3. Press the shortcut again: transcription runs and the text is typed into your active window

If no speech is detected, a brief "No speech detected" notification is shown instead.
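
How the script decides that nothing was said is not documented here. One plausible approach, sketched below as an assumption, is to strip bracketed markers (whisper.cpp can emit tokens such as [BLANK_AUDIO] for silence) and whitespace, then test whether anything remains. The function name is hypothetical.

```bash
# Return 0 if the transcript has no speech content once bracketed markers
# such as [BLANK_AUDIO] and all whitespace are removed.
is_blank_transcript() {
    local cleaned
    cleaned="$(printf '%s' "$1" | sed -e 's/\[[^]]*]//g' | tr -d '[:space:]')"
    [ -z "$cleaned" ]
}

# Usage sketch (notify-send comes from libnotify-bin):
#   if is_blank_transcript "$text"; then
#       notify-send "WayWhisper" "No speech detected"
#   fi
```

Note that this treats any bracketed annotation as non-speech, which is a deliberate simplification for the example.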

waywhisper de              # transcribe in German
waywhisper de:en           # speak German, receive English text (translation mode)
waywhisper --profile deen  # same, using a named profile from config
waywhisper cancel          # abort recording without transcribing
waywhisper --help          # show all options (lists defined profiles)

Note

Translation mode can only produce English output. This is a constraint of the Whisper models themselves, which whisper.cpp inherits.
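
To make the LANG and LANG:en forms concrete, here is a minimal parsing sketch. The function name and its output format are assumptions for illustration; the real script may parse its argument differently. On the whisper-cli side, the two relevant flags are --language and --translate.

```bash
# Split "LANG" or "LANG:en" into a source language and a task.
# Prints two words: the language code, then "translate" or "transcribe".
parse_lang_arg() {
    local arg="$1"
    case "$arg" in
        ""|:*) echo "missing source language in '$arg'" >&2; return 1 ;;
        *:en)  echo "${arg%:en} translate" ;;
        *:*)   echo "unsupported target in '$arg' (only :en works)" >&2; return 1 ;;
        *)     echo "$arg transcribe" ;;
    esac
}
```

With this sketch, `parse_lang_arg de:en` prints `de translate`, while `parse_lang_arg de:fr` fails because English is the only translation target.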

Config File

Create ~/.config/waywhisper/config to set persistent defaults:

# ~/.config/waywhisper/config

WTYPE_CHAR_LIMIT=800                       # character limit before falling back to clipboard
WHISPER_MODEL="/path/to/custom/model.bin"  # override the default model
WHISPER_PROMPT="Dr. Smith, JSON, API"      # vocabulary/style hint passed to whisper-cli

# Named profiles — invoke with: waywhisper --profile <name>
PROFILE_deen="de:en"                       # speak German, receive English
PROFILE_code="en"                          # English, can pair with WHISPER_PROMPT

WHISPER_PROMPT is passed verbatim to whisper.cpp's --prompt flag before each transcription. Use it to nudge the model toward domain-specific spelling, punctuation style, or terminology. It does not filter or rewrite output — it only biases the decoder. Keep prompts short; a handful of key terms or a one-sentence style hint is sufficient.
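
As an illustration of how an optional prompt can be forwarded without breaking on spaces or commas, the sketch below assembles whisper-cli arguments in a bash array. The function and the WHISPER_ARGS variable are hypothetical; --model, --file, --language, and --prompt are existing whisper-cli flags.

```bash
# Build whisper-cli arguments into the global array WHISPER_ARGS.
# $1 = model path, $2 = audio file, $3 = language code, $4 = optional prompt.
build_whisper_args() {
    WHISPER_ARGS=(--model "$1" --file "$2" --language "$3")
    # Only add --prompt when a non-empty prompt was supplied.
    if [ -n "${4-}" ]; then
        WHISPER_ARGS+=(--prompt "$4")
    fi
}

# Usage sketch:
#   build_whisper_args "$WHISPER_MODEL" recording.wav en "$WHISPER_PROMPT"
#   whisper-cli "${WHISPER_ARGS[@]}"
```

Using an array keeps a prompt like "Dr. Smith, JSON, API" as a single argument, which a plain string concatenation would not guarantee.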

Profiles map a short name to any lang argument the script already accepts (LANG or LANG:en). They exist solely to give keyboard shortcut bindings a stable, readable command — waywhisper --profile deen is equivalent to waywhisper de:en. Running waywhisper --help lists all profiles currently defined in your config.
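
Because profiles are ordinary PROFILE_<name> variables, resolving one can be done with bash indirect expansion. The sketch below shows the idea under that assumption; resolve_profile and list_profiles are hypothetical names, not functions taken from the script.

```bash
# Resolve a profile name to its lang argument by reading PROFILE_<name>.
resolve_profile() {
    local name="$1" var
    # Reject names that could not be part of a shell variable name.
    case "$name" in
        ""|*[!A-Za-z0-9_]*) echo "invalid profile name: '$name'" >&2; return 1 ;;
    esac
    var="PROFILE_$name"
    if [ -z "${!var-}" ]; then
        echo "profile not defined: $name" >&2
        return 1
    fi
    printf '%s\n' "${!var}"
}

# List defined profiles as "name=value" lines, similar in spirit to --help.
list_profiles() {
    local var
    for var in "${!PROFILE_@}"; do
        printf '%s=%s\n' "${var#PROFILE_}" "${!var}"
    done
}
```

After sourcing a config that sets PROFILE_deen="de:en", `resolve_profile deen` prints `de:en`, which can then be handled exactly like a lang argument given on the command line.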

Keyboard Shortcuts

Map waywhisper (or waywhisper <lang>) to a shortcut in your desktop environment. Examples:

COSMIC

Settings → Keyboard → Custom Shortcuts → Add Shortcut

| Action | Command | Shortcut |
| --- | --- | --- |
| Voice to text (English) | `waywhisper en` | `Super+V` |
| Voice to text (German) | `waywhisper de` | `Super+Shift+V` |

Hyprland

Add to ~/.config/hypr/hyprland.conf:

bind = SUPER, V, exec, waywhisper en
bind = SUPER SHIFT, V, exec, waywhisper de

GNOME

Settings → Keyboard → View and Customize Shortcuts → Custom Shortcuts

| Action | Command | Shortcut |
| --- | --- | --- |
| Voice to text (English) | `waywhisper en` | `Super+V` |
| Voice to text (German) | `waywhisper de` | `Super+Shift+V` |

Tip

You can define as many language shortcuts as you need. WayWhisper accepts any language code supported by Whisper.

How It Works

WayWhisper is a toggle script. The first invocation starts pw-record (PipeWire) in the background and writes a state file. The second invocation signals pw-record to stop, feeds the audio to whisper-cli, and types the text directly with wtype (or falls back to clipboard paste via wl-copy for longer transcriptions). The notification is recycled rather than re-created, and closed instantly over DBus once typing begins.
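
The toggle pattern described above can be sketched as follows. This is an illustrative skeleton, not the actual waywhisper script: the file locations, function names, the inclusive character limit, and the Ctrl+V paste shortcut are all assumptions. The pw-record, wtype, and wl-copy invocations use options those tools provide.

```bash
#!/usr/bin/env bash
# Illustrative toggle skeleton; paths and function names are assumptions.
STATE_FILE="${XDG_RUNTIME_DIR:-/tmp}/waywhisper-demo.pid"
AUDIO_FILE="${XDG_RUNTIME_DIR:-/tmp}/waywhisper-demo.wav"

# Decide how to deliver text: "type" up to the limit, "paste" beyond it.
output_method() {
    local text="$1" limit="${2:-800}"
    if [ "${#text}" -le "$limit" ]; then echo type; else echo paste; fi
}

# Deliver text with wtype, or via the clipboard plus a simulated Ctrl+V.
# Terminals usually need Ctrl+Shift+V instead; this sketch ignores that.
deliver_text() {
    local text="$1"
    if [ "$(output_method "$text" "${WTYPE_CHAR_LIMIT:-800}")" = type ]; then
        wtype "$text"
    else
        printf '%s' "$text" | wl-copy
        wtype -M ctrl v -m ctrl
    fi
}

# First call starts recording; second call stops it.
toggle() {
    if [ -f "$STATE_FILE" ]; then
        # SIGINT lets pw-record finish writing the WAV file before it exits.
        kill -INT "$(cat "$STATE_FILE")" 2>/dev/null
        rm -f "$STATE_FILE"
        echo stopped
    else
        # Whisper models operate on 16 kHz mono audio.
        pw-record --rate 16000 --channels 1 "$AUDIO_FILE" &
        echo "$!" > "$STATE_FILE"
        echo started
    fi
}
```

After toggle prints stopped, a real script would run whisper-cli on the audio file and pass the transcript to deliver_text. Keeping the type-versus-paste decision in its own function makes the WTYPE_CHAR_LIMIT behavior easy to reason about in isolation.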

Inspired by the original hyprflow concept — rewritten for universal Wayland support, CUDA acceleration, and multi-language handling.
