Hotkey Whisper (kestavanik/hotkey-whisper)

Voice-to-text with a single hotkey
Press a key, speak, text appears. Local & private.

Hotkey Whisper runs OpenAI's Whisper model 100% locally on your GPU. No cloud. No subscription. No data leaving your machine. Just press a hotkey, speak, and watch your words appear wherever your cursor is.

How It Works

Press hotkey → recording starts, and text appears as you speak
Speak naturally → words are transcribed in real time
Stop talking → recording stops automatically after 5 seconds of silence
Leave it idle → the server shuts itself down after 5 minutes of inactivity to free GPU memory
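
The two timeouts above (5 seconds of silence, 5 minutes of server inactivity) can be pictured as the same small mechanism running at different scales. The sketch below is illustrative only and is not taken from the project's source; the `ActivityTimer` class and its method names are hypothetical.

```python
import time


class ActivityTimer:
    """Hypothetical helper: reports when no activity has been seen for a while."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_activity = clock()

    def touch(self):
        """Record activity (speech detected, or a hotkey request handled)."""
        self.last_activity = self.clock()

    def expired(self):
        """Return True once the timeout has elapsed since the last activity.

        A timeout of 0 is treated as "never expire". The README documents
        this only for idle_timeout; applying it to silence is an assumption.
        """
        if self.timeout_seconds == 0:
            return False
        return self.clock() - self.last_activity >= self.timeout_seconds


# Assumed wiring that mirrors the behaviour described above.
silence_timer = ActivityTimer(timeout_seconds=5)   # stop recording after 5 s of silence
idle_timer = ActivityTimer(timeout_seconds=300)    # stop the server after 5 min idle
```

A recording loop would call `silence_timer.touch()` whenever speech is detected and stop once `silence_timer.expired()` returns true; the server would treat `idle_timer` the same way for incoming hotkey requests.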

Requirements

Ubuntu 22.04+ (other distros may work)
NVIDIA GPU with 4GB+ VRAM
CUDA drivers installed

Installation

git clone https://github.com/kestavanik/hotkey-whisper
cd hotkey-whisper
./install.sh

Then set up a keyboard shortcut:

Settings → Keyboard → Custom Shortcuts
Add new shortcut with command: hotkey-whisper
Assign your preferred key (I use Ctrl+Alt+W)

Usage

hotkey-whisper              # Toggle recording (bind to hotkey)
hotkey-whisper status       # Check server status
hotkey-whisper stop         # Shutdown server
hotkey-whisper log          # Watch server logs

First run downloads the Whisper model (~1.5GB) and takes ~30 seconds. After that, it's instant.

Configuration

Edit ~/.config/hotkey-whisper/config.json to customize settings:

{
  "model": "base",
  "language": "en",
  "device": "cuda",
  "compute_type": "float16",
  "idle_timeout": 300,
  "silence_timeout": 5,
  "silero_sensitivity": 0.4,
  "post_speech_silence_duration": 0.5,
  "min_length_of_recording": 0.3,
  "realtime_processing_pause": 0.1,
  "typing_delay": 10
}
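
As a rough illustration of how a tool might consume this file, the sketch below overlays the user's `config.json` on a set of built-in defaults, so a file that sets only one or two keys still yields a complete configuration. This is not the project's actual loader: `load_config` and `DEFAULTS` are hypothetical names, and treating the values shown above as the built-in defaults is an assumption for the keys not listed in the Settings table.

```python
import json
from pathlib import Path

# Values copied from the example config above (assumed to be the defaults).
DEFAULTS = {
    "model": "base",
    "language": "en",
    "device": "cuda",
    "compute_type": "float16",
    "idle_timeout": 300,
    "silence_timeout": 5,
    "silero_sensitivity": 0.4,
    "post_speech_silence_duration": 0.5,
    "min_length_of_recording": 0.3,
    "realtime_processing_pause": 0.1,
    "typing_delay": 10,
}

CONFIG_PATH = Path.home() / ".config" / "hotkey-whisper" / "config.json"


def load_config(path=CONFIG_PATH):
    """Return DEFAULTS overlaid with whatever keys the user's file provides."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.exists():
        with path.open(encoding="utf-8") as f:
            config.update(json.load(f))
    return config
```

With this approach, a config file containing only `{"model": "small"}` would change the model and leave every other setting at the value shown above.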

Settings

| Setting | Default | Description |
| --- | --- | --- |
| `model` | `"base"` | Whisper model: `tiny`, `base`, `small`, `medium`, `large-v3` |
| `language` | `"en"` | Language code (e.g., `en`, `es`, `fr`, `de`) |
| `device` | `"cuda"` | `cuda` for GPU, `cpu` for CPU-only |
| `compute_type` | `"float16"` | `float16` for speed, `float32` for accuracy |
| `idle_timeout` | `300` | Seconds before server auto-stops (0 = never) |
| `silence_timeout` | `5` | Seconds of silence before recording auto-stops |
| `typing_delay` | `10` | Milliseconds between keystrokes |
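
Because a typo in `config.json` can be hard to spot, it may help to sanity-check a configuration against the values documented in the table. The following is a minimal sketch under that assumption, not a feature of the tool itself; `validate_config` is a hypothetical helper. It deliberately skips `compute_type` and `language`, since the table lists examples for those rather than an exhaustive set.

```python
ALLOWED_MODELS = {"tiny", "base", "small", "medium", "large-v3"}
ALLOWED_DEVICES = {"cuda", "cpu"}
NON_NEGATIVE_KEYS = ("idle_timeout", "silence_timeout", "typing_delay")


def validate_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if config.get("model") not in ALLOWED_MODELS:
        problems.append(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if config.get("device") not in ALLOWED_DEVICES:
        problems.append(f"device must be one of {sorted(ALLOWED_DEVICES)}")
    for key in NON_NEGATIVE_KEYS:
        value = config.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            problems.append(f"{key} must be a non-negative number")
    return problems
```

For example, `validate_config({"model": "huge", "device": "cuda", "idle_timeout": 300, "silence_timeout": 5, "typing_delay": 10})` reports a single problem, for the unknown model name.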

Model Comparison

| Model | VRAM | Speed | Accuracy |
| --- | --- | --- | --- |
| tiny | ~1GB | Fastest | Basic |
| base | ~1GB | Fast | Good |
| small | ~2GB | Medium | Better |
| medium | ~5GB | Slower | Great |
| large-v3 | ~10GB | Slowest | Best |
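
One way to read the table is "pick the most accurate model that fits in your GPU's memory". The sketch below encodes that rule using the approximate VRAM figures from the table; `largest_model_that_fits` is a hypothetical helper, and the numbers are estimates that leave no headroom for other applications using the GPU.

```python
# Approximate VRAM needs in GB, copied from the comparison table,
# ordered from least to most accurate.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large-v3", 10),
]


def largest_model_that_fits(available_vram_gb):
    """Return the most accurate model whose estimate fits, or None if none do."""
    best = None
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= available_vram_gb:
            best = name  # later entries are more accurate, so keep overwriting
    return best
```

On a 4GB card this selects `small`, while `medium` needs a card with at least 5GB free. The result can then be written to the `model` key in `config.json`.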

Uninstall

./uninstall.sh

Then remove your keyboard shortcut from Settings.

License

MIT
