Zbrooklyn/WhisperClick-Desktop-App

WhisperClick

Talk instead of type. Anywhere on your desktop.
One hotkey. Instant transcription. Pasted right where your cursor is.

Download · Website · Feedback


You talk 3-4x faster than you type. That speed gap costs you hours every week — in emails, Slack messages, docs, code comments, and every other text field on your screen.

WhisperClick closes that gap. Press a hotkey, say what you're thinking, and the transcribed text appears at your cursor. Done. No copying, no pasting, no switching windows. It works in every app on your desktop.

[Screenshot: WhisperClick dark mode]

How it works

  1. Press Ctrl+Alt+R from any app (customizable)
  2. Talk naturally
  3. Press the hotkey again to stop
  4. Text appears at your cursor, already pasted

That's the whole workflow. There's no app to switch to, no text to copy. You talk, it types.

The floating pill

A small pill sits at the edge of your screen while recording. It shows live audio bars so you know it's listening, and has cancel/stop controls if you need them.

[Screenshots: WhisperClick pill showing hotkey hint; WhisperClick pill recording with audio bars]

Right-click the pill for quick access to history, settings, or to hide it entirely. When you're not recording, it shrinks to a tiny capsule that stays out of your way.

Download

Platform   Architecture                  Package                  Status
Windows    x64 (64-bit)                  Setup Installer (.exe)   Stable (recommended)
Windows    x64 (64-bit)                  Portable (.exe)          Stable (no install needed)
macOS      Apple Silicon (M1/M2/M3/M4)   DMG (arm64)              Early access
macOS      Intel (2015–2020 Macs)        DMG (x64)                Early access
Linux      x64 (64-bit)                  AppImage                 Early access

All downloads are on the Releases page. The app auto-updates after you install — you only need to download once.

Not sure which Mac you have? Click Apple menu > About This Mac. If it says M1, M2, M3, or M4, grab the arm64 DMG. Otherwise grab the x64 DMG.

macOS and Linux: Core recording and transcription work on all platforms. Some features like auto-paste and system tray behavior may vary as we continue testing. Let us know what works and what doesn't — it genuinely helps.

Getting started

  1. Run the installer (or the portable EXE — your choice)
  2. WhisperClick opens and walks you through setup
  3. Pick a transcription provider and paste your API key
  4. Press Ctrl+Alt+R and start talking

Setup takes about 60 seconds.

What you get

  • Global hotkey from any app. Email, Slack, VS Code, Google Docs, a terminal — wherever your cursor is. One hotkey triggers recording and pastes the result.
  • Auto-paste at cursor. No clipboard dance. Text lands exactly where you were typing.
  • Floating pill indicator. Shows recording state and live audio levels. Right-click for history, settings, and quick controls.
  • 50+ languages. Auto-detection, or pick a specific language. Translate on the fly — speak in one language, get text in another.
  • Searchable history. Every transcription is saved with the original audio for playback. Search, copy, export, or replay anything.
  • Audio visualizer. 8 visual styles, 3 motion presets, 4 density levels. Make it yours.
  • Dark and light themes. Follows your system, or set it manually.
  • System tray. Lives in your tray quietly. Right-click for quick controls, recent transcriptions, and settings.
  • Auto-updates. New versions download in the background. Click restart when you're ready.

[Screenshot: WhisperClick light mode]

Transcription providers

WhisperClick uses cloud APIs for fast, accurate transcription. You'll need an API key from one of these providers:

Provider        What you get                                          Get a key
OpenAI          GPT-4o Transcribe, GPT-4o Mini Transcribe, Whisper    platform.openai.com/api-keys
Google Gemini   Gemini 2.5 Flash, 2.5 Pro, and newer models           aistudio.google.com/apikey

Both providers offer free tiers or low-cost usage. A typical user's monthly cost is under $1.

Your API key is encrypted at rest using your operating system's secure keychain — never stored in plain text.

Privacy

No telemetry. No analytics. No data collection of any kind. The only background network activity is the auto-update download from GitHub releases.

  • Audio goes to your chosen provider only when you press the hotkey. Nothing is sent otherwise.
  • Nothing is stored on any server after transcription returns.
  • There is no always-on microphone. Recording starts when you press the hotkey and stops when you press it again.

You pick when it listens. Full details in PRIVACY.md.


For developers

Build from source, run tests, architecture overview

Build from source

git clone https://github.com/Zbrooklyn/WhisperClick-Desktop-App.git
cd WhisperClick-Desktop-App
npm install
pip install -r shared/engine/requirements.txt
npm start

Build installers

npm run dist:win     # Windows (NSIS + portable)
npm run dist:mac     # macOS (DMG)
npm run dist:linux   # Linux (AppImage)

Run tests

# Electron (Jest)
npm test             # 412 tests
npm run test:unit    # Unit tests
npm run test:e2e     # End-to-end tests

# Tauri (Rust)
cd platforms/tauri && cargo test   # 518 tests

How it's built

WhisperClick is a desktop app with a Python sidecar that handles audio recording and transcription. The frontend is a single HTML file with Tailwind CSS — no React, no build step, no framework overhead. It ships on two platforms: Electron (stable, current releases) and Tauri (Rust-based, lighter footprint).

platforms/electron/    Electron main process (Node.js)
  main.js              Window management, IPC, hotkey, tray
  sidecar.js           Python engine manager (JSON over stdin/stdout)
  store.js             Settings and history persistence
  updater.js           Auto-update (GitHub releases)

platforms/tauri/       Tauri platform (Rust + WebView)
  src-tauri/           Rust backend (commands, sidecar bridge, tray)
  src/                 Tauri-specific frontend wiring

shared/frontend/       Renderer (shared across platforms)
  index.html           Full UI (HTML + Tailwind CSS + inline JS)

shared/pill/           Floating widget (shared)
  pill.html            Always-on-top recording capsule

shared/engine/         Python sidecar
  engine.py            Audio capture, transcription, model management
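
To illustrate what "JSON over stdin/stdout" can look like on the Python side, here is a minimal sketch of a sidecar request loop. It is not the actual engine.py: the newline-delimited framing, the `id`/`cmd`/`params` fields, and the `ping` and `echo` commands are all assumptions made for the example.

```python
import io
import json
import sys
from typing import Any, Callable, Dict, TextIO

# Hypothetical commands; the real engine's command set is not documented here.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "ping": lambda params: "pong",
    "echo": lambda params: params,
}


def serve(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Read one JSON request per line and write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        request = None
        try:
            request = json.loads(line)
            handler = HANDLERS[request["cmd"]]
            result = handler(request.get("params", {}))
            reply = {"id": request.get("id"), "ok": True, "result": result}
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            # Malformed JSON, unknown command, or a non-object message.
            req_id = request.get("id") if isinstance(request, dict) else None
            reply = {
                "id": req_id,
                "ok": False,
                "error": f"{type(exc).__name__}: {exc}",
            }
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # the parent process reads replies as they arrive


# Demo with in-memory streams instead of a real parent process.
demo_in = io.StringIO('{"id": 1, "cmd": "ping"}\n{"id": 2, "cmd": "nope"}\n')
demo_out = io.StringIO()
serve(demo_in, demo_out)
replies = [json.loads(text) for text in demo_out.getvalue().splitlines()]
```

Carrying an `id` in each message lets the parent process match replies to requests, and flushing after every reply keeps the pipe from buffering a response the UI is waiting on.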

Feedback and bugs

Found something broken? Have an idea? Open an issue. We read every one.

License

Source-available under CC BY-NC-SA 4.0. Free for personal and non-commercial use.

Security

Found a vulnerability? See SECURITY.md for responsible disclosure.
