Skip to content

Releases: missuo/koe

v1.0.14

08 Apr 08:59
9e2639e

Choose a tag to compare

What's New

Local MLX LLM Provider for Fully Offline Voice Input

Koe now supports running LLM text correction locally via Apple MLX, enabling fully offline voice input without any cloud API. Thanks to @erning! (#66)

DoubaoIME Free ASR Provider

New ASR provider using Doubao IME with automatic device registration — no API key required, no cloud console setup needed. Doubao IME simulates an Android device registration to obtain credentials automatically, making it a truly zero-cost streaming speech recognition option out of the box.

The underlying koe-asr crate is published as a standalone Rust library. You can use it in any Rust project to add free streaming ASR with just a few lines of code:

[dependencies]
koe-asr = { git = "https://github.com/missuo/koe.git" }
use koe_asr::{AsrConfig, AsrEvent, AsrProvider, DoubaoImeProvider};

let mut asr = DoubaoImeProvider::new();
asr.connect(&AsrConfig::default()).await?;
asr.send_audio(&pcm_data).await?;
asr.finish_input().await?;
// Receive Interim / Definite / Final events...

See the koe-asr README for full documentation and examples for all 6 providers.

CLI: Audio/Video File Transcription

koe-cli now includes a transcribe command that converts any audio or video file to text using DoubaoIME free ASR. It uses ffmpeg to decode the input file and streams PCM data to the ASR provider. The CLI is now bundled inside Koe.app and installed as koe automatically via Homebrew.

# Transcribe an audio/video file
koe transcribe recording.mp3

# Show interim results as they arrive
koe transcribe video.mp4 -i

Requires ffmpeg installed on the system.

Koe-Lite Build Variant

Added a lightweight build variant for minimal distribution. Thanks to @erning! (#64)

Bug Fixes

  • Fix random keypresses when quitting after long runtime.
  • Fix earlier segments being lost in DoubaoIME long speech recognition.

Improvements

  • Add koe-asr README with detailed usage examples and API reference.
  • Apply rustfmt formatting across the workspace.
  • Sync README and DESIGN docs with current app behavior.

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

This installs both Koe.app and the koe CLI command.

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials
  5. Optionally symlink the CLI: ln -s /Applications/Koe.app/Contents/MacOS/koe-cli /usr/local/bin/koe

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.13

06 Apr 03:04
928318c

Choose a tag to compare

What's New

Apple Speech Provider (macOS 26+)

Koe now supports Apple's on-device Speech framework as an ASR provider. Zero-config, zero-download speech recognition with system-managed language assets. Audio flows through a Swift ↔ Rust FFI bridge following the same pattern as MLX. Setup Wizard includes language picker and asset management UI. Thanks to @erning! (#40)

Custom HTTP Headers for ASR WebSocket

Users can now specify custom HTTP headers in config.yaml for Doubao and Qwen ASR providers. When headers is specified, it fully replaces default provider headers, enabling third-party WebSocket endpoints with different authentication schemes. (#61)

Configurable Reasoning Control for LLM

New no_reasoning_control config option with three modes: reasoning_effort (default, OpenAI o-series), thinking (GLM, Qwen, etc.), and none. This lets users disable reasoning on non-OpenAI models that ignore reasoning_effort and waste time on thinking tokens. Thanks to @erning! (#60)

LLM Test Connection Parity

The wizard's Test Connection button now uses the exact same Rust code path as runtime LLM correction — same prompts, dictionary, timeout, and temperature. Previously used a separate Obj-C implementation that could pass while runtime silently failed. Elapsed time is always reported, even on timeout. Thanks to @erning! (#60)

Bug Fixes

  • Fix repeated accessibility permission prompts; permission menu items are now clickable with direct actions. Thanks to @erning! (#47)
  • Fix clipboard not restoring to empty state after dictation. Thanks to @erning! (#48)
  • Fix state machine race between Rust and ObjC after text delivery, preventing UI flicker. Thanks to @erning! (#50)
  • Fix audio capture error handling: propagate failures, cancel session cleanly, protect error display. Thanks to @erning! (#49)
  • Fix hotkey race window between menu close and quit action. Thanks to @erning! (#62)
  • Fix nested directory handling in model file removal. Thanks to @erning! (#46)
  • Fix spurious empty Interim events in Qwen and Doubao providers. Thanks to @erning! (#55)
  • Fix MLX callback lock and generation check to prevent session crosstalk. Thanks to @erning! (#54)
  • Fix atomic config file writes to prevent corruption on crash. Thanks to @erning! (#53)
  • Fix FeedbackSection serde defaults to match config YAML template. Thanks to @erning! (#52)
  • Fix wizard ASR test URLs to read from config instead of hardcoding. Thanks to @erning! (#59)
  • Redact transcription text from INFO logs for privacy. Thanks to @erning! (#45)
  • Enable system proxy support for model downloads. (#43)
  • Make download cancellation responsive during network issues. (#41, #42)
  • Propagate ASR errors and fail session with partial text. (#33)
  • Match runtime LLM request in wizard test button. (#34)
  • Check config save return values and report failures. (#35)
  • Add generation counter to prevent stale MLX session operations. (#38)
  • Use Rust FFI for resolved hotkey display in status bar. (#37)
  • Surface real ASR errors instead of generic message. (#31)
  • Cache MLX model across sessions to avoid repeated loads. (#30)
  • Skip sha256 check for non-LFS files in manifest pull. (#29)
  • Isolate sessions to prevent old/new interference. (#28)
  • Prevent use-after-free in MLX callback on cancel. (#27)
  • Prevent spurious hotkey triggers after monitor stop on quit.
  • Add missing SystemConfiguration framework linkage.

Improvements

  • Unified model status API with sha256 cache and async UI. (#36)
  • Centralize workspace dependencies and metadata for consistent builds. Thanks to @erning! (#51)
  • Replace line-based YAML parser with serde_yaml.
  • Resolve all clippy errors and warnings. (#32)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.12

29 Mar 09:47
8797846

Choose a tag to compare

What's New

Local ASR Support (MLX + sherpa-onnx)

Koe now supports fully offline, on-device speech recognition via MLX Whisper and sherpa-onnx. No cloud credentials required — models are downloaded on first use and stored locally. Supports a wide range of Whisper model sizes. Thanks to @erning for this feature! (#19)

LLM Connection Warmup

Koe now pre-warms the HTTP connection to your LLM provider at the start of each session, hiding TCP/TLS handshake latency behind ASR processing time. This reduces the delay before corrected text appears, especially noticeable with remote providers. Thanks to @hyspace! (#25, #26)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.11

28 Mar 13:00
ba44d95

Choose a tag to compare

What's New

Custom Keycode Support for Hotkeys

You can now use any non-modifier key (e.g. F1–F20, Escape, CapsLock) as a trigger or cancel key by specifying its raw macOS keycode in config.yaml. The status bar menu and setup wizard display friendly key names for recognized keycodes. This gives power users full flexibility beyond the 7 built-in modifier key options.

hotkey:
  trigger_key: 96    # F5
  cancel_key: 97     # F6

Qwen ASR Provider Support

Added Qwen (Aliyun DashScope) as a second ASR provider alongside Doubao. Configure it in config.yaml with your DashScope API key. The Qwen ASR URL and model are now fully configurable. Thanks to @nmvr2600! (#22)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.10

27 Mar 11:48
2a11920

Choose a tag to compare

What's New

Faster LLM Correction with HTTP/2 and Connection Reuse

The LLM HTTP client is now shared across voice sessions instead of being rebuilt every time. Combined with HTTP/2 support, this reduces average correction latency by ~20% in real-world benchmarks. The client automatically falls back to HTTP/1.1 when the upstream doesn't support HTTP/2. Thanks to @hyspace for this optimization! (#17)

GPT-5 Reasoning Effort Handling

When using GPT-5-style endpoints (max_completion_tokens), Koe now explicitly sets reasoning_effort: "none" to skip unnecessary reasoning on the latency-sensitive voice correction path. This avoids wasting time and tokens on a task that doesn't need chain-of-thought. Thanks to @hyspace! (#18)

Native Vibrancy Overlay Background

The live transcription overlay pill now uses NSVisualEffectView for its background, giving it a native macOS frosted-glass appearance that adapts to light and dark mode. Thanks to @erning! (#16)

Left and Right Control Key Support

Hotkey configuration now supports both left and right Control keys as trigger or cancel keys. Thanks to @nmvr2600! (#14)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.9

25 Mar 17:33
438aa98

Choose a tag to compare

What's New

Built-in App Updates

Koe now checks for app updates through a JSON update feed and includes a manual Check for Updates action in the menu bar.

Menu Bar Version Info

The menu bar status UI now shows the current version and build number directly, so it is easier to confirm which build is running.

Better Recording Recovery

Fixed microphone disconnect handling during recording. Koe now detects device loss more gracefully and recovers without leaving the app stuck in a broken session.

Provider-Based ASR Settings

ASR configuration has been migrated to a provider-based V2 format, and Settings now includes an ASR Provider selector. Doubao is the current built-in option and remains the default.

Improved Live Transcription Overlay

The interim transcription overlay now auto-wraps long text and avoids the layout jitter that made the text panel feel unstable during recognition.

Configurable Cancel Hotkey

Session cancel is now configurable as a dedicated hotkey separate from the trigger key. Existing configs are normalized automatically so missing or conflicting cancel keys are written back as a valid pair.

Setup Improvements

Refined default voice input settings so new installs come up with more sensible out-of-the-box behavior.

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.8

25 Mar 09:05
079a9e8

Choose a tag to compare

What's New

Redesigned Settings UI

The setup wizard has been completely redesigned with a native macOS Settings-style toolbar (like System Settings / Tailscale). Each settings category (ASR, LLM, Hotkey, Dictionary, System Prompt) now has its own pane with SF Symbol icons, smooth animated switching, and auto-resizing.

Secure Key Fields

ASR Access Key and LLM API Key are now masked by default with a toggle eye icon to show/hide — no more credentials visible at a glance.

Token Parameter Configuration

Added a Token Parameter dropdown in LLM settings to choose between max_completion_tokens (GPT-5, o1/o3 reasoning models) and max_tokens (GPT-4o and older). Default changed to max_completion_tokens for modern model compatibility.

Configurable Max Token Parameter

Added max_token_parameter config option to support both legacy and modern OpenAI-compatible APIs. Thanks to @hyspace for this feature! (#7)

Bluetooth Microphone Recovery

Fixed Bluetooth mic reconnect failures (issue #8). Previously, the shared AVAudioEngine held stale device state after a Bluetooth disconnect/reconnect. Now a fresh engine is created per capture session.

Improved Defaults

  • Default LLM base URL: https://api.openai.com/v1
  • Default model: gpt-5.4-nano
  • Default token parameter: max_completion_tokens

Bug Fixes

  • Fixed setup wizard Cmd+C/V not working in text fields
  • Fixed LLM test error messages being truncated
  • Eliminated "Connecting..." overlay delay when ASR latency is high — recording starts immediately while audio buffers during connection

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe — it will appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.7

25 Mar 06:32
c6aa702

Choose a tag to compare

What's New

Real-Time ASR Text Overlay

The floating status pill now displays real-time speech recognition text as you speak, so you can see exactly what Koe is hearing. The text updates live during recording, showing the trailing portion when content exceeds the pill width, with a smooth left-edge fade gradient. Thanks to @erning for this feature! (#6)

Setup Wizard

Added a first-launch setup wizard that guides you through configuring ASR credentials, LLM settings, and hotkey preferences — no more editing YAML by hand to get started.

Hotkey Display

The menu bar dropdown now shows which hotkey is currently configured, making it easy to verify your trigger key at a glance.

Microphone Input Device Selection

You can now choose which microphone to use in the config file via audio.input_device, useful when you have multiple audio input devices. Thanks to @erning! (#1)

LLM Toggle

Added llm.enabled option to skip LLM correction entirely, pasting raw ASR text directly — useful for testing or when you just want fast transcription without AI post-processing.

ASR Crate Extraction

The ASR module has been extracted into a standalone koe-asr crate, improving code organization and making it easier to reuse or test independently.

Bug Fixes

  • Fixed overlay freezing when pausing speech for 1–2 seconds. The root cause was that once any utterance was marked as "definite" by the ASR server (after a silence gap), all subsequent responses were misclassified and stopped updating the overlay text.
  • Fixed various audio capture and hotkey detection edge cases.

Install

Homebrew

brew tap owo-network/brew
brew install owo-network/brew/koe

Direct Download

Download Koe-macOS-arm64.zip, unzip, and move Koe.app to /Applications.

Note: The app is not code-signed. On first launch, right-click -> Open, or allow it in System Settings -> Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.6

23 Mar 01:36
a16b3b7

Choose a tag to compare

What's New

Error Notifications

Koe now sends macOS system notifications when errors occur during a session, so you can see exactly what went wrong (e.g. ASR connection timeout, invalid API key) instead of just a brief error icon in the status bar. Notification permission is requested on first launch, and the permission status is displayed in the menu bar dropdown alongside Microphone, Accessibility, and Input Monitoring.

LLM Fallback Warning

When the LLM API call fails (e.g. wrong API key, network issue, rate limit), Koe still pastes the raw ASR text as before — but now sends a notification explaining that LLM correction was skipped and why. This makes it easy to notice when you're running without correction and diagnose the cause.

Status Label Update

The floating status pill now shows "Thinking…" instead of "Correcting…" during the LLM correction phase, better reflecting what the AI is doing.

Bug Fixes

  • Fixed an issue where clicking "Quit Koe" in the menu bar would accidentally trigger the hotkey and start a recording session. The hotkey monitor is now suspended while the status bar menu is open.

Install

Homebrew

brew tap owo-network/brew
brew install owo-network/brew/koe

Direct Download

Download Koe-macOS-arm64.zip, unzip, and move Koe.app to /Applications.

Note: The app is not code-signed. On first launch, right-click -> Open, or allow it in System Settings -> Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.5

22 Mar 17:00
cd6f35c

Choose a tag to compare

What's New

Revamped System Prompt

The default system prompt has been completely rewritten in Chinese to better align with the primary use case. The new prompt design is benchmarked against Shandianshuo's prompt and further optimized:

  • Clear role framing: output is pasted directly at the cursor, never converse with the user
  • Filler word filtering broken out into its own section, distinguishing interjections, verbal fillers, and ASR stuttering repeats, with a "keep when uncertain" safety net
  • Capitalization, CJK spacing, and punctuation rules restructured into independent sections for clarity
  • Prompt text externalized from Rust string literals to standalone .txt files, making it easier to customize without recompiling
  • User prompt template also switched to Chinese

Floating Status Indicator

Added a floating status pill at the bottom center of the screen (above the Dock) that shows real-time session state with animated visuals:

  • Recording: Red waveform bars animation (matching the menu bar icon style)
  • Connecting: Yellow bouncing dots
  • Recognizing / Correcting: Blue bouncing dots
  • Pasting: Green animated checkmark (stroke-by-stroke drawing)
  • Error: Red X mark
  • Auto-hides on idle with fade transitions; does not intercept mouse events

Install

Homebrew

brew tap owo-network/brew
brew install owo-network/brew/koe

Direct Download

Download Koe-macOS-arm64.zip, unzip, and move Koe.app to /Applications.

Note: The app is not code-signed. On first launch, right-click -> Open, or allow it in System Settings -> Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app