Feature Request: Chinese Language Support — Traditional Script (zh-TW) + Refine Output Language
First, a sincere thank you 🙏
Voicebox is genuinely impressive — a local-first, privacy-respecting tool that combines TTS and STT in one app. The Captures tab, voice cloning quality, and MCP integration are all really well thought out. Thank you for building and open-sourcing this.
Two Related Issues for Mandarin Users
Both issues affect users transcribing Mandarin audio (e.g. from Taiwan or Hong Kong).
Issue 1: Whisper outputs Simplified Chinese (zh-Hans) with no option for Traditional (zh-TW)
When transcribing Mandarin audio in the Captures tab, Whisper outputs Simplified Chinese by default, even when the speaker comes from a region that writes in Traditional Chinese. There is currently no way to get Traditional Chinese script output.
Proposed solution: Add a script/locale option to the Captures transcription settings:
- Simplified Chinese (zh-Hans) — current default
- Traditional Chinese / Taiwan (zh-TW)
- Traditional Chinese / Hong Kong (zh-HK)
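This could be implemented by post-processing Whisper output with OpenCC. Its s2twp configuration converts Simplified Chinese to Taiwan Traditional, including regional phrasing, and s2hk covers the Hong Kong variant. OpenCC is well maintained, has Python bindings, and would be a lightweight addition to the existing pipeline.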
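A minimal sketch of that post-processing step, to show how small the change could be. The locale keys, the `OPENCC_CONFIGS` mapping, and the `postprocess_transcript` function are hypothetical names, not part of Voicebox. The sketch assumes the `opencc` Python package is installed and accepts config names such as `s2twp` and `s2hk`; some builds may expect `s2twp.json` instead.

```python
from typing import Callable, Dict, Optional

# Hypothetical mapping from a Captures locale setting to an OpenCC config.
# None means "leave Whisper's output untouched".
OPENCC_CONFIGS: Dict[str, Optional[str]] = {
    "zh-Hans": None,    # Simplified Chinese, the current behaviour
    "zh-TW": "s2twp",   # Simplified -> Traditional (Taiwan, with regional phrasing)
    "zh-HK": "s2hk",    # Simplified -> Traditional (Hong Kong)
}


def make_converter(locale: str) -> Callable[[str], str]:
    """Return a text converter for the given locale setting."""
    config = OPENCC_CONFIGS.get(locale)
    if config is None:
        # Unknown or Simplified locale: pass the text through unchanged.
        return lambda text: text
    # Imported lazily so OpenCC is only required when conversion is requested.
    import opencc
    return opencc.OpenCC(config).convert


def postprocess_transcript(
    text: str,
    locale: str,
    converter_factory: Callable[[str], Callable[[str], str]] = make_converter,
) -> str:
    """Apply script conversion to a Whisper transcript (assumed to be Simplified)."""
    return converter_factory(locale)(text)
```

With OpenCC installed, `postprocess_transcript(whisper_text, "zh-TW")` would run the transcript through the `s2twp` configuration. The `converter_factory` parameter exists only so the step can be tested or swapped out without OpenCC present.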
Issue 2: Refine (LLM) outputs English when the transcript is Chinese
After transcription, running the Refine function on a Chinese transcript appears to output the result in English instead of preserving the original language. This makes the Refine feature unusable for non-English speakers.
The likely cause is that the Refine prompt is written in English, which appears to bias Qwen3 toward English output.
Proposed solution: Make the Refine LLM prompt language-aware — either by detecting the transcript language automatically and responding in the same language, or by adding a language preference setting under Settings → Captures → Refinement.
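Both options can be sketched together. This is an illustration under assumptions, not Voicebox's actual Refine code: `BASE_PROMPT`, `looks_chinese`, and `build_refine_prompt` are hypothetical, and the detection is a simple character-ratio heuristic rather than real language identification.

```python
from typing import Optional

# Hypothetical stand-in for the existing English Refine instructions.
BASE_PROMPT = (
    "Clean up the following speech transcript. Fix punctuation and remove "
    "filler words without changing the meaning."
)


def looks_chinese(text: str, threshold: float = 0.3) -> bool:
    """Heuristic: treat text as Chinese if enough of its letters are CJK ideographs."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    # CJK Unified Ideographs block (U+4E00 to U+9FFF).
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters) >= threshold


def build_refine_prompt(transcript: str, preferred_language: Optional[str] = None) -> str:
    """Build a Refine prompt that pins the output language.

    preferred_language models a user setting; when it is None the
    language is guessed from the transcript itself.
    """
    language = preferred_language
    if language is None and looks_chinese(transcript):
        language = "Chinese"
    if language:
        rule = f"Respond only in {language}. Do not translate the transcript."
    else:
        rule = "Respond in the same language as the transcript. Do not translate it."
    return f"{BASE_PROMPT}\n{rule}\n\nTranscript:\n{transcript}"
```

An explicit setting takes priority over detection, so a user could pass something like `preferred_language="Traditional Chinese"` to pin both language and script. Whether Qwen3 reliably follows such an instruction would still need testing.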
Who Would Benefit
Users from Taiwan, Hong Kong, Macau, and overseas Chinese communities — a significant portion of potential users in the Asia-Pacific region.
Environment
- Platform: Windows 10
- Voicebox version: v0.5.0
- STT engine: Whisper Turbo
Happy to contribute a PR or help test if this direction sounds good to the maintainers!