Feature Request: Chinese Language Support — Traditional Script (zh-TW) + Refine Output Language #754

@TreeCat506

First, a sincere thank you 🙏

Voicebox is genuinely impressive — a local-first, privacy-respecting tool that combines TTS and STT in one app. The Captures tab, voice cloning quality, and MCP integration are all really well thought out. Thank you for building and open-sourcing this.


Two Related Issues for Mandarin Users

Both issues affect users transcribing Mandarin audio (e.g. from Taiwan or Hong Kong).


Issue 1: Whisper outputs Simplified Chinese (zh-Hans) with no option for Traditional (zh-TW)

When transcribing Mandarin audio in the Captures tab, Whisper outputs Simplified Chinese by default, even when the speaker is from a region that writes in Traditional Chinese. There is currently no way to get Traditional Chinese output.

Proposed solution: Add a script/locale option to the Captures transcription settings:

  • Simplified Chinese (zh-Hans) — current default
  • Traditional Chinese / Taiwan (zh-TW)
  • Traditional Chinese / Hong Kong (zh-HK)

This could be implemented by post-processing Whisper output with OpenCC (for example, the s2twp config for Taiwan Traditional, or s2hk for Hong Kong Traditional). OpenCC is a well-maintained open-source conversion library with Python bindings, so it would be a lightweight addition to the existing pipeline.
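As a rough illustration of the OpenCC post-processing idea, here is a minimal sketch. It assumes the `opencc` Python package is installed and that the Captures pipeline hands over the transcript as a plain string. The names `OPENCC_CONFIG_BY_LOCALE` and `convert_transcript` are hypothetical and are not taken from the Voicebox codebase.

```python
from typing import Callable, Dict, Optional

# Hypothetical mapping from the proposed UI options to OpenCC config names.
# None means "leave Whisper's output untouched".
OPENCC_CONFIG_BY_LOCALE: Dict[str, Optional[str]] = {
    "zh-Hans": None,    # Simplified Chinese, the current default
    "zh-TW": "s2twp",   # Simplified -> Traditional (Taiwan) with Taiwan phrasing
    "zh-HK": "s2hk",    # Simplified -> Traditional (Hong Kong)
}


def _opencc_converter(config: str) -> Callable[[str], str]:
    """Build a converter with OpenCC. Imported lazily so the dependency stays optional."""
    import opencc  # assumption: installed via `pip install opencc`

    # Some OpenCC Python packages expect "s2twp.json" rather than "s2twp".
    return opencc.OpenCC(config).convert


def convert_transcript(
    text: str,
    locale: str,
    make_converter: Callable[[str], Callable[[str], str]] = _opencc_converter,
) -> str:
    """Convert a Whisper transcript to the script chosen in settings."""
    if locale not in OPENCC_CONFIG_BY_LOCALE:
        raise ValueError(f"Unsupported locale: {locale!r}")
    config = OPENCC_CONFIG_BY_LOCALE[locale]
    if config is None:
        return text  # no conversion requested
    return make_converter(config)(text)
```

A call such as `convert_transcript(whisper_text, "zh-TW")` would run once on the final transcript, so the conversion stays outside the Whisper model itself. The `make_converter` parameter is only there so the step can be tested without OpenCC installed.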


Issue 2: Refine (LLM) outputs English when the transcript is Chinese

After transcription, running the Refine function on a Chinese transcript appears to output the result in English instead of preserving the original language. This makes the Refine feature unusable for non-English speakers.

This seems to be caused by the Refine prompt being written in English, which causes Qwen3 to default to English output.

Proposed solution: Make the Refine LLM prompt language-aware — either by detecting the transcript language automatically and responding in the same language, or by adding a language preference setting under Settings → Captures → Refinement.
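To make the two options above concrete, here is a minimal sketch of a language-aware prompt builder. It is not the actual Refine prompt, which I have not seen. The function names, the prompt wording, and the 0.5 threshold are all assumptions, and the character-range check is only a crude heuristic (it would, for example, treat kanji-heavy Japanese as Chinese).

```python
from typing import Optional


def cjk_ratio(text: str) -> float:
    """Fraction of alphabetic characters in the basic CJK Unified Ideographs block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def build_refine_prompt(transcript: str, preferred_language: Optional[str] = None) -> str:
    """Build a Refine prompt that states the output language explicitly.

    preferred_language stands in for a hypothetical user setting; when it is
    not set, the language is guessed from the transcript itself.
    """
    if preferred_language:
        language = preferred_language
    elif cjk_ratio(transcript) > 0.5:  # threshold is an arbitrary assumption
        language = "Chinese, keeping the script used in the transcript"
    else:
        language = "the same language as the transcript"
    return (
        "Clean up the transcript below: fix punctuation and remove filler words. "
        f"Write the result in {language}. Do not translate it.\n\n"
        f"Transcript:\n{transcript}"
    )
```

Stating the target language in the prompt, rather than relying on the model to infer it, is the main point here. Whether Qwen3 then reliably follows that instruction would need testing.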


Who Would Benefit

Users from Taiwan, Hong Kong, Macau, and overseas Chinese communities — a significant portion of potential users in the Asia-Pacific region.

Environment

  • Platform: Windows 10
  • Voicebox version: v0.5.0
  • STT engine: Whisper Turbo

Happy to contribute a PR or help test if this direction sounds good to the maintainers!
