Language request: Catalan (ca) #517

@arnaucmd

Hi team. First of all, Chatterbox is genuinely impressive; the blind test results speak for themselves.

I'd love to see Catalan added to the multilingual model. A few reasons this might be easier than other language requests:

The training data problem is already solved. Projecte AINA, from the Barcelona Supercomputing Center, has published high-quality open Catalan speech datasets specifically designed for TTS training, including LaFresCat (studio quality, multi-accent, multiple speakers) and large Common Voice Catalan subsets. All are freely available on Hugging Face.

Catalan has around 10 million speakers and is currently underserved by major TTS providers. ElevenLabs supports it, but no quality open-source alternative does. This would be a meaningful gap to fill.

Would love to know if this is on the roadmap, or if a community fine-tune contribution on top of the existing multilingual model would be a useful path forward.

Thanks for the great work.
