mtl_tokenizer.json missing Persian characters: آ (U+0622), أ (U+0623), إ (U+0625)

<html>
<body>
<p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Summary<p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">The Persian fine-tune tokenizer (<code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">mtl_tokenizer.json</code>, 2352 BPE tokens) is missing three alef variants required for Persian text. This causes the model to produce silence or garbled output for any word containing these characters. 
The most critical is آ (ALEF WITH MADDA ABOVE, U+0622) — it appears in hundreds of common Persian words as the word-initial long /ɒː/ vowel.<hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Impact
| Input | Expected | Actual |
| --- | --- | --- |
| آب (āb = water) | /ɒːb/ | silence + "b" |
| آسمان (āsemān = sky) | /ɒːsemɒːn/ | only "s(e)mān" |
| آمدن (āmadan = to come) | /ɒːmædæn/ | "m(a)d(a)n" |
| آن (ān = that) | /ɒːn/ | silence |
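
A quick way to confirm the gap reported above is to look the three code points up in the tokenizer's vocabulary. The sketch below is only illustrative: it assumes the file follows the Hugging Face `tokenizers` JSON layout (a `model.vocab` map from token string to id) and that single characters are stored as raw Unicode strings rather than byte-level symbols. The helper names `load_vocab` and `missing_chars` are hypothetical.

```python
import json

# The three alef variants this issue reports as missing.
REQUIRED_CHARS = {
    "\u0622": "ARABIC LETTER ALEF WITH MADDA ABOVE",
    "\u0623": "ARABIC LETTER ALEF WITH HAMZA ABOVE",
    "\u0625": "ARABIC LETTER ALEF WITH HAMZA BELOW",
}


def load_vocab(path):
    """Load the token -> id map (assumes the Hugging Face tokenizers JSON layout)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["model"]["vocab"]


def missing_chars(vocab):
    """Return the required characters that have no single-token vocab entry."""
    return [ch for ch in REQUIRED_CHARS if ch not in vocab]
```

Calling `missing_chars(load_vocab("mtl_tokenizer.json"))` should list all three characters if the report is accurate, and return an empty list for a vocabulary that already contains them.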

<p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Interestingly, the v3/multilingual grapheme tokenizer (<code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">grapheme_mtl_merged_expanded_v1.json</code>, 2454 tokens) does have all three: آ=idx 2356, أ=idx 2353, إ=idx 2354. The model's <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">text_emb</code> is [2454, 1024] (from <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">T3Config.multilingual()</code>), so the embeddings for these characters already exist in the weight matrix — they're just unreachable from the Persian subword tokenizer.<hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); 
text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">What we tried<ol style="padding-inline-start: 2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li>Embedding expansion — Added new token IDs 2352–2354, expanded <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">text_emb</code> to [2357, 1024], initialized new rows from alef (idx 1456). Failed: The model's <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">text_emb</code> is actually [2454, 1024], so our new indices fell inside existing rows (idx 2352 = € Euro sign embedding). 
Result: آ pronounced as random character.</li><li>Token injection at v3 positions — Mapped آ→idx 2356 directly in the patched tokenizer (the v3 spot where the model already has an آ embedding). Failed: Adding tokens to a BPE vocab without updating the merge table causes the tokenizer to fragment the token into random subword pieces. Result: آ pronounced as "م و ش".</li></ol><hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Workaround we settled on<p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; 
background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Decompose the precomposed Unicode character into its constituent parts, both of which exist in the tokenizer:<div class="codeBlockWrapper_-a7MRw" style="position: relative; margin: 8px 0px; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><button class="copyButton_CEmTFw copyButton_-a7MRw" title="Copy code" aria-label="Copy code to clipboard" style="color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; background: none 0% 0% / auto repeat scroll padding-box border-box rgb(30, 30, 30); border-color: rgba(204, 204, 204, 0.2); border-style: solid; border-width: 1px; border-image: none 100% / 1 / 0 stretch; cursor: pointer; opacity: 0; display: flex; border-radius: 4px; justify-content: center; align-items: center; padding: 4px; transition: opacity 0.15s, background 0.15s; position: absolute; top: 4px; right: 4px;"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 20 20" fill="currentColor" aria-hidden="true" data-slot="icon" class="copyIcon_CEmTFw"><path fill-rule="evenodd" d="M15.988 3.012A2.25 2.25 0 0 1 18 5.25v6.5A2.25 2.25 0 0 1 15.75 14H13.5v-3.379a3 3 0 0 0-.879-2.121l-3.12-3.121a3 3 0 0 0-1.402-.791 2.252 2.252 0 0 1 1.913-1.576A2.25 2.25 0 0 1 12.25 1h1.5a2.25 2.25 0 0 1 2.238 2.012ZM11.5 3.25a.75.75 0 0 1 .75-.75h1.5a.75.75 0 0 1 .75.75v.25h-3v-.25Z" 
clip-rule="evenodd"></path><path d="M3.5 6A1.5 1.5 0 0 0 2 7.5v9A1.5 1.5 0 0 0 3.5 18h7a1.5 1.5 0 0 0 1.5-1.5v-5.879a1.5 1.5 0 0 0-.44-1.06L8.44 6.439A1.5 1.5 0 0 0 7.378 6H3.5Z"></path></svg></button><pre style="overflow-x: auto; white-space: pre; box-sizing: border-box; border-radius: 4px; max-width: 100%; margin: 0px; padding: 8px;"><code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 0px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">آ (U+0622) = ا (U+0627 ALEF) + ٓ (U+0653 MADDAH ABOVE)
</code></pre></div><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Both are in the tokenizer: ا at idx 1456, MADDAH ABOVE at idx 1457. The decomposition is applied as a text-level replacement before tokenization:<div class="codeBlockWrapper_-a7MRw" style="position: relative; margin: 8px 0px; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><button class="copyButton_CEmTFw copyButton_-a7MRw" title="Copy code" aria-label="Copy code to clipboard" style="color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; background: none 0% 0% / auto repeat scroll padding-box border-box rgb(30, 30, 30); border-color: rgba(204, 204, 204, 0.2); border-style: solid; border-width: 1px; border-image: none 100% / 1 / 0 stretch; cursor: pointer; opacity: 0; display: flex; border-radius: 4px; justify-content: center; align-items: center; padding: 4px; transition: opacity 0.15s, 
background 0.15s; position: absolute; top: 4px; right: 4px;"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 20 20" fill="currentColor" aria-hidden="true" data-slot="icon" class="copyIcon_CEmTFw"><path fill-rule="evenodd" d="M15.988 3.012A2.25 2.25 0 0 1 18 5.25v6.5A2.25 2.25 0 0 1 15.75 14H13.5v-3.379a3 3 0 0 0-.879-2.121l-3.12-3.121a3 3 0 0 0-1.402-.791 2.252 2.252 0 0 1 1.913-1.576A2.25 2.25 0 0 1 12.25 1h1.5a2.25 2.25 0 0 1 2.238 2.012ZM11.5 3.25a.75.75 0 0 1 .75-.75h1.5a.75.75 0 0 1 .75.75v.25h-3v-.25Z" clip-rule="evenodd"></path><path d="M3.5 6A1.5 1.5 0 0 0 2 7.5v9A1.5 1.5 0 0 0 3.5 18h7a1.5 1.5 0 0 0 1.5-1.5v-5.879a1.5 1.5 0 0 0-.44-1.06L8.44 6.439A1.5 1.5 0 0 0 7.378 6H3.5Z"></path></svg></button><pre style="overflow-x: auto; white-space: pre; box-sizing: border-box; border-radius: 4px; max-width: 100%; margin: 0px; padding: 8px;"><code class="language-python" style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 0px; border-radius: 3px; word-break: break-word; font-size: 0.9em;"># Escapes are used because the composed and decomposed forms render identically.
text = text.replace("\u0622", "\u0627\u0653")  # آ ALEF WITH MADDA ABOVE → ALEF + MADDAH ABOVE
text = text.replace("\u0623", "\u0627")  # أ ALEF WITH HAMZA ABOVE → ALEF (hamza dropped)
text = text.replace("\u0625", "\u0627")  # إ ALEF WITH HAMZA BELOW → ALEF (hamza dropped)
</code></pre></div><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">This works because the model was trained on diacritized Persian text and understands that ALEF + MADDAH ABOVE = long /ɒː/. The result has the correct vowel quality (matching the ا in صالح).<hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Suggested proper fix<ol style="padding-inline-start: 2em; color: 
rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li>Add آ, أ, إ to <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">mtl_tokenizer.json</code> with proper BPE merge rules trained on Persian text</li><li>The model's <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">text_emb</code> is already [2454, 1024] with trained embeddings at positions 2353, 2354, 2356 — these should be usable as-is</li><li>The merge table needs updating so the tokenizer can correctly tokenize words containing these characters as single coherent subwords rather than fragmenting them</li></ol><hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, 
BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Versions<ul style="padding-inline-start: 2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li>Model: <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">hootan09/ChatterBox</code> → <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">t3_fa.safetensors</code></li><li>Tokenizer: <code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">mtl_tokenizer.json</code> (2352 tokens)</li><li>chatterbox-tts: 0.1.7</li><li>Tested on: RTX 5060 Ti (16 GB), CUDA 12.x</li></ul><hr style="font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; 
font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom: 0.2em; color: rgb(204, 204, 204); font-family: -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(37, 37, 38); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Note: The same fix applies to the Thomcles mirror (<code style="font-family: monospace; color: rgb(215, 186, 125); background-color: rgba(255, 255, 255, 0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word; font-size: 0.9em;">Thomcles/Chatterbox-TTS-Persian-Farsi</code>) — the files are MD5-identical.
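
For anyone applying the decomposition workaround described in this issue, the following is a minimal sketch of it as a single preprocessing step. It writes the code points as escapes so that an editor or clipboard applying NFC cannot silently turn the آ replacement into a no-op. The names `ALEF_FALLBACKS` and `decompose_alefs` are hypothetical and are not part of chatterbox-tts.

```python
import unicodedata

ALEF = "\u0627"          # ARABIC LETTER ALEF
MADDAH_ABOVE = "\u0653"  # ARABIC MADDAH ABOVE (combining mark)

# Fallback spellings for the three characters missing from the tokenizer.
ALEF_FALLBACKS = {
    "\u0622": ALEF + MADDAH_ABOVE,  # ALEF WITH MADDA ABOVE -> ALEF + MADDAH ABOVE
    "\u0623": ALEF,                 # ALEF WITH HAMZA ABOVE -> bare ALEF
    "\u0625": ALEF,                 # ALEF WITH HAMZA BELOW -> bare ALEF
}

# The first mapping is exactly the canonical (NFD) decomposition of U+0622.
assert unicodedata.normalize("NFD", "\u0622") == ALEF + MADDAH_ABOVE

_TABLE = str.maketrans(ALEF_FALLBACKS)


def decompose_alefs(text):
    """Rewrite the unsupported alef variants; call this right before tokenization."""
    return text.translate(_TABLE)
```

Any later NFC normalization would re-compose ALEF + MADDAH ABOVE back into U+0622, so this step presumably needs to run last in the text pipeline. Plain NFD is not a drop-in substitute here, since it would also decompose أ and إ into ALEF plus a combining hamza, and the issue does not say whether those marks are in the tokenizer.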
</body>
</html>
