Draft: experimental UTF-8 support by Kondrashka177 · Pull Request #2417 · cc-tweaked/CC-Tweaked

Kondrashka177 · 2026-04-18T00:34:52Z

Summary

This draft PR is an experiment around UTF-8 support in CC: Tweaked.

It improves several common text-handling paths so that non-ASCII text behaves more naturally in normal terminal-based workflows. I am opening this as a draft because this is not a finished solution, and it is not fully backwards-compatible.

What currently works

This branch improves UTF-8 behaviour in several user-facing places, including:

terminal output
term.write / term.blit
window
read()
edit
monitor rendering
pretty/formatted output
pastebin get/put

Known problems

This approach breaks compatibility with parts of CC's legacy byte-based text model.

In particular:

term.blit can now fail on inputs that previously worked if a byte sequence decodes to fewer Unicode code points than its original byte length
old drawing/charset bytes may render differently when interpreted as UTF-8
legacy non-UTF-8 files may still behave inconsistently in some places

So while this makes UTF-8 text work better in many common cases, it is not a drop-in replacement for the current behaviour.
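The term.blit breakage listed above can be illustrated with a small model. This is only a sketch in Python, not code from this branch: `blit_lengths_match` is a hypothetical helper, and it assumes the new behaviour counts decoded code points as cells while the legacy behaviour counts bytes. It relies on term.blit requiring its text and colour strings to have equal length.

```python
def blit_lengths_match(text_bytes: bytes, fg: str, bg: str, decode_utf8: bool) -> bool:
    """Hypothetical model of a blit-style equal-length check.

    decode_utf8=False models the legacy rule: one byte is one cell.
    decode_utf8=True models the assumed new rule: one code point is one cell.
    (Invalid UTF-8 would raise here; the real branch may handle it differently.)
    """
    if decode_utf8:
        cells = len(text_bytes.decode("utf-8"))
    else:
        cells = len(text_bytes)
    return cells == len(fg) == len(bg)


# A caller written against the byte model sizes its colour strings by byte length.
text = "привет".encode("utf-8")   # 6 code points, 12 bytes
fg = "0" * len(text)              # 12 colour characters
bg = "f" * len(text)              # 12 colour characters

legacy_ok = blit_lengths_match(text, fg, bg, decode_utf8=False)
utf8_ok = blit_lengths_match(text, fg, bg, decode_utf8=True)
```

Under these assumptions the same call passes the length check in the byte model and fails it in the code point model, because the text shrinks from 12 cells to 6 while the colour strings stay at 12.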

Limitations

This is not full Unicode support.

It does not properly handle things like:

grapheme clusters
ZWJ sequences
flags
emoji in the general case

The model is still essentially 1 code point = 1 cell.
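To show why a per-code-point model falls short for the cases listed above, here is a small Python sketch (not code from this branch). Each sample string is something a user would perceive as a single character, yet it consists of several code points, so a model that gives each code point its own cell would spread it over several cells. The sample names are made up for illustration.

```python
# Each value is one user-perceived character built from several code points.
samples = {
    # "e" followed by a combining acute accent
    "combining_accent": "e\u0301",
    # Two regional indicator symbols (D + E) forming one flag
    "flag": "\U0001F1E9\U0001F1EA",
    # man + ZWJ + woman + ZWJ + girl, forming one family emoji
    "zwj_family": "\U0001F468\u200D\U0001F469\u200D\U0001F467",
}

# Python's len() on str counts code points, which matches the
# "1 code point = 1 cell" model described in the text.
cells = {name: len(s) for name, s in samples.items()}
```

Handling these correctly would need grapheme cluster segmentation and display-width rules, which the description says this branch does not attempt.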

Why I'm opening this

I do not expect this to be merged as-is. I am opening it as a concrete prototype so the tradeoffs are easier to discuss with real code and test results.

I still hope UTF-8 support in some form remains on the table, even if this specific approach is not suitable for upstream.

zyxkad · 2026-04-18T21:28:05Z

@Kondrashka177 yk you don't have to comment on every single review wojbie made, right? Those comments don't give any useful information, they just bomb our emails.

Kondrashka177 · 2026-04-18T21:33:33Z

Sorry:(

Kondrashka177 added 3 commits April 18, 2026 01:26

backup

bd24114

invalid utf8 fix

7557c40

pastebint utf

5cf6cda

Remove unrelated reverts and leftover test changes

46024d1

Kondrashka177 wants to merge 4 commits into cc-tweaked:mc-1.20.x from Kondrashka177:upstream/utf8-poc

3 participants
