Draft: experimental UTF-8 support #2417

Draft
Kondrashka177 wants to merge 4 commits into cc-tweaked:mc-1.20.x from Kondrashka177:upstream/utf8-poc

@Kondrashka177

Summary

This draft PR is an experiment around UTF-8 support in CC: Tweaked.

It improves several common text-handling paths so that non-ASCII text behaves more naturally in normal terminal-based workflows. I am opening this as a draft because this is not a finished solution, and it is not fully backwards-compatible.

What currently works

This branch improves UTF-8 behaviour in several user-facing places, including:

  • terminal output
  • term.write / term.blit
  • window
  • read()
  • edit
  • monitor rendering
  • pretty/formatted output
  • pastebin get/put

Known problems

This approach breaks compatibility with parts of CC's legacy byte-based text model.

In particular:

  • term.blit can now fail on inputs that previously worked: if the text decodes to fewer Unicode code points than it has bytes, its length no longer matches the colour strings
  • old drawing/charset bytes may render differently when interpreted as UTF-8
  • legacy non-UTF-8 files may still behave inconsistently in some places
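
To make the term.blit problem concrete, here is a minimal Python sketch of the length mismatch. It is not code from this branch: `blit_lengths_match` is a hypothetical helper, and treating invalid bytes with `errors="replace"` is an assumption made only for illustration. The only behaviour taken as given is that term.blit requires its text and colour arguments to have the same length.

```python
def blit_lengths_match(text: bytes, fg: str, bg: str, decode_utf8: bool) -> bool:
    """Hypothetical stand-in for term.blit's equal-length check."""
    if decode_utf8:
        # Assumed UTF-8 behaviour: measure the text in code points.
        # errors="replace" is an illustrative choice, not the branch's policy.
        length = len(text.decode("utf-8", errors="replace"))
    else:
        # Legacy byte-based model: one byte is one cell.
        length = len(text)
    return length == len(fg) == len(bg)


# Two bytes that a legacy program may have drawn as two separate cells.
# They also happen to be the valid UTF-8 encoding of a single "é".
legacy_text = b"\xc3\xa9"

print(blit_lengths_match(legacy_text, "00", "ff", decode_utf8=False))  # True
print(blit_lengths_match(legacy_text, "00", "ff", decode_utf8=True))   # False
```

The same two bytes pass the check in the byte-based model and fail it once they are decoded, because the colour strings were written for two cells and the decoded text occupies one.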

So while this makes UTF-8 text work better in many common cases, it is not a drop-in replacement for the current behaviour.

Limitations

This is not full Unicode support.

It does not properly handle things like:

  • grapheme clusters
  • ZWJ sequences
  • flags
  • emoji in the general case

The model is still close to 1 code point = 1 cell.
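
As a rough illustration of why a per-code-point model falls short, the Python sketch below counts code points for a few strings that a user would perceive as a single character. `cells_under_codepoint_model` is a hypothetical name and does not come from this branch; it simply assumes every code point takes one terminal cell.

```python
def cells_under_codepoint_model(s: str) -> int:
    """Cells used if every code point occupies exactly one cell (assumed model)."""
    return len(s)  # Python's len() on str counts code points


samples = {
    # "e" followed by a combining acute accent: one visible character
    "combining accent": "e\u0301",
    # man + ZWJ + woman + ZWJ + girl: one family emoji
    "ZWJ sequence": "\U0001F468\u200D\U0001F469\u200D\U0001F467",
    # two regional indicator symbols: one flag
    "flag": "\U0001F1FA\U0001F1E6",
}

for name, text in samples.items():
    print(name, cells_under_codepoint_model(text))
# combining accent 2
# ZWJ sequence 5
# flag 2
```

Each sample is one grapheme cluster, yet the code-point model would spread it over two or five cells. Handling these correctly would need grapheme segmentation and width data, which is outside what this draft attempts.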

Why I'm opening this

I do not expect this to be merged as-is. I am opening it as a concrete prototype so the tradeoffs are easier to discuss with real code and test results.

I still hope UTF-8 support in some form remains on the table, even if this specific approach is not suitable for upstream.

Wojbie

This comment was marked as resolved.

@zyxkad
Contributor

zyxkad commented Apr 18, 2026

@Kondrashka177 you know you don't have to reply to every single review comment Wojbie made, right? The replies don't add any useful information, they just bomb our inboxes.

@Kondrashka177
Author

Sorry:(

3 participants