perf: rotate line slices in Insert/DeleteLineArea full-width fast path by LXXero · Pull Request #119 · charmbracelet/ultraviolet

LXXero · 2026-06-06T11:07:01Z

Problem

Buffer.DeleteLineArea (the screen-scroll path, hit by every newline at the bottom of the scroll region) shifts cells one struct assignment at a time: O(rows×cols) Cell copies per scrolled line. In a perf profile (Linux) of a real terminal emulator (xerotty) flooding seq 1 2000000 into an 80×24 screen, this single function plus the memmoves it generates accounted for ~80% of total process CPU. At ~100 bytes per Cell, that is hundreds of gigabytes moved to scroll two million lines.
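The "hundreds of gigabytes" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes each one-line scroll copies the (rows-1) surviving rows and takes the ~100-byte Cell size from the description as given; `bytesMoved` is a hypothetical helper for illustration, not part of ultraviolet.

```go
package main

import "fmt"

// bytesMoved estimates how many bytes are copied when a rows×cols screen is
// scrolled `scrolls` times by shifting every surviving cell up one row.
// cellSize is the assumed in-memory size of one cell, in bytes.
func bytesMoved(rows, cols, cellSize, scrolls int64) int64 {
	// Each one-line scroll copies the (rows-1) rows that stay on screen.
	return (rows - 1) * cols * cellSize * scrolls
}

func main() {
	// 80×24 screen, ~100-byte cells, two million scrolled lines.
	total := bytesMoved(24, 80, 100, 2_000_000)
	fmt.Printf("%.0f GB\n", float64(total)/1e9) // prints "368 GB"
}
```

Under these assumptions the estimate is about 368 GB, which is consistent with the order of magnitude quoted above.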

Change

- Full-width areas (the overwhelmingly common case: whole-screen scroll regions): rotate the line slice headers instead of copying cells. That is O(rows) slice-header moves, with the displaced lines' storage recycled as the cleared lines. Same trick for InsertLineArea (scroll down).
- Cleared lines fill by direct assignment. Line.Set's wide-cell repair has nothing to repair when every cell in the line is replaced.
- Partial-width areas (DECSLRM margins): keep the shift semantics but use copy() per row instead of per-cell assignment.
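A minimal sketch of the two paths described above, for the delete (scroll up) direction. `Cell` and `Line` here are simplified stand-ins rather than ultraviolet's real types, and `deleteLinesFullWidth`, `deleteLinesPartial`, `text`, and `render` are hypothetical names chosen for illustration. The rotation is done with three reversals, which is one allocation-free way to rotate; the PR may do it differently. The partial-width path ignores wide cells that straddle a margin. Requires Go 1.21+ for `slices.Reverse` and the `min` builtin.

```go
package main

import (
	"fmt"
	"slices"
)

// Cell and Line are simplified stand-ins, not ultraviolet's real types.
type Cell struct {
	Content string
	Width   int
}

type Line []Cell

// deleteLinesFullWidth removes n lines starting at row top inside the region
// [top, bottom), moving the lines below them up and leaving n blank lines at
// the bottom. Only slice headers move; no cell is copied between rows.
func deleteLinesFullWidth(lines []Line, top, bottom, n int, blank Cell) {
	if n <= 0 || top < 0 || bottom > len(lines) || top >= bottom {
		return
	}
	n = min(n, bottom-top)
	region := lines[top:bottom]
	// Rotate the headers left by n using three reversals.
	slices.Reverse(region[:n])
	slices.Reverse(region[n:])
	slices.Reverse(region)
	// The displaced lines now sit at the bottom; reuse their storage and
	// overwrite every cell directly.
	for _, ln := range region[len(region)-n:] {
		for x := range ln {
			ln[x] = blank
		}
	}
}

// deleteLinesPartial does the same for columns [left, right) only. Rows must
// keep their cells outside the margins, so cells are shifted with one copy()
// per row. Assumes 0 <= left <= right <= len(line) for every line.
func deleteLinesPartial(lines []Line, top, bottom, left, right, n int, blank Cell) {
	if n <= 0 || top < 0 || bottom > len(lines) || top >= bottom {
		return
	}
	n = min(n, bottom-top)
	for y := top; y < bottom-n; y++ {
		copy(lines[y][left:right], lines[y+n][left:right])
	}
	for y := bottom - n; y < bottom; y++ {
		for x := left; x < right; x++ {
			lines[y][x] = blank
		}
	}
}

// text builds a Line with one single-width cell per rune.
func text(s string) Line {
	ln := make(Line, 0, len(s))
	for _, r := range s {
		ln = append(ln, Cell{Content: string(r), Width: 1})
	}
	return ln
}

// render concatenates the cell contents of a line.
func render(ln Line) string {
	s := ""
	for _, c := range ln {
		s += c.Content
	}
	return s
}

func main() {
	blank := Cell{Content: " ", Width: 1}
	lines := []Line{text("aaaa"), text("bbbb"), text("cccc"), text("dddd")}
	deleteLinesFullWidth(lines, 0, len(lines), 1, blank)
	for _, ln := range lines {
		fmt.Printf("%q\n", render(ln))
	}
}
```

The insert (scroll down) direction would rotate the same region right by n and clear the first n lines instead of the last n.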

Results

End-to-end seq 1 2000000 wall time in xerotty: 31s → 13.5s from this change alone (→ ~6–8s combined with a scrollback ring buffer in x/vt, PR incoming there).

Existing tests pass; behavior is identical — lines land in the same places with the same contents, only the storage shuffling changed.

Scrolling (DeleteLineArea at the top of the scroll region) shifted cells one by one: O(rows*cols) Cell struct copies per scrolled line. During bulk output every newline pays this, and it profiled at ~80% of total process CPU in a real terminal emulator (xerotty) flooding 2M lines.

Full-width areas (the common case: whole-screen scroll regions) now rotate the line slice headers instead: O(rows) slice-header moves plus clearing the n recycled lines, reusing their storage. Cleared lines are filled by direct assignment; Line.Set's wide-cell repair has nothing to repair when every cell is replaced. Partial-width areas keep the shift but use copy() per row instead of per-cell assignment.

Measured end-to-end (seq 1 2000000 into an 80x24 emulator): 31s -> 13.5s wall just from this change; combined with a scrollback ring buffer in x/vt it reaches ~6-8s.

seq 1 2000000 into an 80x24 window took ~31s, 15-25x behind ghostty (~1.5s) and xfce4-terminal (~2.7s). perf blamed upstream, layer by layer:

- ultraviolet Buffer.DeleteLineArea shifted cells one struct assignment at a time on every scroll (~80% of CPU)
- vt Scrollback.Push evicted via slices.Delete, an O(10k) memmove per line of output at the default cap (~47% of the remainder)
- the trailing-blank trim paid interface-unwrapping color compares per cell, and whole-line clears ran wide-cell repair per cell

Fixes are upstream PRs (charmbracelet/ultraviolet#119, charmbracelet/x#888; see also #887). go.mod pins both modules to the LXXero fork commits carrying them until they merge, then the replaces drop.

Result: ~6-10s wall, 3-5x faster, within ~3x of the purpose-built terminals from ~20x behind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
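The Scrollback.Push cost mentioned in that commit message comes from evicting the oldest entry by shifting every remaining element. A ring buffer avoids the shift by overwriting the oldest slot in place. The sketch below is a generic illustration of that idea only; `ring`, `newRing`, `Push`, `At`, and `Len` are hypothetical names, not the x/vt Scrollback API or the code in charmbracelet/x#888.

```go
package main

import "fmt"

// ring is a hypothetical fixed-capacity scrollback store; it is not the
// x/vt Scrollback type.
type ring[T any] struct {
	buf   []T
	head  int // index of the oldest element
	count int // number of stored elements
}

func newRing[T any](capacity int) *ring[T] {
	return &ring[T]{buf: make([]T, capacity)}
}

// Push appends v. When the ring is full it overwrites the oldest entry and
// advances head, so eviction is O(1) rather than a memmove of the whole slice.
func (r *ring[T]) Push(v T) {
	if len(r.buf) == 0 {
		return
	}
	if r.count < len(r.buf) {
		r.buf[(r.head+r.count)%len(r.buf)] = v
		r.count++
		return
	}
	r.buf[r.head] = v
	r.head = (r.head + 1) % len(r.buf)
}

// Len reports how many elements are stored.
func (r *ring[T]) Len() int { return r.count }

// At returns the i-th oldest element (0 is the oldest).
// The caller must ensure 0 <= i < r.Len().
func (r *ring[T]) At(i int) T {
	return r.buf[(r.head+i)%len(r.buf)]
}

func main() {
	r := newRing[int](3)
	for i := 1; i <= 5; i++ {
		r.Push(i)
	}
	for i := 0; i < r.Len(); i++ {
		fmt.Println(r.At(i)) // prints 3, 4, 5 on separate lines
	}
}
```

The trade-off is that logical order no longer matches storage order, so every read goes through the index arithmetic in At.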

LXXero mentioned this pull request Jun 6, 2026

perf(vt): ring-buffer Scrollback eviction + cheap trailing-blank scan charmbracelet/x#888

Open

LXXero wants to merge 1 commit into charmbracelet:main from LXXero:scroll-fastpath
