[POC] gc: permanently mark pkg/sysimage objects to speed up GC #61474
topolarity wants to merge 2 commits into master
I don't quite follow why we need a different remset. The core of the remset idea is that we keep track of the dynamic frontier between generations, so I don't see anything preventing us from using the current remset for the Or is the issue that during mark we need to keep track of whether the object is coming from an image? My idea would be that during a full GC you could look at the remset and scan all eternal objects. This would move us more cleanly to a three-generation GC, and of course the write barrier would need to enqueue eternal objects that see writes of child objects in the young or old generation.
Load image objects as permanently marked (`GC_OLD_MARKED`) so `gc_try_setmark_tag` returns 0 immediately and the mark phase never enters the image subgraph. This reduces full-sweep mark times by 10-25x for typical workloads with loaded packages.

Image objects live in separate mmap'd regions, are never freed, and are rarely mutated, making them ideal candidates for pretenuring.

A persistent `image_remset` (htable) tracks image objects that have been mutated to reference non-image (collectable) objects. These are discovered at image load time by `gc_scan_sysimg_remset` and added incrementally by the write barrier in `jl_gc_queue_root`.

After a full sweep (which clears per-thread remsets), `gc_queue_image_remset` pushes these entries to the mark queue so their children are properly traced. Quick sweeps don't need this because old objects retain their marks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
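To make the mechanism in the commit message concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions, not the code in this PR: every `toy_*` name is hypothetical, each object has a single outgoing reference, and the remset is a small fixed array rather than an htable. It shows the three pieces described above: image objects that start out marked so the marker stops at them, a write barrier that records image objects pointing at collectable objects, and a step that re-queues those recorded entries so their children still get traced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the toy_* names are hypothetical and are not Julia runtime APIs. */
enum { TOY_CLEAN = 0, TOY_MARKED = 1, TOY_OLD_MARKED = 3 };

typedef struct toy_obj {
    int bits;               /* mark state */
    bool in_image;          /* true if the object lives in an image region */
    struct toy_obj *child;  /* single outgoing reference, for brevity */
} toy_obj;

#define TOY_REMSET_CAP 16
static toy_obj *toy_image_remset[TOY_REMSET_CAP];
static size_t toy_image_remset_len = 0;
static size_t toy_visited = 0;  /* number of objects the marker had to scan */

/* Returns 1 if the caller must scan the object, 0 if it was already marked. */
static int toy_try_setmark(toy_obj *o) {
    if (o->bits & TOY_MARKED)
        return 0;
    o->bits |= TOY_MARKED;
    return 1;
}

/* Follow references until we hit NULL or an already-marked object. */
static void toy_mark(toy_obj *o) {
    while (o != NULL && toy_try_setmark(o)) {
        toy_visited++;
        o = o->child;
    }
}

/* Write barrier: remember image objects that now reference collectable objects. */
static void toy_write_field(toy_obj *parent, toy_obj *child) {
    parent->child = child;
    if (!parent->in_image || child == NULL || child->in_image)
        return;
    for (size_t i = 0; i < toy_image_remset_len; i++)
        if (toy_image_remset[i] == parent)
            return;  /* already recorded */
    if (toy_image_remset_len < TOY_REMSET_CAP)
        toy_image_remset[toy_image_remset_len++] = parent;
}

/* Trace the children of every remembered image object. */
static void toy_queue_image_remset(void) {
    for (size_t i = 0; i < toy_image_remset_len; i++)
        toy_mark(toy_image_remset[i]->child);
}

/* Returns 1 if the toy model behaves as described, 0 otherwise. */
static int toy_demo(void) {
    toy_image_remset_len = 0;
    toy_visited = 0;
    toy_obj img_b = { TOY_OLD_MARKED, true, NULL };
    toy_obj img_a = { TOY_OLD_MARKED, true, &img_b };
    toy_obj heap_obj = { TOY_CLEAN, false, NULL };

    toy_mark(&img_a);                    /* stops at once: image is pre-marked */
    if (toy_visited != 0)
        return 0;
    toy_write_field(&img_b, &heap_obj);  /* image object now points into the heap */
    toy_write_field(&img_b, &heap_obj);  /* duplicate write is recorded only once */
    toy_queue_image_remset();            /* heap_obj is reached via the remset */

    int ok = toy_image_remset_len == 1 && toy_visited == 1
             && (heap_obj.bits & TOY_MARKED) != 0;
    toy_image_remset_len = 0;            /* drop pointers to locals before returning */
    return ok;
}
```

In this sketch, `toy_demo()` returns 1: marking from the image root scans nothing, and the only object the marker visits is the heap object reached through the remembered image entry.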
Each edge is labelled by the Module / package the image was generated for, which should hopefully make it easy to investigate.
You're saying to compact the existing remset instead of clearing it? Sounds potentially expensive, but we could do that. IIUC multi-generation GCs usually do separate the remsets between generations, since it's unnecessary to scan through the remsets collected for younger generations. They just often distribute the remset across regions so that each region is only in one remset at a time. Or am I missing something?
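For reference, "compacting instead of clearing" could look roughly like the sketch below. This is a hypothetical illustration, not code from the runtime: `toy_entry` and `toy_compact_remset` are made-up names, and the remset is modelled as a flat array. The point is only that the pass is linear in the size of the whole remset, including entries for ordinary heap objects that would otherwise simply be dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical remset entry: only the fields needed for the illustration. */
typedef struct {
    bool in_image;  /* true if the remembered object lives in an image */
    int id;         /* stand-in for the object pointer */
} toy_entry;

/* Keep only image-object entries, in place and in order; returns the new length.
 * Every entry is inspected once, so the cost is O(len). */
static size_t toy_compact_remset(toy_entry *remset, size_t len) {
    size_t kept = 0;
    for (size_t i = 0; i < len; i++) {
        if (remset[i].in_image)
            remset[kept++] = remset[i];
    }
    return kept;
}
```

A separate persistent remset avoids this pass entirely, at the cost of the write barrier having to decide which set an entry belongs to.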
Image objects are already never freed and are ~somewhat rarely mutated, making them candidates for "pretenuring".
Load image objects as permanently marked (`GC_OLD_MARKED`) so `gc_try_setmark_tag` returns 0 immediately and the mark phase never enters the image subgraph. Maintain any mutations as a dedicated `image_remset` which effectively roots any "new" referents that are referred to by the image subgraph.

This significantly speeds up (full) GC pause times when dominated by immutable objects in the sysimage / pkgimage heaps:
Partial / "quick" GC pause times are essentially unchanged.

Benchmark script
Co-authored-by: Claude Opus 4.6 noreply@anthropic.com 🤖
This commit is almost entirely written by Claude after laying out a plan together.
Draft because I think the sweep in `staticdata.c` is probably unnecessary. I'm hoping to remove that before this is ready for review. (edit: done)

This can likely also be generalized slightly to a "permalloc / pretenure" operation that applies to objects not necessarily in images. I'm not sure whether it'd be possible to "unfreeze" those objects once they are promoted to permalloc'd / pretenured though.