Skip to content

quality pass: bugfixes, hot-path performance, and consolidation#104

Merged
raulk merged 22 commits into
mainfrom
refactor/tier1
Jun 9, 2026
Merged

quality pass: bugfixes, hot-path performance, and consolidation#104
raulk merged 22 commits into
mainfrom
refactor/tier1

Conversation

@raulk

@raulk raulk commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

A behavior-preserving tier-1 quality pass over the host, SDK, CLI, and providers, plus a few real bugfixes and hot-path performance work.

Bugfixes

  • Negative-lookup cache no longer grows without bound — expired "not found" entries are now swept with the tombstone GC instead of lingering until their path is re-read or invalidated.
  • arXiv provider no longer panics on a JSON encode failure — pretty-JSON serialization is now one shared SDK helper with a single non-panicking policy.
  • db provider uses hashbrown for predictable map behavior on wasm32-wasip2 (was the lone holdout on std).

Hot-path performance

  • read_file: one canonical-object lookup per call instead of two, reusing a single byte buffer.
  • Object-cache writes: one redb transaction per multi-object projection instead of one fsync per object.
  • Object leaf-set merge and non-exhaustive directory-listing merge are no longer quadratic.
  • Observable results are unchanged.

Consolidation

  • Route-matching machinery moved from omnifs-core into its only consumer, the SDK.
  • Op lifecycle now has one home; container operations route through a single Docker seam.
  • dev/up share one launch choreography that also binds providers' required host paths (e.g. the db provider's SQLite directory) into the container.
  • Per-provider boilerplate (callout extraction, inline projection, header building, page cursors, pretty-JSON) collapsed into shared SDK helpers.
  • Every unsafe block now carries a SAFETY comment.

raulk added 20 commits June 9, 2026 18:58
…unch

Carries the in-flight working-tree changes onto the refactor branch so
the tier-1 refactor commits stay isolated. populate() now returns the
preopen bind strings and dev/up feed them into ContainerExtras.
read_file ran cached_canonical_for twice per call (two redb read
transactions, two decodes, two byte-buffer allocations) and cloned the
canonical bytes into every coalescing closure. One lookup now feeds both
uses and the executed branch reuses one buffer.
Pattern, CaptureLocation, and their support code were consumed only by
the SDK router yet lived in omnifs-core, so every matcher edit rebuilt
host, fuse, cli, and all providers. Route matching is SDK-side knowledge
per the byte boundary; it now lives beside its only consumer. Path,
Segment, and ParseError stay in core.
logs shelled out to the docker CLI and doctor hand-rolled its own
bollard client while Runtime already wrapped the API; all container
verbs now go through Runtime.

Container body construction is split into a pure ContainerLaunchSpec
assembled without bollard types, so launch wiring is unit-testable
without Docker.
The inspector was reachable only through a global, so anything touching
trace state needed the full harness, and the redaction rules that keep
tokens out of logs had no direct tests. Runtime now carries an injected
sink (production wiring unchanged, defaulting to the configured global)
and redaction.rs tests its scrubbing contract directly.
dev and up each hand-wrote the prepare, populate, connect, launch, wait,
verify, disarm sequence, and the rule that cleanup must disarm only
after verify_status succeeded was positional. launch_session() now owns
the ordering and the cleanup guard; the commands supply only their mount
source and container extras.
materialize_preopened_paths never used self and assigned a fresh clone
where clone_from reuses the allocation.
runtime.rs mixed WASI plumbing, the op suspend/resume loop, and
instance lifecycle, and effect application was invoked from two
modules. The WasiView impl moves to its own module and op_lifecycle
now owns run_op, finish_provider_return, and the Materializer call,
so a read-file round trip is traceable through one seam.
Provider-internal hashed containers use hashbrown for predictable
behavior on wasm32-wasip2; db was the one holdout on std.
docker, db, and arxiv each hand-rolled serialize-pretty-plus-newline
with three different failure policies; arxiv's panicked the provider on
encode failure. One SDK helper now owns the behavior.
blob, git, archive, and http each hand-wrote the same unwrap-variant,
map-error, reject-mismatch closure with drifting messages. One generic
expect_callout helper owns the shape; each kind supplies its picker.
github and linear each carried an identical inline_projection helper
assembling FileProjection from FileContent.
FileProjection::from_content now owns the assembly invariants.
http::Request and endpoint::RequestBuilder kept two header
representations and two WIT conversion paths for the same job. The
endpoint builder now stores HeaderMap and reuses the WitHeaders
conversion; builder signatures and error semantics are unchanged.
Every paged handler opened with the same match-and-default unwrap of
Cursor::Page, with per-provider default conventions easy to copy wrong.
DirCx::page_cursor makes the default explicit at the call site.
merge_leaves scanned the merged vec per incoming leaf, going quadratic
on objects with large alias sets; every canonical store and index-only
upsert paid it.
Each canonical-store effect opened its own redb write transaction, so a
projection materializing N objects paid N fsynced commits. Effect
application now stages the batch and commits once; per-entry fence
checks and prior-leaf view evictions are unchanged.
Tombstones had a GC but negative entries lingered until their exact
path was read again or invalidated, growing without bound on
long-running mounts with many failed lookups.
DirentsPayload::merged cloned every existing entry name into a fresh
HashMap and reallocated the full entry vec on every non-exhaustive
merge, costing O(directory) per page. The merge now replaces matched
entries in place and appends introductions, with no name cloning and a
deterministic order (existing first, introductions in name order).
Three render paths cloned full record vectors (with their String fields)
just to sort and iterate; sorting a vec of references avoids the copies.

The wider clone-churn sweep planned here was dropped on evidence: the
status DTOs are deserialized back from container output in
runtime.rs so they must stay owned, and the remaining cache/listing
clones are ownership-required (Arc construction, by-value APIs).
Eight unsafe sites (libc uid/gid calls and test-only env mutation)
carried no stated invariants.
@raulk raulk changed the title codebase quality improvements Tier-1 quality pass: bugfixes, hot-path performance, and consolidation Jun 9, 2026
@raulk raulk changed the title Tier-1 quality pass: bugfixes, hot-path performance, and consolidation quality pass: bugfixes, hot-path performance, and consolidation Jun 9, 2026
@raulk raulk force-pushed the refactor/tier1 branch 2 times, most recently from d5d55ff to 6bdfd83 Compare June 9, 2026 22:57
raulk added 2 commits June 10, 2026 00:05
doc_markdown backticks, cloned_ref_to_slice_refs, and unnecessary_map_or in the object-batch and negative-sweep tests.
@raulk raulk force-pushed the refactor/tier1 branch from f831804 to 8a353f9 Compare June 9, 2026 23:05
@raulk raulk merged commit d58a494 into main Jun 9, 2026
12 checks passed
@raulk raulk deleted the refactor/tier1 branch June 9, 2026 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant