feat(nix): QUIC-enabled Node 26 toolchain (node:quic runtime) #1372
srid wants to merge 5 commits into
Migrate the whole repo from Node 24 to a Node 26 built with
--experimental-quic, so `require("node:quic")` is available everywhere
`pkgs.nodejs` flows. This is the day-one runtime for kaval's roaming
remote transport — QUIC connection migration carries an attached session
across a network change with no reconnect (docs/atlas kaval-vs-zmosh).
- npins: bump nixpkgs-unstable to a rev carrying nodejs_26 (26.3.0).
- overlay: rebuild nodejs-slim_26 (the real compile — the full `nodejs`
is a thin join over it) with --experimental-quic, and un-share openssl
+ ngtcp2 + nghttp3 so Node builds its bundled QUIC stack. Node only
wires the vendored ngtcp2/nghttp3 — the copies whose internal headers
src/quic/*.cc needs — under node_use_quic && node_shared_openssl=="false"
(node.gypi), and nixpkgs' shared nghttp3 ships public headers only, so
the QUIC stack must be vendored on a bundled openssl 3.5. nghttp2
(HTTP/2) and the other shared libs are untouched. `nodejs` is aliased
to the QUIC-enabled full Node 26 — one swap migrates devShell, the
kaval/agent closures, pnpm-typecheck, and the website.
- flake: a `node-quic` check asserting require("node:quic") loads under
--experimental-quic, realized on every platform by the nix/flake-check
CI nodes.
Refs the kaval-vs-zmosh atlas note (P1 runtime prerequisite).
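As a rough illustration of what the `node-quic` check is described as asserting, a probe along these lines could be run under the rebuilt Node. This is a sketch, not the check from flake.nix (which this text does not show); the function name `probeQuic` and the shape of its result are made up for illustration.

```javascript
// Hypothetical smoke probe; NOT the actual flake check from this PR.
function probeQuic() {
  let loaded = false;
  let error = null;
  try {
    // Per the PR, this only loads when Node was compiled with QUIC support
    // and started with --experimental-quic; otherwise require() throws.
    require("node:quic");
    loaded = true;
  } catch (err) {
    error = err.code || err.message;
  }
  // The PR description says the check also asserts this feature flag.
  const featureFlag = process.features.quic === true;
  return { loaded, featureFlag, error };
}

console.log(JSON.stringify(probeQuic()));
```

Started with `--experimental-quic` on the QUIC-enabled build, `loaded` and `featureFlag` are the two values the check is said to require; on a Node without the module, the `require` throws and `error` carries the reason.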
Evidence
This is a build-toolchain/runtime change (no on-screen surface), so the behavior worth proving is the runtime capability itself: that
The check runs
Whole-repo migration is clean:
Compile cost (the one-time from-source build, since it's not cacheable upstream): 13 min 31 s on a 32-core box, covering full fetch + bundled openssl + V8 + ngtcp2/nghttp3 + link + Node's own test suite + the smoke check. See the PR's "User impact" section for the laptop estimate and the cache-push follow-up.
The nixpkgs bump for Node 26 also removed the `nodePackages` set, which
throws on eval ("nodePackages has been removed"). shell.nix referenced
`nodePackages.node-gyp`; it's now `pkgs.node-gyp` (12.3.0) at the top
level. Only the devShell eval (ci::flake-check / install / atlas-sync)
hit this — the `.#default`/`.#checks` builds don't evaluate shell.nix.
…ht-driver
The nixpkgs bump (for Node 26) moved `playwright-driver` 1.57 -> 1.60, which ships a newer chromium revision. The npm `playwright` pin stayed at 1.57, so the e2e browser launch looked for chromium_headless_shell-1200 at PLAYWRIGHT_BROWSERS_PATH (the nix driver) and failed. Bumping the npm side to 1.60 realigns it. playwright lives in packages/tests, which is excluded from the pnpmDeps fileset (default.nix), so the FOD hash is unchanged; the e2e's own install picks up 1.60.
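The mismatch described above comes down to the npm `playwright` version and the nix `playwright-driver` version disagreeing on major.minor. A minimal sketch of such a comparison follows; the helper name `sameMinor` is hypothetical and no such guard is stated to exist in the repo.

```javascript
// Hypothetical helper: true when two version strings share major.minor.
function sameMinor(a, b) {
  const [aMajor, aMinor] = a.split(".").map(Number);
  const [bMajor, bMinor] = b.split(".").map(Number);
  return aMajor === bMajor && aMinor === bMinor;
}

// Versions taken from the commit message: npm pin vs nix driver.
console.log(sameMinor("1.57", "1.60")); // false: the state that broke e2e
console.log(sameMinor("1.60", "1.60")); // true: after bumping the npm side
```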
… UDP)
With --experimental-quic compiled in, Node's check phase runs
parallel/test-quic-* — experimental tests that need UDP/loopback the
macOS Nix sandbox forbids, so they fail on aarch64-darwin (e.g.
test-quic-h3-zero-rtt) even though require('node:quic') works (the
node-quic flake check proves it). They pass in the Linux sandbox, so
doCheck stays on for Linux (full packaging validation, and the Linux
node's hash is unchanged) and is disabled only on Darwin.
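To make the "needs UDP/loopback" requirement concrete, here is a small sketch of a loopback probe built on `node:dgram`. It is an assumption that a probe like this reflects what the sandbox blocks; the function name `probeUdpLoopback` and the timeout are illustrative, and this is not part of Node's test suite or of this PR.

```javascript
const dgram = require("node:dgram");

// Hypothetical probe: resolves true if a UDP datagram sent to 127.0.0.1
// is received back by the same socket, false on any error or timeout.
function probeUdpLoopback(timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = dgram.createSocket("udp4");
    let settled = false;
    const finish = (ok) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      try {
        socket.close();
      } catch (_) {
        // The socket may never have started; nothing to clean up.
      }
      resolve(ok);
    };
    const timer = setTimeout(() => finish(false), timeoutMs);
    socket.once("error", () => finish(false));
    socket.once("message", () => finish(true));
    // Port 0 asks the OS for any free port on the loopback interface.
    socket.bind(0, "127.0.0.1", () => {
      const { port } = socket.address();
      socket.send(Buffer.from("ping"), port, "127.0.0.1", (err) => {
        if (err) finish(false);
      });
    });
  });
}

probeUdpLoopback().then((ok) => console.log("udp loopback usable:", ok));
```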
✅ Green on both platforms
CI passes on x86_64-linux AND aarch64-darwin (28/28 contexts + SentinelOne) at
Darwin needed two extra fixes beyond linux (now in the PR):
Ready for review.
What
Migrate the whole repo from Node 24 to a Node 26 built with `--experimental-quic`, so `require("node:quic")` (Node's built-in QUIC) is available everywhere `pkgs.nodejs` flows (devShell, the kaval/agent closures, pnpm-typecheck, the website). One overlay swap; no app code changes.
This is the day-one runtime prerequisite for kaval's roaming remote transport: QUIC connection migration lets an attached `kaval-tui --host` session survive a network change (Wi-Fi↔cellular, sleep/wake) with no reconnect. It is the P1 foundation from the kaval-vs-zmosh Atlas note. No transport code lands here, only the Node that can speak it.
The change (nix-only)
- `npins/sources.json`: bump nixpkgs-unstable to a rev carrying `nodejs_26` (26.3.0) + openssl 3.6.2.
- `nix/overlay.nix`: rebuild `nodejs-slim_26` (the real compile; the full `nodejs` is a thin join over its `out` + npm, so the flag must land on the slim and propagates up via the fixpoint) with `--experimental-quic`, and un-share openssl + ngtcp2 + nghttp3 so Node compiles its bundled QUIC stack (see "Why isn't this just in nixpkgs?" below for why that's required). `nodejs` is aliased to the QUIC-enabled full Node 26. Also pin `pnpm` to v10: the nixpkgs bump moved the default pnpm to 11, which stopped reading `package.json`'s `pnpm` field and breaks our frozen install (`ERR_PNPM_LOCKFILE_CONFIG_MISMATCH`); a pnpm-11 config migration is its own change.
- `flake.nix`: a `node-quic` check that asserts `require("node:quic")` loads under `--experimental-quic` and `process.features.quic === true`, realized on every platform by the existing `nix/flake-check` CI nodes.
✅ Verification (built off-machine on real Linux hosts)
- `.#checks.x86_64-linux.node-quic`: green.
- `.#default` + `.#checks.x86_64-linux.typecheck`: green. node-pty recompiles clean against Node 26 (it's Node-API, ABI-stable across majors), pnpm install clean, typecheck passes.
User impact (the cost, honestly)
The custom Node isn't in any binary cache (it can't be; see below), so a cold `nix run github:juspay/kolu#kaval` / `nix develop` compiles Node from source once per machine, then it's served from `/nix/store` forever after.
Mitigation: our own cache (recommended follow-up). The repo already lists `cache.nixos.asia/oss` as a substituter (read), but nothing in-repo pushes to it. It's populated out-of-band by juspay infra, so today a cold machine compiles. To spare end users (and cold/re-created CI boxes, and the darwin lane) the compile, we should push master's closure, including this Node, to `cache.nixos.asia/oss` (an Attic/`nix copy` step with the OSS signing key). Then `nix run` substitutes instead of compiling. I left this out of this PR because it's separate infra from the runtime; happy to do it next.
Why isn't this just in nixpkgs?
Investigated exhaustively (nixpkgs issues, PRs, code, and Discourse): there is no issue and no PR (open, closed, or merged) to enable `--experimental-quic` / `node:quic` in nixpkgs' nodejs. Nobody has pushed it upstream. It's absent by design, for three reasons:
- `node:quic` needs a compile-time and runtime `--experimental-quic` flag; it's experimental and was disabled during the OpenSSL-3.5 transition. nixpkgs builds the stable default config.
- `node.gypi` only wires the vendored ngtcp2/nghttp3, i.e. the copies whose internal headers (`nghttp3/lib/nghttp3_conn.h`, used by `src/quic/*.cc`) Node needs, under `node_use_quic && node_shared_openssl=="false"` (a bundled openssl). So with `--shared-openssl`, `HAVE_QUIC=1` compiles the QUIC sources but never adds the ngtcp2/nghttp3 dependency → the build fails on the missing header. nixpkgs' default is structurally incompatible; that's exactly why this overlay un-shares openssl/ngtcp2/nghttp3 and lets Node vendor them.
- `nghttp3` is curl-oriented: it ships public headers only (no internal `lib/*.h`), and at a newer version (1.15 vs Node's vendored 1.11), so even forcing the shared path can't work.
Tellingly, the sole nixpkgs nodejs maintainer (`aduh95`) is himself a Node.js core / QUIC collaborator and still hasn't enabled it, confirming this is a deliberate consequence of the shared-libs dedup design, not an oversight. (`node:quic` graduating upstream later means we just drop the `--experimental-quic` flag; no other change.)
Follow-ups (separate PRs)
- … `cache.nixos.asia/oss` so cold machines substitute instead of compiling.
- … `pnpm.*` from `package.json` to `pnpm-workspace.yaml`), then unpin pnpm.
- `links/quic.ts` + the `HostSession` QUIC dial seam + `kaval-tui --host` over QUIC (the roaming demo), on top of this runtime.
🤖 Generated with Claude Code
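The `node.gypi` gating discussed under "Why isn't this just in nixpkgs?" can be summarized as a small decision function. This is a simplified model of the one condition quoted in this PR (`node_use_quic && node_shared_openssl=="false"`), not a transcription of Node's build files; the function name, option names, and outcome labels are illustrative.

```javascript
// Illustrative model of the quoted gyp condition; not Node's real build logic.
function quicBuildOutcome({ experimentalQuic, sharedOpenssl }) {
  if (!experimentalQuic) {
    // Stable default config: the QUIC sources are not built at all.
    return "no-quic";
  }
  if (sharedOpenssl) {
    // QUIC sources compile, but the vendored ngtcp2/nghttp3 are never wired,
    // so the internal headers they provide are missing.
    return "fails-missing-header";
  }
  // Bundled openssl: the vendored ngtcp2/nghttp3 are added as dependencies.
  return "quic-vendored";
}

// nixpkgs' default (shared openssl) with the flag forced on:
console.log(quicBuildOutcome({ experimentalQuic: true, sharedOpenssl: true }));
// This PR's overlay (openssl un-shared):
console.log(quicBuildOutcome({ experimentalQuic: true, sharedOpenssl: false }));
```

Only the last branch yields a working `node:quic`, which is the configuration the overlay selects.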