
ci(windows): speed up R CMD check with dev profile, lld, prebuilt just #190

Open
CGMossa wants to merge 3 commits into main from windows-ci-speedup

CGMossa (Collaborator) commented Apr 17, 2026

Summary

The Windows R CMD check job was spending nearly all of its time in the step
after `* checking serialization versions ... OK` — that is `checkingLog(Log,
"whether package 'miniextendr' can be installed")` (see
`r-svn/src/library/tools/R/check.R:5952`), which shells out to `R CMD INSTALL`
and ultimately runs `cargo build` + `cargo rustc --crate-type cdylib` through
MinGW.

Four compounding speedups on the Windows job:

  • `CARGO_PROFILE: dev` skips the LLVM optimization passes that the release
    profile pays for and that R CMD check never exercises. Dev still keeps
    `codegen-units = 1` (required for linkme `distributed_slice` in the user
    crate).
  • `rustflags = ["-C", "link-arg=-fuse-ld=lld"]` — MinGW's default
    `ld.bfd` is the single slowest step in the install phase on Windows.
    rust-lld is shipped with the rustup toolchain and satisfies
    `-fuse-ld=lld` via the host gcc. Applies to both the staticlib and cdylib
    link passes.
  • `taiki-e/install-action@just` — the Linux and macOS jobs already use
    the prebuilt action; Windows was uniquely compiling `just` from source
    every run via `cargo install just`.
  • `[profile.dev]` / `[profile.release]` `incremental = false` — per the
    sccache note already in `CLAUDE.md`, incremental compilation writes
    per-invocation unique hashes that poison the sccache cache key. Turning it
    off makes rustc output deterministic so sccache hit rate approaches 100% on
    the fresh CI runner.

Plus a small hygiene fix: `.claude/worktrees/` and `.claude/scheduled_tasks.lock`
are now properly gitignored instead of showing up as untracked every session.

Risk

The lld linker flag is the only change with non-trivial risk: if Rtools45's
bundled gcc can't resolve `lld` on PATH, the link step fails fast with a
clear error and we revert that one line. Everything else is independent and
safe on its own.
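
The failure mode described above can be checked before the build starts. The following is a minimal preflight sketch, not part of this PR. It assumes that `gcc -fuse-ld=lld` looks for an executable named `ld.lld`, and it only inspects PATH (gcc also searches its own program directories, so a miss here is a hint rather than proof). The helper name `linker_on_path` is hypothetical.

```shell
#!/bin/sh
# Sketch: preflight check that a linker selected via -fuse-ld=NAME resolves on PATH.
# Assumption: gcc's -fuse-ld=NAME resolves to an executable called ld.NAME.

# linker_on_path NAME: succeed if ld.NAME is found on PATH (hypothetical helper).
linker_on_path() {
    command -v "ld.$1" >/dev/null 2>&1
}

if linker_on_path lld; then
    echo "ld.lld found at: $(command -v ld.lld)"
else
    echo "ld.lld not on PATH; -fuse-ld=lld may fail at link time" >&2
fi
```

Running this as an early CI step would surface a missing linker in seconds instead of after the cargo build has reached the link phase.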

No workspace-crate `.rs` changes, so `inst/vendor.tar.xz` does not need
regeneration for this PR.

Test plan

  • Windows R CMD check job passes
  • Windows job wall-clock time is meaningfully lower than on `main`
  • Linux and macOS jobs unaffected
  • `cran-check` job (which builds from the source tarball, no cargo config override) unaffected

Generated with Claude Code

CGMossa and others added 3 commits April 17, 2026 09:07
… prebuilt just

The Windows job's dominant cost is `* checking whether package 'miniextendr'
can be installed ...` — i.e. `cargo build` + `cargo rustc --crate-type cdylib`
under MinGW. Four compounding wins:

- CARGO_PROFILE=dev: skip release monomorphization; keep codegen-units=1 for
  linkme. R CMD check doesn't exercise optimized paths.
- rust-lld via -fuse-ld=lld: bypass the slow MinGW bfd linker on both
  staticlib and cdylib passes.
- taiki-e/install-action@just: stop compiling just from source every run.
- [profile.dev/release] incremental = false: deterministic rustc output so
  sccache hit rate approaches 100% on fresh runners.

Also add `.claude/worktrees/` and `.claude/scheduled_tasks.lock` to .gitignore
so agent worktrees stop showing as untracked.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ld.lld rejects zero-byte files as "unknown file type"; ld.bfd tolerated
them silently. Using `ar crs` produces a real (empty) archive with the
`!<arch>` magic header, which both linkers accept. This only surfaced
after enabling `-fuse-ld=lld` in the previous commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
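
A minimal sketch of the distinction the commit above relies on, assuming a POSIX-style shell with `head -c` available. It writes the 8-byte `!<arch>` global header with `printf` instead of invoking `ar`, on the assumption that this matches what GNU `ar crs` emits for an archive with no members. The helper name `has_ar_magic` and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: tell a zero-byte stub apart from a valid (empty) ar archive.
set -eu

# has_ar_magic FILE: succeed if FILE starts with the ar global header "!<arch>".
has_ar_magic() {
    [ "$(head -c 7 "$1")" = '!<arch>' ]
}

workdir=$(mktemp -d)

: > "$workdir/zero.a"                      # zero-byte file, no header at all
printf '!<arch>\n' > "$workdir/empty.a"    # header only: an archive with no members

if has_ar_magic "$workdir/zero.a";  then zero_ok=yes;  else zero_ok=no;  fi
if has_ar_magic "$workdir/empty.a"; then empty_ok=yes; else empty_ok=no; fi

echo "zero.a has ar magic:  $zero_ok"
echo "empty.a has ar magic: $empty_ok"

rm -rf "$workdir"
```

Only the second file carries the magic header, which is the property the commit message says both linkers accept.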

Disabling incremental in [profile.*] would affect local dev loops too.
Setting CARGO_INCREMENTAL=0 at the workflow env level scopes it to CI,
so sccache hit rate benefits everywhere without burdening the edit/compile
cycle on developer machines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
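
The commit above sets the variable in the workflow's env block. As a shell-level analogue of the same scoping idea, the sketch below turns incremental compilation off only when the `CI` variable is `true`, which GitHub Actions runners export. This is an illustration under that assumption, not what the PR does, and the helper name `want_incremental_off` is hypothetical.

```shell
#!/bin/sh
# Sketch: disable Cargo incremental compilation only when running under CI.
# Assumption: the runner exports CI=true, as GitHub Actions does.

# want_incremental_off CI_VALUE: succeed only when CI_VALUE is exactly "true".
want_incremental_off() {
    [ "${1:-}" = "true" ]
}

if want_incremental_off "${CI:-}"; then
    CARGO_INCREMENTAL=0
    export CARGO_INCREMENTAL
fi

# On a developer machine CI is normally unset, so Cargo's default applies.
echo "CARGO_INCREMENTAL=${CARGO_INCREMENTAL:-unset}"
```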
CGMossa force-pushed the windows-ci-speedup branch from b8990bf to 8a6c524 on April 17, 2026 07:07
CGMossa (Collaborator, Author) commented Apr 17, 2026

Windows failure root cause (from the prior run's log before rebase):

[Rust] Worker: about to panic
[Rust] Dropped: worker: boxed resource before panic
[Rust] Dropped: worker: resource before panic
fatal runtime error: failed to initiate panic, error 5, aborting

This repeats across many tests. Panic unwinding is broken at runtime; it is not a compile or link failure. The three changes in this PR that could interact with unwinding are:

  1. -C link-arg=-fuse-ld=lld: most likely culprit. rust-lld drops or mishandles the SEH unwind tables that the MinGW std build expects. ld.bfd keeps them intact.
  2. ar crs libgcc_mock/libgcc_eh.a — valid empty archive, but with zero members. No __gcc_personality_v0 etc. Probably latent with bfd + abort unwinding; exposed once lld gets involved.
  3. CARGO_PROFILE=dev — should be orthogonal to unwinding, but worth noting.

Suggested narrowing on the Windows box:

  • First try dropping only rustflags = ["-C", "link-arg=-fuse-ld=lld"] (revert to bfd) — keep CARGO_PROFILE=dev, CARGO_INCREMENTAL=0, prebuilt just, and the ar crs archive trick. Hypothesis: that alone flips CI to green without giving up the other speedups.
  • If still red, the libgcc_mock/ stubs themselves may need real content (or -C panic=abort to sidestep unwinding entirely, though that changes semantics).
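
When comparing runs during this narrowing, it helps to count the abort signature in each log rather than eyeball it. A minimal sketch follows, assuming the job log has been saved to a file. The sample log is the excerpt quoted above, and the helper name `count_panic_aborts` is hypothetical.

```shell
#!/bin/sh
# Sketch: count occurrences of the panic-abort signature in a saved CI log.
set -eu

# count_panic_aborts FILE: print the number of lines containing the signature.
# grep -c exits non-zero when the count is 0, so the status is neutralised.
count_panic_aborts() {
    grep -c 'failed to initiate panic' "$1" || true
}

log=$(mktemp)
cat > "$log" <<'EOF'
[Rust] Worker: about to panic
[Rust] Dropped: worker: boxed resource before panic
[Rust] Dropped: worker: resource before panic
fatal runtime error: failed to initiate panic, error 5, aborting
EOF

aborts=$(count_panic_aborts "$log")
echo "abort signatures in log: $aborts"

rm -f "$log"
```

A count of zero after dropping a single change would point at that change. A non-zero count means the next hypothesis in the list is worth testing.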

I've rebased onto main; the push appears to have put CI back in progress.

CGMossa (Collaborator, Author) commented Apr 17, 2026

Correction on the suggestion above — linker speed is the whole point of this PR, so reverting lld isn't the path. Keep lld, swap the runtime story.

Best-looking fixes that keep -fuse-ld=lld:

  1. Stop mocking libgcc_eh.a / libgcc_s.a. Real __gcc_personality_v0 is what's missing at runtime, and the empty archive satisfies the linker but leaves the personality symbol unbound. Use the x86_64-w64-mingw32.static.posix-gcc toolchain's own libgcc (drop the libgcc_mock/ hack + LIBRARY_PATH override), or install just those two .a files into CI instead of stubbing. That gives lld real unwind tables to reference.

  2. Switch target to x86_64-pc-windows-gnullvm. LLVM-native Windows-GNU triple, designed to work with lld and uses libunwind/compiler-rt — no mingw personality dance at all. Bigger surface change (different toolchain install), but it's the clean answer for "fast lld-linked Windows build."

  3. If neither pans out quickly and unwinding semantics are tolerable: -C panic=abort for this CI job only. Would need to verify that with_r_unwind_protect / panic→R-error paths in miniextendr-api still behave on abort — I suspect they rely on unwinding, so this is a last resort.

I'd try #1 first since it's the smallest diff to this PR.
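
Before and after trying option 1, the claim that the stub leaves the personality symbol unbound can be checked directly. The sketch below assumes an `nm` that accepts `-g --defined-only` (GNU nm and llvm-nm both do) and reuses a header-only archive as a stand-in for the mocked `libgcc_eh.a`. Which archive the real toolchain ships the symbol in is not verified here, and the helper name `archive_defines` is hypothetical.

```shell
#!/bin/sh
# Sketch: check whether an archive defines a given global symbol.
set -u

# archive_defines ARCHIVE SYMBOL: succeed if nm lists SYMBOL as defined.
# If nm is missing or the archive has no members, this reports "not defined".
archive_defines() {
    nm -g --defined-only "$1" 2>/dev/null | grep -q -w "$2"
}

workdir=$(mktemp -d)
# Stand-in for the mocked archive: ar header only, zero members.
printf '!<arch>\n' > "$workdir/libgcc_eh.a"

if archive_defines "$workdir/libgcc_eh.a" __gcc_personality_v0; then
    status=defined
else
    status=missing
fi
echo "__gcc_personality_v0 in stub archive: $status"

rm -rf "$workdir"
```

Pointing the same helper at the toolchain's real archive shows whether replacing the stub would actually supply the symbol.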
