fix(mutation): measure per-file to correct Stryker misreporting #90

Merged
goanpeca merged 10 commits into main from chore/fix-mutation-measurement
Jun 23, 2026

@goanpeca goanpeca commented Jun 23, 2026

Contributor

Summary

This PR fixes the mutation gate by running Stryker once per configured mutation target, then aggregating the per-file JSON reports into a single score. The original 41.98% result was a Stryker Vitest runner measurement artifact caused by full-suite module isolation behavior, not the actual aggregate mutation score.
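
The aggregation step described above can be sketched as follows. This is a minimal illustration, not the code in scripts/run-batched-mutation.mjs: the function name `mergeCounts` is hypothetical, and the report shape (`files`, then `mutants`, then `status`) is assumed to follow the mutation-testing JSON report schema that Stryker's JSON reporter emits.

```javascript
// Hypothetical sketch: merge per-file Stryker JSON reports into one status tally.
// Assumes each report looks like { files: { [path]: { mutants: [{ status }] } } }.
function mergeCounts(reports) {
  const counts = {};
  for (const report of reports) {
    for (const file of Object.values(report.files)) {
      for (const mutant of file.mutants) {
        counts[mutant.status] = (counts[mutant.status] ?? 0) + 1;
      }
    }
  }
  return counts;
}

// Two single-file reports, as two separate child runs might produce them.
const perFileReports = [
  { files: { "src/main.ts": { mutants: [{ status: "Killed" }, { status: "Survived" }] } } },
  { files: { "src/sse.ts": { mutants: [{ status: "Killed" }, { status: "Timeout" }] } } },
];

console.log(mergeCounts(perFileReports));
```

Because each child run mutates a single file, the merged tally is a plain sum and no mutant is counted twice.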

Implemented Fixes

  • Added scripts/run-batched-mutation.mjs and the tested runner library that powers pnpm test:mutation.
  • Kept pnpm test:mutation:single as the raw stryker run command for local diagnosis.
  • Disables Stryker's per-file break threshold during child runs and applies the configured aggregate gate once after merging results.
  • Treats missing reports, non-threshold child failures, errored mutants, and unknown mutant statuses as hard failures.
  • Honors thresholds.break: null by skipping the aggregate break gate when Stryker disables it.
  • Clears reports/mutation/by-file/ before each run and writes reports/mutation/aggregate.json for artifacts.
  • Uses pnpm.cmd on Windows and realpath-based entrypoint detection for robust CLI execution.
  • Restored the aggregate break threshold to 65% while documenting the intentional 24-target mutation scope.
  • Added runner regression tests for all-error, mixed-error, unknown-status, missing-report, nonzero-child-with-report, disabled-threshold, Windows command, stale report cleanup, and entrypoint normalization cases.
  • Added mutation-focused command and progress tests, including stronger purge, upload, download, command log, unhide marker, and progress coverage.
  • Extracted the platform-specific download replace operation behind a helper and regenerated dist/ output.
  • Removed Dependabot skip guards from CI and added a workflow-policy regression test so pull request security gates keep running for Dependabot PRs.
  • Removed the stale startgroup cspell allowlist entry.
  • Updated DEVELOPMENT.md and .github/workflows/README.md for the batched runner contract, report layout, scope, scripts, and threshold.
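
The hard-failure and threshold rules in the list above can be sketched roughly as below. The names `findHardFailures` and `passesGate` are hypothetical, and so is the `{ target, report }` input shape. The sketch also assumes that only the four scored statuses are acceptable; how the real runner handles statuses such as `Ignored`, or child processes that exit nonzero for reasons other than the threshold, is not covered here.

```javascript
// Hypothetical sketch of the hard-failure and gate rules; not the PR's actual code.
const SCORED = new Set(["Killed", "Timeout", "Survived", "NoCoverage"]);
const ERRORED = new Set(["CompileError", "RuntimeError"]);

// results: [{ target, report }], where report is null if the child run wrote none.
function findHardFailures(results) {
  const failures = [];
  for (const { target, report } of results) {
    if (report === null) {
      failures.push(`${target}: missing report`);
      continue;
    }
    for (const file of Object.values(report.files)) {
      for (const { status } of file.mutants) {
        if (ERRORED.has(status)) {
          failures.push(`${target}: errored mutant (${status})`);
        } else if (!SCORED.has(status)) {
          failures.push(`${target}: unknown status (${status})`);
        }
      }
    }
  }
  return failures;
}

// A null break threshold disables the gate, mirroring thresholds.break: null.
function passesGate(score, breakThreshold) {
  return breakThreshold === null || score >= breakThreshold;
}
```

Keeping the hard-failure check separate from the score comparison means a run with errored mutants cannot pass simply because the remaining mutants score well.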

Scope and Baseline

The mutation scope expansion is intentional. stryker.conf.json now lists 24 action-owned src targets, covering support modules in src/*.ts and command implementations in src/commands/*.ts.

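| Scope | Mutation score | Killed | Timed out | Survived | No coverage |
| --- | --- | --- | --- | --- | --- |
| All files | 72.83% | 895 | 11 | 333 | 5 |
| src/commands/*.ts targets | 62.59% | 332 | 11 | 203 | 2 |
| src/inputs.ts | 73.24% | 219 | 0 | 78 | 2 |
| src/main.ts | 85.67% | 305 | 0 | 50 | 1 |
| src/sse.ts | 95.12% | 39 | 0 | 2 | 0 |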
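
The scores in this table can be reproduced from the raw counts using Stryker's mutation score definition: detected mutants (killed plus timed out) divided by valid mutants (detected plus survived plus no coverage). The helper name `mutationScore` and the object field names below are illustrative only.

```javascript
// Mutation score = detected / valid * 100, where
// detected = killed + timed out, and valid = detected + survived + no coverage.
function mutationScore({ killed, timedOut, survived, noCoverage }) {
  const detected = killed + timedOut;
  const valid = detected + survived + noCoverage;
  return (detected / valid) * 100;
}

// Counts copied from the baseline table.
const rows = {
  "All files": { killed: 895, timedOut: 11, survived: 333, noCoverage: 5 },
  "src/commands/*.ts targets": { killed: 332, timedOut: 11, survived: 203, noCoverage: 2 },
  "src/inputs.ts": { killed: 219, timedOut: 0, survived: 78, noCoverage: 2 },
  "src/main.ts": { killed: 305, timedOut: 0, survived: 50, noCoverage: 1 },
  "src/sse.ts": { killed: 39, timedOut: 0, survived: 2, noCoverage: 0 },
};

for (const [scope, counts] of Object.entries(rows)) {
  console.log(`${scope}: ${mutationScore(counts).toFixed(2)}%`);
}
// All files: 72.83%
// src/commands/*.ts targets: 62.59%
// src/inputs.ts: 73.24%
// src/main.ts: 85.67%
// src/sse.ts: 95.12%
```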

The aggregate break threshold remains 65%.

Verification

lint, typecheck, test, build, cspell, and coverage hooks pass.

Stryker's vitest runner zeroes out src/main.ts (and under-counts other
files) when all 8 mutated files run in a single pass. The suite's
resetModules + doMock + dynamic-import style serves stale, un-mutated
modules to the covering tests at scale, so main.ts scored 0% in the full
run despite being ~86% covered in isolation. The headline dropped to
~42% and failed the 65% gate, even though the true score is ~71%.

coverageAnalysis "all" did not help and maxTestRunnerReuse 1 crashed
esbuild, so this measures each mutated file in its own Stryker run (where
attribution is accurate) and aggregates the per-file JSON reports, gating
on the combined score.
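
As a rough sketch of the per-file approach, the runner could plan one Stryker invocation per target like this. `planChildRuns` is a hypothetical name. `--mutate` is Stryker's CLI option for overriding the mutate list, but how the real script configures reporters, report paths, and thresholds for each child run is an assumption left out of this sketch.

```javascript
// Hypothetical sketch: plan one Stryker child run per mutation target.
// The Windows branch reflects the PR's note about using pnpm.cmd there.
function planChildRuns(targets, platform) {
  const command = platform === "win32" ? "pnpm.cmd" : "pnpm";
  return targets.map((target) => ({
    target,
    command,
    args: ["exec", "stryker", "run", "--mutate", target],
  }));
}

const plan = planChildRuns(["src/main.ts", "src/sse.ts"], process.platform);
for (const { command, args } of plan) {
  console.log(command, args.join(" "));
}
```

Returning a plan instead of spawning processes directly keeps the command construction testable without running Stryker.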

- add scripts/run-batched-mutation.mjs (per-file run + aggregate + gate)
- test:mutation now runs the batched runner; test:mutation:single keeps
  the raw `stryker run` for ad-hoc use
- per-file reports are saved under reports/mutation/by-file for the CI
  artifact, plus an aggregate.json summary

Aggregate is now 70.80% (>= 65). Genuinely lower files remain as future
work: download.ts 59%, purge.ts 57%, upload.ts 64%.
Copilot AI review requested due to automatic review settings June 23, 2026 01:04
@goanpeca goanpeca added the testing, tech-debt, and ci labels Jun 23, 2026
@goanpeca goanpeca self-assigned this Jun 23, 2026
@goanpeca goanpeca added this to the v1.1.0 milestone Jun 23, 2026
@github-actions

Build artifact for this PR: build.tar.gz (valid 1 hour)

Copilot AI left a comment

Pull request overview

This PR adjusts the project’s mutation-testing workflow to work around a Stryker+Vitest measurement issue by running mutation testing per file and aggregating results, ensuring the mutation gate reflects accurate attribution for this codebase’s heavy module-mocking patterns.

Changes:

  • Add a new batched mutation runner script that runs Stryker once per mutated file, aggregates JSON reports, and enforces the configured break threshold.
  • Update test:mutation to use the batched runner while preserving a test:mutation:single escape hatch for ad-hoc raw Stryker runs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/run-batched-mutation.mjs | New per-file Stryker runner that aggregates per-file JSON results and gates on the combined score. |
| package.json | Switch test:mutation to the batched runner and add test:mutation:single. |

Comment thread scripts/run-batched-mutation.mjs Outdated
Comment thread scripts/run-batched-mutation.mjs Outdated
Targeted tests for the three files that were genuinely below threshold
when measured per-file (true scores, not the earlier full-run artifact):

- purge.ts 56.8% -> 87.8%: bucket-wide purge + confirmation warning,
  dry-run preview, non-slash prefix normalization, failed-version warning.
- upload.ts 64.2% -> 66.1%: progress label + fileId/sha1 summary line.
- download.ts 59.1% -> 67.0%: prefix group/progress/wrote logging,
  destination resolution (existing dir / trailing slash), SSE-C round-trip,
  Windows reserved-name rejection, file/dir path collision, and the win32
  rename-overwrite retry (fs + platform mocked).

Expand the Stryker mutate list to every src file (was 8 of 24) so mutation
testing reaches the whole action surface, and lift the newly-covered
laggards:

- head.ts 30% -> 90%, hide.ts 37.5% -> 100%, unhide.ts 57% -> 93%: assert
  group labels, info lines, and the requireSource error wording.
- progress.ts 57% -> 77%: pin percent/parts/total-suffix formatting,
  throttle skip vs first/final emit, and MB/s math (fake timers).
- fs.ts 67% -> 100%: drop a redundant explicit `return undefined` whose
  only surviving mutant was equivalent (implicit undefined return).
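
To illustrate why such a mutant is equivalent, here is a sketch using hypothetical stand-in functions (this is not the actual src/fs.ts code). It assumes the surviving mutant emptied a block that contained only `return undefined`, which leaves the function's observable behavior unchanged.

```javascript
// Hypothetical stand-ins; not the actual src/fs.ts code.

// Original style: explicit `return undefined` in the catch block.
function readOrUndefinedOriginal(read) {
  try {
    return read();
  } catch {
    return undefined;
  }
}

// Mutant: the catch block is emptied. Falling off the end of a function
// also yields undefined, so no test can tell the two versions apart.
function readOrUndefinedMutant(read) {
  try {
    return read();
  } catch {}
}

const failing = () => {
  throw new Error("not found");
};

console.log(readOrUndefinedOriginal(failing) === readOrUndefinedMutant(failing)); // true
console.log(readOrUndefinedOriginal(() => 7) === readOrUndefinedMutant(() => 7)); // true
```

Dropping the redundant statement removes the mutation site, so the equivalent mutant no longer counts against the score.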

Add 'startgroup'/'endgroup' to the cspell allowlist; dist/ rebuilt.
Copilot AI review requested due to automatic review settings June 23, 2026 02:19

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.

Comment thread scripts/run-batched-mutation.mjs Outdated
Comment thread scripts/run-batched-mutation.mjs Outdated
Comment thread src/fs.ts Outdated
The Stryker mutate list now spans every src file (was 8 of 24), so the
gate reflects the whole action. With the laggard files lifted, the real
aggregate is comfortably above 70, so raise thresholds.break 65 -> 70
(high 75 -> 80) to lock in the floor and prevent regressions.

Copilot AI review requested due to automatic review settings June 23, 2026 02:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 16 changed files in this pull request and generated 1 comment.

Comment thread scripts/run-batched-mutation-lib.mjs

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 16 changed files in this pull request and generated 2 comments.

Comment thread stryker.conf.json
Comment thread scripts/run-batched-mutation-lib.mjs

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 16 changed files in this pull request and generated no new comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/workflows/full-lockfile-audit.yml:41

  • This job requests issues: write at the job level even on pull_request runs. Since the PR build checks out and executes repo code (including scripts/full-lockfile-audit.mjs), granting write permissions on PR runs increases the blast radius if a PR ever manages to execute unintended code. Consider splitting into two jobs (PR: contents: read only; non-PR: issues: write + tracking-issue steps) so PR runs never receive issues: write.
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read # check out the repository and read the committed lockfile
      issues: write # open or update default-branch tracking issues

Comment thread __tests__/commands/command-logs.test.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated 1 comment.

Comment thread .cspell/project-words.txt Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated no new comments.

@goanpeca goanpeca merged commit 829c2d0 into main Jun 23, 2026
33 checks passed
@goanpeca goanpeca deleted the chore/fix-mutation-measurement branch June 23, 2026 13:22
Labels

ci, tech-debt, testing

2 participants