Chore/fix mutation measurement #92

Closed
goanpeca wants to merge 5 commits into main from chore/fix-mutation-measurement

Conversation

@goanpeca

Contributor

Summary

Checklist

  • pnpm all passes locally (lint + typecheck + test + build).
  • If src/ changed, dist/index.js was rebuilt and is included in this PR. CI will fail otherwise.
  • Test coverage for new logic added under __tests__/ (and passes the 95%/85%/100%/95% coverage gate).
  • If a new action: verb was added: it's listed in src/inputs.ts ActionName, wired in src/main.ts, documented in action.yml, and has a usage entry in the README.
  • If a new input or output was added: documented in action.yml and the README table.
  • If user-visible behavior changed: a CHANGELOG.md entry under [Unreleased].
  • If a new example workflow was added: listed in .github/workflows/README.md and runs against the project's test bucket.

How to test this

- uses: backblaze-labs/b2-action@<this-pr's-sha>
  with:
    action: ...

Notes for reviewers

goanpeca added 5 commits June 22, 2026 20:03
Stryker's vitest runner zeroes out src/main.ts (and under-counts other
files) when all 8 mutated files run in a single pass. The suite's
resetModules + doMock + dynamic-import style serves stale, un-mutated
modules to the covering tests at scale, so main.ts scored 0% in the full
run despite being ~86% covered in isolation. The headline dropped to
~42% and failed the 65% gate, even though the true score is ~71%.

coverageAnalysis "all" did not help and maxTestRunnerReuse 1 crashed
esbuild, so this measures each mutated file in its own Stryker run (where
attribution is accurate) and aggregates the per-file JSON reports, gating
on the combined score.

- add scripts/run-batched-mutation.mjs (per-file run + aggregate + gate)
- test:mutation now runs the batched runner; test:mutation:single keeps
  the raw `stryker run` for ad-hoc use
- per-file reports are saved under reports/mutation/by-file for the CI
  artifact, plus an aggregate.json summary

Aggregate is now 70.80% (>= 65). Genuinely lower files remain as future
work: download.ts 59%, purge.ts 57%, upload.ts 64%.

Targeted tests for the three files that were genuinely below threshold
when measured per-file (true scores, not the earlier full-run artifact):

- purge.ts 56.8% -> 87.8%: bucket-wide purge + confirmation warning,
  dry-run preview, non-slash prefix normalization, failed-version warning.
- upload.ts 64.2% -> 66.1%: progress label + fileId/sha1 summary line.
- download.ts 59.1% -> 67.0%: prefix group/progress/wrote logging,
  destination resolution (existing dir / trailing slash), SSE-C round-trip,
  Windows reserved-name rejection, file/dir path collision, and the win32
  rename-overwrite retry (fs + platform mocked).

Expand the Stryker mutate list to every src file (was 8 of 24) so mutation
testing reaches the whole action surface, and lift the newly-covered
laggards:

- head.ts 30% -> 90%, hide.ts 37.5% -> 100%, unhide.ts 57% -> 93%: assert
  group labels, info lines, and the requireSource error wording.
- progress.ts 57% -> 77%: pin percent/parts/total-suffix formatting,
  throttle skip vs first/final emit, and MB/s math (fake timers).
- fs.ts 67% -> 100%: drop a redundant explicit `return undefined` whose
  only surviving mutant was equivalent (implicit undefined return).
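
The fs.ts item turns on the notion of an equivalent mutant: a mutation that cannot change observable behavior, so no test can kill it. A minimal sketch of the idea, using hypothetical `parseExplicit` and `parseMutant` helpers rather than the real code in src/fs.ts:

```javascript
// Before: the catch block returns undefined explicitly.
function parseExplicit(text) {
  try {
    return JSON.parse(text)
  } catch {
    return undefined
  }
}

// A mutant that empties the catch block behaves identically, because
// falling off the end of a function also yields undefined.
function parseMutant(text) {
  try {
    return JSON.parse(text)
  } catch {}
}

console.log(parseExplicit('{bad') === parseMutant('{bad')) // true
console.log(parseExplicit('{"a":1}').a === parseMutant('{"a":1}').a) // true
```

Since no assertion can tell the two versions apart, removing the redundant statement is a reasonable way to eliminate the surviving mutant without adding a test.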

Add 'startgroup'/'endgroup' to the cspell allowlist; dist/ rebuilt.

The Stryker mutate list now spans every src file (was 8 of 24), so the
gate reflects the whole action. With the laggard files lifted, the real
aggregate is comfortably above 70, so raise thresholds.break 65 -> 70
(high 75 -> 80) to lock in the floor and prevent regressions.

CI failed the batched run with `error: unknown option '--thresholds.break'`
from the per-file Stryker call. Routing through
`pnpm exec stryker run ... --reporters clear-text,json` proved fragile, so
resolve Stryker's bin and run it with this Node binary directly, relying on
the JSON reporter already configured in stryker.conf.json. The child now
receives exactly `run --mutate <file>` with no package-manager layer.

Verified locally: per-file runs produce reports/mutation/mutation.json.
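
The direct invocation described in the last commit might look roughly like this. It is a hedged sketch: `strykerArgs` and `runStrykerFor` are hypothetical names, and the CLI path is passed in as a parameter (the example path is illustrative) instead of being resolved from the installed package. `--mutate` is Stryker's flag for overriding the mutate list, and `process.execPath` is the Node binary currently running the script.

```javascript
// Build the argv for one per-file run: exactly `run --mutate <file>`
// after the CLI entry point, with no threshold or reporter flags.
function strykerArgs(cliPath, file) {
  return [cliPath, 'run', '--mutate', file]
}

// Spawn Stryker with the current Node binary, with no package-manager
// layer in between. Defined for illustration only; not called here.
async function runStrykerFor(cliPath, file) {
  const { execFileSync } = await import('node:child_process')
  execFileSync(process.execPath, strykerArgs(cliPath, file), { stdio: 'inherit' })
}

const exampleArgs = strykerArgs('node_modules/@stryker-mutator/core/bin/stryker.js', 'src/main.ts')
console.log(exampleArgs.join(' '))
```

Keeping the argument list in a small pure function makes it easy to assert that no unsupported option slips into the child process.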
Copilot AI review requested due to automatic review settings June 23, 2026 15:06

Copilot AI left a comment

Pull request overview

This PR adjusts mutation testing so the project’s mutation score is measured accurately despite Vitest module-mocking patterns that can cause stale (unmutated) modules to be reused during large Stryker runs.

Changes:

  • Replaces the pnpm test:mutation script with a custom batched Stryker runner that mutates one file per run and aggregates results.
  • Expands the Stryker mutate list and raises the mutation-score thresholds.
  • Adds/extends tests to pin user-visible log output and to increase mutation/branch coverage for progress + command wrappers.

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| stryker.conf.json | Expands mutate targets and tightens mutation thresholds to match new measurement approach. |
| src/fs.ts | Adds a Stryker suppression comment for the tryStat catch block. |
| scripts/run-batched-mutation.mjs | New batched mutation runner that runs Stryker per-file and aggregates JSON reports for gating. |
| package.json | Points test:mutation at the new batched runner and adds test:mutation:single for the original behavior. |
| dist/index.js | Rebuild output reflecting the src/fs.ts change. |
| .cspell/project-words.txt | Adds startgroup/endgroup to avoid spellcheck noise from log assertions. |
| __tests__/progress.test.ts | Adds tests that pin progress math/output for mutation robustness. |
| __tests__/commands/upload-download.test.ts | Adds tests to increase log/branch coverage including Windows-specific paths. |
| __tests__/commands/head-purge-multipresign.test.ts | Adds purge behavior tests (bucket-wide, dry-run, slash normalization, warning surface). |
| __tests__/commands/command-logs.test.ts | New tests pinning head/hide/unhide log + error surface and tryStat behavior. |

Comment on lines +92 to +96
function score({ killed, timeout, survived, noCoverage }) {
  const detected = killed + timeout
  const valid = detected + survived + noCoverage
  return valid === 0 ? 100 : (detected / valid) * 100
}
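
To make the formula concrete, the sketch below reproduces `score` from the hunk above and applies it to made-up counts. The numbers are illustrative only and are not taken from this PR's reports; the `misattributed` case is an assumption about how a stale, un-mutated module would show up (every mutant appears to survive).

```javascript
// Same formula as the reviewed hunk: timeouts count as detected, while
// survivors and uncovered mutants count against the score.
function score({ killed, timeout, survived, noCoverage }) {
  const detected = killed + timeout
  const valid = detected + survived + noCoverage
  return valid === 0 ? 100 : (detected / valid) * 100
}

// Illustrative counts for one file measured in isolation.
const accurate = { killed: 5, timeout: 1, survived: 1, noCoverage: 1 }
// The same eight mutants if the covering tests only ever see the
// un-mutated module: nothing is detected.
const misattributed = { killed: 0, timeout: 0, survived: 7, noCoverage: 1 }
// A file with no valid mutants at all.
const empty = { killed: 0, timeout: 0, survived: 0, noCoverage: 0 }

console.log(score(accurate)) // 75
console.log(score(misattributed)) // 0
console.log(score(empty)) // 100
```

The last case is worth noting when reading aggregated results: a run that produces no valid mutants reports a perfect score rather than an error.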
Comment on lines +23 to +33
import { execFileSync } from 'node:child_process'
import { copyFileSync, existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from 'node:fs'
import { createRequire } from 'node:module'

// Resolve Stryker's CLI entry and run it with this Node binary directly. Going
// through `pnpm exec` proved fragile in CI (a stray "--thresholds.break" arg
// surfaced), so we invoke Stryker without any package-manager layer.
const require = createRequire(import.meta.url)
const STRYKER_CLI = require
  .resolve('@stryker-mutator/core/package.json')
  .replace(/package\.json$/, 'bin/stryker.js')
const totals = zero()
const rows = []

mkdirSync(BY_FILE_DIR, { recursive: true })
Comment thread package.json
Comment on lines +15 to +16
"test:mutation": "node scripts/run-batched-mutation.mjs",
"test:mutation:single": "stryker run",
@goanpeca

Contributor Author

Closing as obsolete. This branch (chore/fix-mutation-measurement) predates the refactor that is now on main — the batched runner was split into scripts/run-batched-mutation.mjs + scripts/run-batched-mutation-lib.mjs with its own unit tests. Merging this would revert that refactor (it deletes run-batched-mutation-lib.mjs, run-batched-mutation.test.ts, workflow-policy.test.ts, etc.), which is why it conflicts.

The test/coverage work from this effort is already on main, and the real CI failure (the unsupported --thresholds.break flag in the refactored lib) is fixed cleanly in #91. Use #91.

goanpeca closed this Jun 23, 2026
goanpeca deleted the chore/fix-mutation-measurement branch June 23, 2026 15:12
goanpeca self-assigned this Jun 23, 2026
goanpeca added the testing, ci, tech-debt, and duplicate labels Jun 23, 2026
goanpeca added this to the v1.1.0 milestone Jun 23, 2026