fix(mutation): measure per-file to correct Stryker misreporting #90
Merged
10 commits
9234f77 fix(mutation): measure per-file to correct Stryker misreporting (goanpeca)
8e20c86 test(mutation): raise purge/upload/download above the 65% gate (goanpeca)
12df56b test(mutation): cover all src files + harden small commands (goanpeca)
80c2b57 chore(mutation): cover all of src + ratchet break threshold to 70 (goanpeca)
2b63a62 fix: harden mutation runner review fixes (goanpeca)
ece6821 fix: address latest PR review comments (goanpeca)
72d6266 fix: normalize mutation runner entrypoint (goanpeca)
ac3b1bb fix: run PR gates for Dependabot (goanpeca)
b02c39b test: tighten unhide marker assertion (goanpeca)
d81722f chore: remove stale cspell word (goanpeca)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,120 @@ | ||
| #!/usr/bin/env node | ||
| /** | ||
| * Batched mutation runner. | ||
| * | ||
| * Why this exists: Stryker's vitest runner mis-measures this project when all | ||
| * mutated files are instrumented together in a single run. The repo's tests | ||
| * lean heavily on `vi.resetModules()` + `vi.doMock()` + dynamic `import()`, and | ||
| * at full scale (8 files / ~1500 mutants) the test-runner serves stale, | ||
| * un-mutated modules to the covering tests. The symptom: `src/main.ts` scores | ||
| * 0% in a full run but ~86% when mutated in isolation, dragging the aggregate | ||
| * far below its true value. `coverageAnalysis: "all"` does not fix it, and | ||
| * `maxTestRunnerReuse: 1` crashes esbuild. | ||
| * | ||
| * The fix: run Stryker once per mutated file (where measurement is accurate), | ||
| * then aggregate the per-file JSON reports ourselves and gate on the combined | ||
| * mutation score. Each per-file run is fast and correct; the sum is the true | ||
| * project score. | ||
| * | ||
| * Usage: node scripts/run-batched-mutation.mjs | ||
| * Exits non-zero when the aggregate score is below the configured break | ||
| * threshold (stryker.conf.json -> thresholds.break). | ||
| */ | ||
| import { execFileSync } from 'node:child_process' | ||
| import { copyFileSync, existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from 'node:fs' | ||
|
|
||
| const REPORT = 'reports/mutation/mutation.json' | ||
| const BY_FILE_DIR = 'reports/mutation/by-file' | ||
| const config = JSON.parse(readFileSync('stryker.conf.json', 'utf8')) | ||
| const files = config.mutate ?? [] | ||
| const threshold = config.thresholds?.break ?? 65 | ||
|
|
||
| if (files.length === 0) { | ||
| console.error('No files listed in stryker.conf.json "mutate"; nothing to do.') | ||
| process.exit(1) | ||
| } | ||
|
|
||
| const STATUS_KEYS = ['killed', 'timeout', 'survived', 'noCoverage', 'errors'] | ||
| const STATUS_MAP = { | ||
| Killed: 'killed', | ||
| Timeout: 'timeout', | ||
| Survived: 'survived', | ||
| NoCoverage: 'noCoverage', | ||
| RuntimeError: 'errors', | ||
| CompileError: 'errors', | ||
| } | ||
|
|
||
| const zero = () => Object.fromEntries(STATUS_KEYS.map((k) => [k, 0])) | ||
| const totals = zero() | ||
| const rows = [] | ||
|
|
||
| mkdirSync(BY_FILE_DIR, { recursive: true }) | ||
|
|
||
| for (const file of files) { | ||
| console.log(`\n=== mutating ${file} ===`) | ||
| rmSync(REPORT, { force: true }) | ||
| try { | ||
| // Per-file break threshold is irrelevant here; we aggregate and gate below. | ||
| // Stryker still writes the JSON report before any threshold-based exit. | ||
| execFileSync( | ||
| 'pnpm', | ||
| ['exec', 'stryker', 'run', '--mutate', file, '--reporters', 'clear-text,json'], | ||
| { stdio: 'inherit' }, | ||
| ) | ||
| } catch { | ||
| // Non-zero exit is expected when a single file is below its break | ||
| // threshold. A genuine crash is caught by the missing-report check below. | ||
| } | ||
| if (!existsSync(REPORT)) { | ||
| console.error(`\nStryker produced no report for ${file}; treating as a hard failure.`) | ||
| process.exit(1) | ||
| } | ||
| // Preserve each file's report; the shared path is overwritten next iteration. | ||
| copyFileSync(REPORT, `${BY_FILE_DIR}/${file.replaceAll('/', '__')}.json`) | ||
| const report = JSON.parse(readFileSync(REPORT, 'utf8')) | ||
| const counts = zero() | ||
| for (const f of Object.values(report.files ?? {})) { | ||
| for (const m of f.mutants ?? []) { | ||
| const key = STATUS_MAP[m.status] | ||
| if (key) counts[key] += 1 | ||
| } | ||
| } | ||
| rows.push({ file, ...counts, score: score(counts) }) | ||
| for (const k of STATUS_KEYS) totals[k] += counts[k] | ||
| } | ||
|
|
||
| function score({ killed, timeout, survived, noCoverage }) { | ||
| const detected = killed + timeout | ||
| const valid = detected + survived + noCoverage | ||
| return valid === 0 ? 100 : (detected / valid) * 100 | ||
| } | ||
|
|
||
| const pad = (s, n) => String(s).padEnd(n) | ||
| const padL = (s, n) => String(s).padStart(n) | ||
| console.log('\n================= Aggregate mutation report =================') | ||
| console.log( | ||
| `${pad('file', 28)}${padL('score', 8)}${padL('killed', 8)}${padL('time', 6)}${padL('surv', 6)}${padL('noCov', 7)}${padL('err', 5)}`, | ||
| ) | ||
| for (const r of rows) { | ||
| console.log( | ||
| `${pad(r.file, 28)}${padL(`${r.score.toFixed(2)}%`, 8)}${padL(r.killed, 8)}${padL(r.timeout, 6)}${padL(r.survived, 6)}${padL(r.noCoverage, 7)}${padL(r.errors, 5)}`, | ||
| ) | ||
| } | ||
| const agg = score(totals) | ||
| console.log('-'.repeat(60)) | ||
| console.log( | ||
| `${pad('ALL', 28)}${padL(`${agg.toFixed(2)}%`, 8)}${padL(totals.killed, 8)}${padL(totals.timeout, 6)}${padL(totals.survived, 6)}${padL(totals.noCoverage, 7)}${padL(totals.errors, 5)}`, | ||
| ) | ||
|
|
||
| // Persist an aggregate summary next to the per-file report for CI artifacts. | ||
| mkdirSync('reports/mutation', { recursive: true }) | ||
| const summary = { score: agg, threshold, totals, files: rows, generatedBy: 'run-batched-mutation' } | ||
| writeFileSync('reports/mutation/aggregate.json', JSON.stringify(summary, null, 2)) | ||
|
|
||
| if (agg < threshold) { | ||
| console.error( | ||
| `\nAggregate mutation score ${agg.toFixed(2)}% is below break threshold ${threshold}.`, | ||
| ) | ||
| process.exit(1) | ||
| } | ||
| console.log(`\nAggregate mutation score ${agg.toFixed(2)}% meets break threshold ${threshold}.`) | ||
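
The header comment says the runner aggregates the per-file reports and gates on the combined score. A minimal sketch of what that pooling means, and how it differs from averaging per-file percentages, might look like the following. The file names and mutant counts are hypothetical and not taken from this PR; `mutationScore` is an illustrative name that mirrors the `score()` helper in the diff.

```javascript
// Sketch only: pool raw mutant counts across per-file runs, then score once.
// All sample data below is made up for illustration.

// Same formula as score() in the diff: detected / valid, with compile and
// runtime errors left out of the denominator.
function mutationScore({ killed, timeout, survived, noCoverage }) {
  const detected = killed + timeout
  const valid = detected + survived + noCoverage
  return valid === 0 ? 100 : (detected / valid) * 100
}

// Hypothetical per-file results (assumed, not from the PR).
const perFile = [
  { file: 'src/big.ts', killed: 80, timeout: 0, survived: 15, noCoverage: 5 },
  { file: 'src/small.ts', killed: 1, timeout: 0, survived: 3, noCoverage: 0 },
]

// Pooled: sum the raw counts first, then compute a single score.
const totals = { killed: 0, timeout: 0, survived: 0, noCoverage: 0 }
for (const row of perFile) {
  for (const key of Object.keys(totals)) totals[key] += row[key]
}
const pooled = mutationScore(totals)

// Naive alternative: average the per-file percentages.
const averaged =
  perFile.reduce((sum, row) => sum + mutationScore(row), 0) / perFile.length

console.log(`pooled:   ${pooled.toFixed(2)}%`) // prints "pooled:   77.88%"
console.log(`averaged: ${averaged.toFixed(2)}%`) // prints "averaged: 52.50%"
```

Pooling weights every mutant equally, so a small file with a handful of mutants cannot swing the gate the way it does in a plain average. That appears to be why the script accumulates `totals` and calls `score(totals)` once rather than averaging the `score` values stored in `rows`.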