
ci: split Project Tests into one parallel job per exemplar (~45min → ~slowest-project) #18

Merged
docxology merged 2 commits into main from improve/ci-project-job-split
Jun 4, 2026
@docxology

Owner

Stacked on #17 (base will retarget to main once #17 merges).

What

Replace the single sequential 9-exemplar test-project job (sum ~45 min on ubuntu) with a project-axis matrix: each public exemplar runs in its own parallel ubuntu job at py3.10 (floor) and py3.12 (ceiling), so 9 exemplars × 2 Python versions = 18 jobs.
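To make the job count concrete, here is a minimal Python sketch of how a project × Python-version matrix expands. It is an illustration only, not the workflow itself: the project names are hypothetical placeholders (the PR does not list all nine exemplars), and `expand_matrix` is a made-up helper.

```python
from itertools import product

# Hypothetical placeholder names: the PR does not enumerate the nine exemplars.
PROJECTS = [f"project_{i}" for i in range(1, 10)]
# Floor and ceiling Python versions, as stated in the PR description.
PYTHON_VERSIONS = ["3.10", "3.12"]


def expand_matrix(projects, python_versions):
    """Return one job entry per (project, python) pair, as a CI matrix would."""
    return [
        {"project": project, "python": version}
        for project, version in product(projects, python_versions)
    ]


jobs = expand_matrix(PROJECTS, PYTHON_VERSIONS)
print(len(jobs))  # 9 projects x 2 versions = 18
```

Each entry corresponds to one isolated job, which is why a failing job identifies both the project and the Python version.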

Why

  • Wall-clock = slowest single project (~active_inference) instead of the sum of all nine.
  • Failure isolation: a red job names the exact project + Python version immediately.
  • Authoritative gating: each job runs scripts/01_run_tests.py --project <name> --project-only --include-slow, enforcing that project's own 90% floor (per CLAUDE.md) — stronger than the old 75% combined-union gate.
  • Removes the old template_code_project/fep_lean conftest plugin-name collision (every project is isolated in its own job).
  • macOS breadth stays in test-infra; Codecov upload (py3.12) merges per-project coverage by flag.
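The wall-clock argument can be sketched in a few lines of Python. The per-project durations below are invented for illustration (chosen only so that they sum to the ~45 min quoted above); only `active_inference` is a name taken from this PR, the other keys are hypothetical. The sketch also ignores runner queueing and per-job setup overhead.

```python
# Illustrative, made-up durations in minutes; NOT measured values from this repo.
durations = {
    "active_inference": 12,  # named in the PR as roughly the slowest project
    "project_b": 8,
    "project_c": 6,
    "project_d": 5,
    "project_e": 4,
    "project_f": 4,
    "project_g": 3,
    "project_h": 2,
    "project_i": 1,
}


def sequential_minutes(per_project):
    """One job running every project in turn takes the sum of all durations."""
    return sum(per_project.values())


def parallel_minutes(per_project):
    """One job per project, all running at once, takes the slowest duration."""
    return max(per_project.values())


print(sequential_minutes(durations))  # 45
print(parallel_minutes(durations))    # 12
```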

Validation

  • Local run: scripts/01_run_tests.py --project templates/template_newspaper --project-only --include-slow → 48/48 tests passed, 94.1% coverage (≥90%).
  • YAML valid; matrix expands to 18 jobs.
  • .github CI-structure docs (AGENTS/README + workflows/AGENTS/README) updated to match the new test-infra (ubuntu×3 + macOS 3.12) and test-project layouts.

This PR is the structural follow-up flagged in #17; its own CI run validates the new matrix before merge.

Replace the single sequential 9-exemplar test-project job (sum ~45 min on
ubuntu) with a project-axis matrix: each public exemplar runs in its own
parallel ubuntu job at py3.10 (floor) + py3.12 (ceiling) = 18 jobs. Wall-clock
becomes the slowest single project (~active_inference) instead of the sum, and
a failure now names the exact project immediately.

Each job runs 'scripts/01_run_tests.py --project <name> --project-only
--include-slow' and enforces that project's OWN 90% floor (authoritative per
CLAUDE.md), which also removes the old code_project/fep_lean conftest plugin
collision (every project is isolated in its own job). macOS breadth stays in
test-infra. Codecov upload (py3.12) merges per-project coverage by flag.

Updated .github CI-structure docs (AGENTS/README + workflows/AGENTS/README) to
match the new test-infra (ubuntu x3 + macOS 3.12) and test-project layouts.
Validated locally: 'scripts/01_run_tests.py --project templates/template_newspaper
--project-only --include-slow' = 48/48, 94.1% (>=90%).
@docxology changed the base branch from improve/ci-speed-and-doc-accuracy to main on June 4, 2026 20:19
@docxology closed this on Jun 4, 2026
@docxology reopened this on Jun 4, 2026
The job-split worked (every project PASSED its own 90% gate, e.g. newspaper
48/48 @ 94%), but the trailing 'uv run coverage xml -o coverage-project.xml'
ran at repo root where there is no data — scripts/01_run_tests.py --project
runs pytest with cwd=<project> and writes coverage into the project dir. CI's
coverage exits 1 on 'No data to report', so set -e failed all 18 jobs despite
green tests. The per-project --cov-fail-under=90 is enforced inside
01_run_tests.py; remove the redundant repo-root xml and point Codecov at the
project's own coverage_project.json (best-effort, fail_ci_if_error: false).
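The failure mode described above (green tests, then a trailing command that exits 1 under `set -e`) can be reproduced with a small Python sketch. It does not invoke the real test script or `coverage`; the two child processes are stand-ins that simply exit with the status each real step was reported to return, and `run_steps` is a hypothetical helper.

```python
import subprocess
import sys


def run_steps(steps):
    """Run commands in order and stop at the first non-zero exit, like `set -e`."""
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            return name, result.returncode
    return None, 0


PY = sys.executable
steps = [
    # Stand-in for the per-project test run, which passed its own 90% gate.
    ("tests", [PY, "-c", "raise SystemExit(0)"]),
    # Stand-in for the repo-root coverage XML step, which found no data and exited 1.
    ("coverage xml", [PY, "-c", "raise SystemExit(1)"]),
]

failed_step, exit_code = run_steps(steps)
print(failed_step, exit_code)  # coverage xml 1
```

Dropping the second step, as the fix does, leaves the job's result determined by the test step alone.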
@docxology merged commit d7ae0d3 into main on Jun 4, 2026
35 checks passed
@docxology deleted the improve/ci-project-job-split branch on June 6, 2026 00:36