ci: add ubuntu-26.04 to spread test matrix #215

Open
tonyandrewmeyer wants to merge 12 commits into canonical:main from tonyandrewmeyer:ci/spread-resolute-as-default

@tonyandrewmeyer tonyandrewmeyer commented Jun 17, 2026

Contributor

Add 26.04 alongside the existing 24.04 coverage so spread runs against both supported LTS releases.

The lxd backend's allocate script now maps $SPREAD_SYSTEM to the codename via a case statement, so each system requests the right LXD image. 26.04 is the fallback for any unrecognised system, so future additions inherit the current LTS by default.
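The mapping could look roughly like the sketch below. This is not the allocate script itself: the function name `codename_for_system` is made up for illustration, and only the two systems named in this PR are covered.

```shell
# Hypothetical helper: map a spread system name to an Ubuntu codename.
# Anything unrecognised falls through to the current LTS (resolute).
codename_for_system() {
  case "$1" in
    ubuntu-24.04) echo "noble" ;;
    *)            echo "resolute" ;;   # ubuntu-26.04 and any future system
  esac
}

codename_for_system "ubuntu-24.04"   # prints "noble"
codename_for_system "ubuntu-26.04"   # prints "resolute"
```

Because the fallback is the catch-all branch, a new system only needs its own branch once it stops being the current LTS.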

A companion PR in canonical-repo-automation will also be needed to update the required checks.

Add 26.04 (Resolute Raccoon) alongside the existing 24.04 (Noble Numbat)
coverage so spread runs against both supported LTS releases.

Map $SPREAD_SYSTEM to the codename via a case statement in the lxd
backend's allocate script so each system requests the right LXD image.
Resolute is the fallback for any unrecognised system, so future
additions inherit the current LTS by default.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@tonyandrewmeyer tonyandrewmeyer marked this pull request as draft June 17, 2026 23:55
tonyandrewmeyer and others added 8 commits June 23, 2026 17:06

The spread.yaml change in this PR added ubuntu-26.04 alongside ubuntu-24.04
on both lxd and github-ci backends, but the CI workflow that drives spread
was still hardcoded to ubuntu-24.04 in two places: the suite-enumeration
sed strip and the final spread -v invocation. The result was that the
ubuntu-26.04 entry was declared in spread.yaml but never executed.

Make the matrix actually cross (suite x base): list suites from a single
base, then run each (base, suite) pair. Job names now include the base
so the per-base results are distinguishable. BASE comes from the matrix
rather than being burned into the spread target string.
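
A minimal sketch of that idea, assuming the listing arrives as `backend:system:suite` lines on stdin. The helper names `list_suites` and `spread_target` are illustrative and are not taken from the workflow, which uses sed for the strip.

```shell
# Hypothetical helper: strip the "backend:system:" prefix from each listed
# job so the suite names are independent of the base used to enumerate them.
list_suites() {
  ls_prefix="$1:$2:"
  while IFS= read -r ls_line; do
    case "$ls_line" in
      "$ls_prefix"*) printf '%s\n' "${ls_line#"$ls_prefix"}" ;;
    esac
  done
}

# Hypothetical helper: rebuild the spread target for one (base, suite) pair,
# with the base supplied by the matrix instead of being hardcoded.
spread_target() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

printf '%s\n' "github-ci:ubuntu-24.04:tests/preset-machine" \
  | list_suites github-ci ubuntu-24.04          # prints "tests/preset-machine"
spread_target github-ci ubuntu-26.04 tests/preset-machine
# prints "github-ci:ubuntu-26.04:tests/preset-machine"
```
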
The previous commit added ubuntu-26.04 to the spread matrix but left the
runner pinned to ubuntu-24.04. The github-ci backend is adhoc-on-localhost,
so spread rejects "github-ci:ubuntu-26.04:..." on a 24.04 host with
"nothing matches provider filter" — every 26.04 job failed in under 20s.

Replace the single runner input with a bases JSON list of {base, runner}
pairs. define-matrix now emits a flat include of (suite, base, runner)
tuples, and spread-test runs each job on the runner whose OS matches the
spread system it targets. push.yaml supplies both 24.04 and 26.04 runners;
test-on-alternative-arches.yaml supplies only the noble self-hosted runner,
since no 26.04 self-hosted runners exist for ppc64le/s390x.
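
The flat include can be pictured with the sketch below. It is a simplification under stated assumptions: the workflow takes the bases as a JSON list and processes it with jq, whereas this sketch takes plain `base runner` lines to stay dependency-free, and the runner labels shown are placeholders.

```shell
# Hypothetical helper: cross every suite with every (base, runner) pair and
# emit one JSON object per matrix row.
# $1: newline-separated suites; $2: newline-separated "base runner" pairs.
emit_include() {
  ei_suites="$1"
  ei_bases="$2"
  printf '%s\n' "$ei_suites" | while IFS= read -r ei_suite; do
    printf '%s\n' "$ei_bases" | while read -r ei_base ei_runner; do
      printf '{"suite":"%s","base":"%s","runner":"%s"}\n' \
        "$ei_suite" "$ei_base" "$ei_runner"
    done
  done
}

# Placeholder runner labels; the real ones come from the calling workflow.
emit_include "tests/preset-machine" "ubuntu-24.04 runner-noble
ubuntu-26.04 runner-resolute"
```

Each row carries its own runner, so a job targeting a given spread system lands on a host whose OS matches it.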

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Each task.yaml restricted itself to systems: [ubuntu-24.04], which filtered
out ubuntu-26.04 at the task level even though the workflow matrix and
spread.yaml backend both declared it. spread -list github-ci therefore
emitted only ubuntu-24.04 jobs, and the matrix rows for ubuntu-26.04 hit
"nothing matches provider filter" the moment they invoked spread.

Replace the per-task system pin with the glob "ubuntu-*" so every system
declared on the backend (24.04, 26.04, future LTSs) participates in the
suite/task cross-product.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two unrelated 26.04-only failures came out of the new matrix:

1. Five tests asserted `python3 -m venv -h | head -n1 | grep -q 'usage: venv'`.
   Python 3.14 (shipped in 26.04) changed argparse's default prog derivation
   so `-m venv` help now starts with `usage: python3 -m venv [-h] ...` instead
   of `usage: venv [-h] ...`. Replace the help-string sniff with a functional
   check that actually exercises venv and is invariant to that formatting.

2. tests/microk8s-image-registry pinned `microk8s 1.31-strict/stable`. That
   channel's install hook fails on 26.04: snapctl restart of
   daemon-containerd cannot start the service. The dynamically picked
   1.36-strict/stable used by preset-microk8s installs cleanly on the same
   runner, so the issue is specifically that old channel. Bump the pin to
   1.32-strict/stable, the minimum bump that gets the test off the broken
   snap revision.
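
For item 1, a functional check might look like the sketch below. It is only an illustration of the approach, not the check used in the tests: `venv_works` is a hypothetical name, and any arguments are passed straight through to `python3 -m venv`.

```shell
# Hypothetical helper: create a throwaway virtual environment and confirm
# that its interpreter really runs inside it. Nothing here depends on the
# wording of the --help output.
venv_works() {
  vw_root="$(mktemp -d)" || return 1
  python3 -m venv "$@" "$vw_root/venv" \
    && "$vw_root/venv/bin/python" -c \
      'import sys; sys.exit(0 if sys.prefix != sys.base_prefix else 1)'
  vw_status=$?
  rm -rf "$vw_root"
  return "$vw_status"
}
```

Called with no arguments it also exercises pip bootstrapping; `venv_works --without-pip` checks only environment creation.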

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…icrok8s on either upstream

The 1.32 bump in the previous commit wasn't enough — 1.32-strict/stable
hits the same daemon-containerd start failure on 26.04. The sibling
preset-microk8s test installed 1.36-strict/stable (chosen dynamically)
on the same runner and worked, so move microk8s-image-registry to the
lowest channel known to work.

provider-microk8s exists to assert that microk8s#5394 still reproduces
(the 24.04 install hangs, `timeout` fires, rc=124 is "expected"). On
26.04 the same 1.31-strict snap fails fast with the daemon-containerd
issue instead (rc=1, concierge surfaces "failed to install MicroK8s").
Both failures are upstream, and neither has a concierge-side fix.

Rewrite the assertion as an xfail with detection: accept either rc=124
(24.04 hang) or rc=1 plus the known concierge error string (26.04
install failure), and fail loudly on rc=0 (means an upstream fix has
landed and the dormant assertion block should be re-enabled) or any
other rc. Capture stderr so an unexpected rc=1 surfaces the actual
error in the log rather than hiding it.
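
The decision table can be sketched as follows. The function name and the labels it prints are hypothetical; the return codes and the error string are the ones described above.

```shell
# Hypothetical helper: classify the outcome of the timed install.
# $1: exit status of the command; $2: its captured stderr.
classify_microk8s_install() {
  case "$1" in
    124) echo "xfail-hang" ;;               # timeout fired (24.04 hang)
    1)
      case "$2" in
        *"failed to install MicroK8s"*) echo "xfail-install" ;;  # 26.04
        *) echo "unexpected-error" ;;       # rc=1 for some other reason
      esac
      ;;
    0) echo "unexpected-pass" ;;            # upstream fix may have landed
    *) echo "unexpected-rc" ;;
  esac
}

classify_microk8s_install 124 ""   # prints "xfail-hang"
```

Only the two `xfail-*` outcomes should let the test pass; every other label should fail it and print the captured stderr.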

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Branch protection has been pinning individual Spread (<suite>, <base>)
checks, which churn every time a suite is added, removed, or renamed.
Replace that with a single aggregator job, "Spread (required)", that
gates only on the suites whose failure should actually block merge.

Split the matrix two ways in define-matrix:

  * critical: the 5 presets, the 6 providers, and the 4 dry-run/restore
    safety-net suites - the user-facing entry points and recovery paths.
  * extras: everything else (juju-*, *-image-registry, overrides-*,
    disable-*, extra-*, status-*, snap-disabled-installed). These still
    run on every PR so we keep the signal, but they don't gate merge.
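
One way to picture the split is the sketch below. It inverts the description for brevity, which is an assumption: it matches the extras patterns listed above and treats everything else as critical, whereas define-matrix may enumerate the critical suites explicitly. The name `suite_tier` is hypothetical.

```shell
# Hypothetical helper: decide which matrix a suite belongs to, using the
# extras patterns from the list above. Anything unmatched is critical.
suite_tier() {
  case "$1" in
    juju-*|*-image-registry|overrides-*|disable-*|extra-*|status-*|snap-disabled-installed)
      echo "extras" ;;
    *)
      echo "critical" ;;
  esac
}

suite_tier "preset-machine"            # prints "critical"
suite_tier "microk8s-image-registry"   # prints "extras"
```
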

spread-required runs with if: always() so it surfaces a definitive
result even when the matrix has failures, and treats anything other
than "success" on spread-critical (including skipped/cancelled) as a
red light - so a broken binaries or define-matrix job doesn't silently
let the gate go green.
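
The gate logic reduces to a single comparison, sketched below. `spread_gate` is a hypothetical name; its argument stands for the result of the spread-critical job, which GitHub Actions reports as one of success, failure, cancelled or skipped.

```shell
# Hypothetical helper: pass only when the critical matrix reported "success".
# Any other result, including skipped or cancelled, fails the gate.
spread_gate() {
  if [ "$1" = "success" ]; then
    echo "Spread (required): ok"
    return 0
  fi
  echo "Spread (required): spread-critical result was '$1'" >&2
  return 1
}
```

Treating "skipped" as a failure is what stops a broken upstream job from turning the gate green by preventing the matrix from running at all.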

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@tonyandrewmeyer tonyandrewmeyer marked this pull request as ready for review June 27, 2026 07:44
@tonyandrewmeyer tonyandrewmeyer requested a review from tromai June 27, 2026 07:44

@tromai tromai left a comment

Contributor


Thanks.

I ran the bash commands in _tests.yaml on my machine, and they work as expected. I like your solution of having a Spread aggregator job in the CI. I have some minor requests about the comments in _tests.yaml, but overall it looks great.

Comment thread .github/workflows/_tests.yaml
Comment thread .github/workflows/_tests.yaml Outdated
Comment thread .github/workflows/_tests.yaml
Comment thread tests/preset-machine/task.yaml

Reviewer asked for the bases input and critical/extras matrix configs
to call out their JSON structure explicitly so the jq downstream reads
more clearly.
@tonyandrewmeyer tonyandrewmeyer requested a review from tromai June 30, 2026 09:27

2 participants