[MCO-1997] Add test for osImageStream #5881

Open
ptalgulk01 wants to merge 1 commit into openshift:main from ptalgulk01:add-osimagestream-tests

Conversation

Contributor

@ptalgulk01 ptalgulk01 commented Apr 24, 2026

Test Cases Added

OCP-88122: Validate osImageStream inheritance for custom MCPs
Validates dynamic inheritance behavior in custom MachineConfigPools:

  • Case 1: Custom MCPs inherit worker pool's default osImageStream (rhel-9)
  • Case 2: Dynamic inheritance - an infra MCP created before the worker osImageStream change inherits the new value (rhel-10)
  • Case 3: Custom MCPs can override with explicit osImageStream values

OCP-88708: Verify osstream logs for boot image controller
Validates boot image controller logging for unsupported osImageStream values:

  • Verifies the controller logs appropriate warning messages
  • Changes to a supported stream and verifies the log messages stop
  • Tests CPMS integration with managed boot images

OCP-88203: Verify MOSB triggered when osImageStream is patched
Validates MachineOSBuild resource creation when osImageStream changes:

  • Patches custom MCP osImageStream to rhel-10
  • Verifies MOSB resource is created and completes successfully
  • Validates node uses correct osImage from new stream

OCP-88365: Verify MCP degraded when osstream and osImageURL MC applied
Validates error handling for conflicting osstream and osImageURL configurations:

  • Applies MachineConfig with both osstream and osImageURL fields
  • Verifies MCP enters degraded state with appropriate error message
  • Tests recovery after removing conflicting configuration

Implementation Details

  • Uses custom-machine-config-pool.yaml template for MCPs without osImageStream field (dynamic inheritance)
  • Adds AddNodeToPool() helper method to simplify node addition to custom pools
  • Uses NewController().GetLogs() for boot image controller log verification
  • Validates osImage on nodes matches expected stream values

Summary by CodeRabbit

  • Tests
    • Added test infrastructure for node pool configuration operations with synchronization validation.
    • Added test cases covering OS image stream behavior across pool configurations, including inheritance, updates, conflict detection, and logging scenarios.

Add OCP-88122 test case to validate that custom MachineConfigPools
dynamically inherit osImageStream from worker pool when not explicitly
configured.

Test validates three scenarios:
1. Custom MCPs inherit worker pool's default osImageStream (rhel-9)
2. Custom MCP created before worker pool osImageStream change inherits
   the new value (rhel-10) dynamically
3. Custom MCPs can override with explicit osImageStream values

Also adds AddNodeToPool helper method to simplify adding nodes to
custom pools during testing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Walkthrough

Adds a test helper method for moving nodes into machine config pools and introduces four new disruptive test cases validating osImageStream behavior across MachineConfigPools, MachineOSBuild triggering, conflict detection, and boot-image-controller logging.

Changes

Cohort / File(s): Node Pool Helper Method (test/extended-priv/machineconfigpool.go)
Summary: Adds an AddNodeToPool() helper that applies the pool label to a node and waits for MCP status synchronization (machine count increment and Updated condition).

Cohort / File(s): osImageStream Test Cases (test/extended-priv/mco_osimagestream.go)
Summary: Adds four new Ginkgo test cases: inheritance validation across custom MCPs (OCP-88122), MachineOSBuild triggering on osImageStream patch (OCP-88203), degradation handling for osImageURL/osImageStream conflicts (OCP-88365), and boot-image-controller logging for unsupported streams with CPMS integration (OCP-88708).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 9 passed | ❌ 3 failed

❌ Failed checks (3 warnings)

• Test Structure And Quality: ⚠️ Warning
  Explanation: The pull request contains four test quality issues: an AddNodeToPool race condition, missing MCP recovery verification, incorrect Eventually vs Consistently usage, and unfiltered MachineSet selection.
  Resolution: Fix the AddNodeToPool label check, add explicit recovery assertions in OCP-88365, replace Eventually().ShouldNot() with Consistently().ShouldNot() in OCP-88708, and filter the MachineSet by the worker role label.
• Microshift Test Compatibility: ⚠️ Warning
  Explanation: The PR adds four new e2e tests (OCP-88122, OCP-88203, OCP-88365, OCP-88708) that use machine.openshift.io APIs (MachineConfigPool, MachineOSBuild, MachineSet), which are unavailable on MicroShift.
  Resolution: Add [apigroup:machine.openshift.io] tags to test names or [Skipped:MicroShift] labels to keep these tests from running in MicroShift CI environments.
• Single Node OpenShift (SNO) Test Compatibility: ⚠️ Warning
  Explanation: Tests OCP-88122, OCP-88203, and OCP-88708 use AddNodeToPool() and GetWorkerNodes() to move worker nodes between MachineConfigPools, which would break SNO clusters by moving the only node. No SNO protection mechanisms ([Skipped:SingleReplicaTopology] labels or IsSingleNode() checks) are present.
  Resolution: Add [Skipped:SingleReplicaTopology] labels to multi-node tests, or add exutil.IsSingleNode() checks with g.Skip() guards. Run the tests against the SNO CI job: periodic-ci-openshift-release-master-ci-4.22-e2e-aws-upgrade-ovn-single-node.
✅ Passed checks (9 passed)

• Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
• Title Check: ✅ Passed. The title '[MCO-1997] Add test for osImageStream' partially relates to the changeset. It captures one test file's addition (osImageStream tests) but omits the significant helper method added to MachineConfigPool, making it incomplete for the full scope of changes.
• Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.
• Linked Issues Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
• Out of Scope Changes Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
• Stable And Deterministic Test Names: ✅ Passed. All test names are stable and deterministic, with no dynamic information like pod names, timestamps, UUIDs, or IP addresses in test titles.
• Topology-Aware Scheduling Compatibility: ✅ Passed. The PR adds topology-aware test code and manifests for MachineConfigPool resources without introducing pod scheduling constraints that would break SNO, Two-Node, or HyperShift topologies.
• OTE Binary Stdout Contract: ✅ Passed. Both test helper files contain no process-level functions or process-level stdout writes that would violate the OTE Binary Stdout Contract.
• IPv6 And Disconnected Network Test Compatibility: ✅ Passed. Test files contain no hardcoded IPv4 addresses, no IPv4-specific network operations, no external registry references, and no external connectivity requirements at test execution time.


@openshift-ci
Contributor

openshift-ci Bot commented Apr 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ptalgulk01
Once this PR has been reviewed and has the lgtm label, please assign dkhater-redhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (2)
test/extended-priv/mco_osimagestream.go (2)

317-319: Redundant waitForComplete after AddNodeToPool.

AddNodeToPool already calls WaitImmediateForUpdatedStatus internally, so this explicit waitForComplete adds only additional wait time without additional verification. If the extra wait is intentional to guard against the TDZ-style race flagged in AddNodeToPool, prefer hardening the helper instead of layering waits at each call site.

♻️ Proposed trim
 		exutil.By("Add a node to infra pool")
 		err = infraMcp.AddNodeToPool(mcp.GetSortedNodesOrFail()[0])
 		o.Expect(err).NotTo(o.HaveOccurred(), "Error adding node to infra pool")
 		logger.Infof("OK!\n")
 
-		exutil.By("Wait for infra pool to complete update with the new node")
-		infraMcp.waitForComplete()
-		logger.Infof("OK!\n")
-
 		exutil.By("Verify node in infra pool inherited rhel-10 from worker pool")
 		validateOsImageStreamInPool(infraMcp, osis, OSImageStreamRHEL10)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 317 - 319, The call
site uses infraMcp.waitForComplete() immediately after AddNodeToPool, which is
redundant because AddNodeToPool already invokes WaitImmediateForUpdatedStatus;
remove the extra infraMcp.waitForComplete() call (or the line invoking
logger.Infof("OK!\n") should remain but without the redundant wait) and instead,
if guarding against the TDZ-style race is required, harden
AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool to
perform any additional verification or retry logic around
WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and
AddNodeToPool) do not need to layer waits.

469-481: Tighten test robustness against controller log message changes.

The test hardcodes "machineset tc-%s-dup has unsupported stream" at line 475, while the actual controller log is "machineset %s has unsupported stream: %v, skipping boot image update" (per pkg/controller/bootimage/ms_helpers.go:126). Using ContainSubstring at lines 505/518 masks this coupling—the test currently passes because its substring is contained in the actual log, but if the controller reformats the message (capitalization, phrasing, additional fields), the test becomes brittle and may silently break.

For consistency with line 541, which already uses the looser "has unsupported stream" substring, consider updating lines 475/505/518 to match on a stable substring anchor (e.g., duplicateMsName + " has unsupported stream" or just "has unsupported stream") rather than the full hardcoded message.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 469 - 481, Replace the
overly specific expected log string in the test and assertions with a stable
substring anchor: update the variable unsupportedStreamLogMsg (and any
assertions referencing it) to use a looser match such as duplicateMsName + " has
unsupported stream" or simply "has unsupported stream", and ensure the
ContainSubstring checks against controller log output (where NewController,
duplicateMsName and unsupportedStreamLogMsg are used) use that substring to
avoid brittle exact-format coupling with ms_helpers.go log text.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d3499c12-6325-4124-af9e-d28cce7052b6

📥 Commits

Reviewing files that changed from the base of the PR and between 5f1b7ad and 043492b.

📒 Files selected for processing (2)
  • test/extended-priv/machineconfigpool.go
  • test/extended-priv/mco_osimagestream.go

Comment on lines +1552 to +1577
// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
	currentNodes, err := mcp.GetNodes()
	if err != nil {
		return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err)
	}
	expectedCount := len(currentNodes) + 1

	err = node.AddLabel(fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName()), "")
	if err != nil {
		return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err)
	}
	logger.Infof("Node %s added to pool %s", node.GetName(), mcp.GetName())

	err = mcp.WaitForMachineCount(expectedCount, 5*time.Minute)
	if err != nil {
		return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err)
	}

	err = mcp.WaitImmediateForUpdatedStatus()
	if err != nil {
		return fmt.Errorf("the %s MCP is not updated: %w", mcp.GetName(), err)
	}

	return nil
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
ast-grep --pattern $'func (n *Node) HasLabel($_, $_) $_'
rg -nP --type=go -C2 '\bfunc \(.*Node\) (Has|Get)Label\b'

Repository: openshift/machine-config-operator

Length of output: 59


🏁 Script executed:

# Find the Node type definition and its methods
fd -e go test/extended-priv/
head -100 test/extended-priv/machineconfigpool.go

# Search for Node type definition
rg -A 20 "^type Node struct" test/extended-priv/

# Search for any label-related methods on Node
rg "func.*Node.*Label" test/extended-priv/

# Check what methods Node actually has
rg "func \(.*\*?Node\)" test/extended-priv/ | head -30

Repository: openshift/machine-config-operator

Length of output: 7984


🏁 Script executed:

# Also search for where AddLabel is implemented/used elsewhere
rg -B 5 -A 10 "AddLabel" test/extended-priv/ | head -80

Repository: openshift/machine-config-operator

Length of output: 5884


🏁 Script executed:

# Check how WaitImmediateForUpdatedStatus is implemented
rg -A 15 "func.*WaitImmediateForUpdatedStatus" test/extended-priv/

# Check WaitForMachineCount implementation  
rg -A 15 "func.*WaitForMachineCount" test/extended-priv/

# Check GetNodes to understand what it returns
rg -A 10 "func.*GetNodes\b" test/extended-priv/

Repository: openshift/machine-config-operator

Length of output: 5386


Consider the race condition with immediate polling and pre-existing node membership.

Two minor concerns with this helper:

  1. Pre-existing membership timeout: If the node already has the node-role.kubernetes.io/<pool> label, it will already be included in currentNodes and the expectation of len(currentNodes) + 1 machines will never be satisfied, causing WaitForMachineCount to time out. A check using GetLabel() before labeling would avoid this unnecessary wait.

  2. Race with WaitImmediateForUpdatedStatus: The immediate polling (immediate=true) means WaitImmediateForUpdatedStatus can observe the stale Updated=True status from a prior quiesced state, returning success before the MCP controller has actually begun reconciling the newly-added node. While WaitForMachineCount helps bridge this, consider explicitly waiting for Updating=True first as a handshake before checking for Updated=True, or use updatedMachineCount == machineCount as a synchronization signal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/machineconfigpool.go` around lines 1552 - 1577, In
AddNodeToPool, avoid the race where a pre-labeled node inflates currentNodes and
causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).
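
A minimal sketch of a hardened AddNodeToPool along these lines (node.GetLabel's exact signature and a WaitForUpdatingStatus helper are assumptions, not confirmed API in test/extended-priv):

// AddNodeToPool adds a node to the MCP and waits for the MCP to pick it up.
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
	labelKey := fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName())

	// Concern 1: skip the label write (and the +1 count expectation) when the
	// node already belongs to the pool, so WaitForMachineCount cannot hang.
	if label, err := node.GetLabel(labelKey); err == nil && label != "" { // assumed signature
		logger.Infof("Node %s is already in pool %s", node.GetName(), mcp.GetName())
		return mcp.WaitImmediateForUpdatedStatus()
	}

	currentNodes, err := mcp.GetNodes()
	if err != nil {
		return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err)
	}

	if err := node.AddLabel(labelKey, ""); err != nil {
		return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err)
	}

	if err := mcp.WaitForMachineCount(len(currentNodes)+1, 5*time.Minute); err != nil {
		return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err)
	}

	// Concern 2: handshake on reconciliation having started before asserting it
	// finished, so a stale Updated=True from the prior quiesced state is not
	// accepted. Polling for updatedMachineCount == machineCount may be a more
	// robust signal if the Updating window can be missed.
	if err := mcp.WaitForUpdatingStatus(); err != nil { // assumed helper
		return fmt.Errorf("the %s MCP never started updating: %w", mcp.GetName(), err)
	}
	return mcp.WaitImmediateForUpdatedStatus()
}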

Comment on lines +456 to +466
exutil.By("Verify MCP is degraded with error about osImageURL and osImageStream conflict")
o.Eventually(customMcp, "5m", "20s").Should(HaveConditionField("RenderDegraded", "status", TrueString),
"MCP should be degraded when both osImageURL and osImageStream are set")

expectedErrorMsg := "cannot override MachineConfig osImageURL and set MachineConfigPool spec.osImageStream.name simultaneously"
o.Eventually(customMcp.Get, "2m", "10s").WithArguments(`{.status.conditions[?(@.type=="RenderDegraded")].message}`).Should(
o.ContainSubstring(expectedErrorMsg),
"MCP degraded message should mention osImageURL and osImageStream conflict")
logger.Infof("MCP correctly degraded with expected error message")
logger.Infof("OK!\n")
})

⚠️ Potential issue | 🟡 Minor

Missing explicit recovery assertion after removing the conflicting MC.

The PR summary for OCP-88365 states this test "tests recovery after removing the conflict", but the body only asserts the degraded state and relies on the deferred mc.DeleteWithWait() / DeleteCustomMCP() for cleanup — it doesn't Eventually assert that the MCP transitions back to RenderDegraded=False and Updated=True after the conflicting MC is removed. Without that assertion, a regression where the pool stays wedged in degraded would be silently masked (cleanup would likely time out or log-only error in defer).

Consider adding an explicit "remove the MC and verify recovery" step before the defers kick in.

🛡️ Suggested additional step
 		logger.Infof("MCP correctly degraded with expected error message")
 		logger.Infof("OK!\n")
+
+		exutil.By("Remove the conflicting MachineConfig and verify MCP recovers")
+		mc.DeleteWithWait()
+		o.Eventually(customMcp, "10m", "20s").Should(HaveConditionField("RenderDegraded", "status", FalseString),
+			"MCP should recover from RenderDegraded after removing the conflicting MC")
+		o.Expect(customMcp.WaitForUpdatedStatus()).To(o.Succeed(), "MCP should become Updated after recovery")
+		logger.Infof("OK!\n")
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 456 - 466, After
asserting the MCP is degraded, remove the conflicting MachineConfig (call the
same removal used in cleanup, e.g. invoke mc.DeleteWithWait() or
DeleteCustomMCP()) and then explicitly assert recovery by polling the customMcp:
use o.Eventually(customMcp, "5m",
"20s").Should(HaveConditionField("RenderDegraded", "status", FalseString)) and
also check Updated condition true (e.g. HaveConditionField("Updated", "status",
TrueString)) to ensure the MCP transitions back to healthy; place these
assertions after deleting the MC and before relying on deferred cleanup to catch
regressions.

Comment on lines +483 to +492
exutil.By(fmt.Sprintf("Create duplicate MachineSet %s from existing worker MachineSet", duplicateMsName))
existingMsList := NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail()
o.Expect(len(existingMsList)).To(o.BeNumerically(">", 0), "At least one MachineSet should exist")
duplicateMs, err = existingMsList[0].Duplicate(duplicateMsName)
o.Expect(err).NotTo(o.HaveOccurred(), "Error duplicating MachineSet")
defer func() {
logger.Infof("Cleaning up duplicate MachineSet %s", duplicateMsName)
duplicateMs.Delete()
}()
logger.Infof("OK!\n")

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find how MachineSet role is inspected elsewhere
rg -nP --type=go -C2 '(cluster-api-machine-role|machine\.openshift\.io/cluster-api-machine-role)'
ast-grep --pattern $'func ($_ *MachineSet) $_() $$$'

Repository: openshift/machine-config-operator

Length of output: 6496


🏁 Script executed:

#!/bin/bash
# Search for MachineSet.Get() usages to understand field path format
rg -nP --type=go 'ms\.Get\(' -A2 -B2
# Search for worker role references
rg -nP --type=go '(worker|cluster-api-machine-role)' --glob='*test*' | head -30

Repository: openshift/machine-config-operator

Length of output: 11298


🏁 Script executed:

#!/bin/bash
# Find MachineSet.Duplicate() and Delete() methods
rg -nP --type=go 'func \(.*MachineSet.*\) (Duplicate|Delete)' -A3
# Search for existing worker filter patterns or helper functions
rg -nP --type=go 'NewMachineSetList|GetAllOrFail|ByLabel' -A2 -B2 | head -50

Repository: openshift/machine-config-operator

Length of output: 3954


🏁 Script executed:

#!/bin/bash
# Find the Delete() method for MachineSet
rg -nP --type=go 'func \(.*MachineSet.*\) Delete' -A3
# Search for ByLabel method on MachineSetList
rg -nP --type=go 'func.*MachineSetList.*ByLabel' -A3
# Check CloneResource to understand Duplicate's implementation
rg -nP --type=go 'func CloneResource' -A10

Repository: openshift/machine-config-operator

Length of output: 852


🏁 Script executed:

#!/bin/bash
# Find Delete method more broadly
rg -nP --type=go 'Delete.*\(\)' test/extended-priv/machineset.go -B2 -A5
# Also search in resource.go for Delete method
rg -nP --type=go 'func \(.*\) Delete' test/extended-priv/resource.go -B2 -A5
# Look for MachineSetList ByLabel
rg -nP --type=go '\.ByLabel' test/extended-priv/machineset.go -B2 -A2
# Check if Delete is called elsewhere in the codebase
rg -nP --type=go 'duplicateMs\.Delete\(\)|\.Delete\(\)' test/extended-priv/ -B1 -A1

Repository: openshift/machine-config-operator

Length of output: 16087


🏁 Script executed:

#!/bin/bash
# Find ByLabel implementation on ResourceList or MachineSetList
rg -nP --type=go 'func.*ByLabel' test/extended-priv/ -A3
# Check what labels are used to identify worker MachineSets
rg -nP --type=go 'worker.*MachineSet|MachineSet.*worker' test/extended-priv/ -B2 -A2
# Look for role-based filtering in MachineSet selection
rg -nP --type=go 'cluster-api-machine-role' test/extended-priv/ -B2 -A2

Repository: openshift/machine-config-operator

Length of output: 2623


🏁 Script executed:

#!/bin/bash
# Check if there are other instances of filtering worker MachineSets
rg -nP --type=go 'MachineSetList.*GetAll|NewMachineSetList' test/extended-priv/ -A5 | head -40
# Look for worker pool constants
rg -nP --type=go 'MachineConfigPoolWorker|MachineAPINamespace' test/extended-priv/const.go -B2 -A2

Repository: openshift/machine-config-operator

Length of output: 3399


Filter MachineSet selection to worker role and handle the cleanup deletion error.

Two issues in this block:

  1. existingMsList[0] picks the first MachineSet in API order, not necessarily a worker. Clusters with infra or other MachineSets will cause subsequent assertions tied to osstream labels to fail or produce misleading results. The comment itself says "from existing worker MachineSet"—filter explicitly using:

    existingMsList.ByLabel("machine.openshift.io/cluster-api-machine-role=worker")
  2. The deferred duplicateMs.Delete() silently discards errors. If deletion fails, the duplicate MachineSet and its provisioned machines leak into subsequent tests. Log the error:

    if derr := duplicateMs.Delete(); derr != nil {
        logger.Errorf("Failed to delete duplicate MachineSet %s: %v", duplicateMsName, derr)
    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 483 - 492, The code
selects the first MachineSet from existingMsList and doesn't handle delete
errors; update the selection to filter only worker MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.

Comment on lines +517 to +521
exutil.By("Verify controller no longer logs unsupported stream message for this MachineSet")
o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
"Controller should not log unsupported stream message after changing to supported stream")
logger.Infof("Controller correctly stopped logging unsupported stream for %s", duplicateMsName)
logger.Infof("OK!\n")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Gomega Eventually ShouldNot vs Consistently ShouldNot semantics for asserting a condition never becomes true

💡 Result:

Gomega's Eventually.ShouldNot and Consistently.ShouldNot have distinct semantics for asserting that a condition never becomes true.

Eventually(condition).ShouldNot(matcher) polls until the matcher fails (i.e., the condition does NOT match) or the timeout elapses. It succeeds as soon as any single poll fails to match, so a transient non-match causes immediate success and the rest of the window is never observed. This makes it rarely useful for "never becomes true" assertions.

Consistently(condition).ShouldNot(matcher) polls for the full duration and succeeds only if the matcher fails at EVERY poll (i.e., the condition never matches). It fails immediately if the matcher ever succeeds during polling. This is the idiomatic way to assert that a condition "never becomes true" over a time window. Example from the official docs:

    Consistently(channel, "200ms").ShouldNot(Receive) // asserts nothing is received for 200ms

To assert "never becomes true" over longer horizons, use longer durations with Consistently, or combine patterns such as a separate Eventually(finished) followed by Expect(successful).


🏁 Script executed:

cd test/extended-priv && wc -l mco_osimagestream.go

Repository: openshift/machine-config-operator

Length of output: 102


🏁 Script executed:

cd test/extended-priv && sed -n '515,545p' mco_osimagestream.go

Repository: openshift/machine-config-operator

Length of output: 1659


Replace Eventually(...).ShouldNot(...) with Consistently(...).ShouldNot(...) for log-absence verification.

After controller.IgnoreLogsBeforeNowOrFail() clears logs, GetLogs() returns an empty buffer that trivially does not contain the target substring. Eventually(...).ShouldNot(...) succeeds immediately on the first poll and never verifies the log stays absent over the full 2-minute window. A regression where the controller emits the unsupported-stream log after switching to a supported stream would not be caught.

Consistently(...).ShouldNot(...) enforces the matcher fails at every poll throughout the entire duration, which is the correct assertion for "this message must never appear."

Proposed fix
 		exutil.By("Verify controller no longer logs unsupported stream message for this MachineSet")
-		o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
+		o.Consistently(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
 			"Controller should not log unsupported stream message after changing to supported stream")
 		exutil.By("Verify boot image controller does not log unsupported stream for CPMS")
 		controller.IgnoreLogsBeforeNowOrFail()
-		o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring("has unsupported stream"),
+		o.Consistently(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring("has unsupported stream"),
 			"Boot image controller should not log unsupported stream message for CPMS")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 517 - 521, The test
currently uses Eventually(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) which can pass
immediately because controller.IgnoreLogsBeforeNowOrFail() clears logs; change
this to Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) so the assertion
verifies the unsupportedStreamLogMsg never appears for the entire duration;
update the call site where controller.GetLogs and unsupportedStreamLogMsg are
referenced and ensure the same polling interval arguments are used with
Consistently instead of Eventually.

@openshift-ci
Contributor

openshift-ci Bot commented Apr 24, 2026

@ptalgulk01: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/unit | Commit: 043492b | Required: true | Rerun command: /test unit

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

}

// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
Contributor


We can't add nodes to a master pool, for example, so we can't define this function as a general method on the MCP struct.

It would be better to write it as an independent function that accepts an MCP as a parameter, rather than as a method on the MCP struct.

This code is very similar to the one in CreateCustomMCPWithStreamByNodes.

We can make the function add several nodes, not just one, to the pool and reuse it in CreateCustomMCPWithStreamByNodes, avoiding duplicated code; see the sketch below.
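
A rough sketch of that standalone, variadic helper (the name and placement are illustrative; it only reuses methods already shown in this PR):

// AddNodesToPool labels the given nodes into the pool and waits for the MCP to
// pick them up. As a free function, callers decide which pools it is valid for
// (e.g., never the master pool).
func AddNodesToPool(mcp *MachineConfigPool, nodes ...*Node) error {
	currentNodes, err := mcp.GetNodes()
	if err != nil {
		return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err)
	}

	labelKey := fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName())
	for _, node := range nodes {
		if err := node.AddLabel(labelKey, ""); err != nil {
			return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err)
		}
	}

	if err := mcp.WaitForMachineCount(len(currentNodes)+len(nodes), 5*time.Minute); err != nil {
		return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err)
	}
	return mcp.WaitImmediateForUpdatedStatus()
}

CreateCustomMCPWithStreamByNodes could then call AddNodesToPool(mcp, nodes...) instead of carrying its own copy of this logic.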

Comment on lines +240 to +241
g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
Contributor


Let's use the SkipIfCompactOrSNO function that is used in the private repo. It seems it hasn't been migrated yet, but it should be migrated anyway.

// SkipIfCompactOrSNO skips the test case if the cluster is a compact or SNO cluster
func SkipIfCompactOrSNO(oc *exutil.CLI) {
	if IsCompactOrSNOCluster(oc) {
		g.Skip("The test is not supported in Compact or SNO clusters")
	}
}

Comment on lines +348 to +350
		if IsCompactOrSNOCluster(oc.AsAdmin()) {
			g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
		}
Contributor


Same, let's migrate and use the SkipIfCompactOrSNO function

Comment on lines +393 to +399
		o.Eventually(func() string {
			currentMOSB, err := mosc.GetCurrentMachineOSBuild()
			if err != nil {
				return ""
			}
			return currentMOSB.GetName()
		}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName), "A new MachineOSBuild should be triggered after osImageStream change")
Contributor


It is cleaner if we don't swallow the error silently

	o.Eventually(func() (string, error) {
		mosb, err := mosc.GetCurrentMachineOSBuild()
		if err != nil {
			return "", err
		}
		return mosb.GetName(), nil
	}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName),
		"A new MachineOSBuild should be triggered after osImageStream change")

Comment on lines +408 to +410
		if IsCompactOrSNOCluster(oc.AsAdmin()) {
			g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
		}
Contributor


Same, let's use the SkipIfCompactOrSNO function.

logger.Infof("OK!\n")

exutil.By(fmt.Sprintf("Set osstream label to %s on MachineSet", OSImageStreamRHEL10))
err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL10))
Contributor


We need a method for this so that it can be reused in future tests.

We could use AddLabel instead, but I think it would be better to use an even more specific method so that we don't have to care about the name of the label:

duplicateMs.AddOSStreamLabel(...)

This function can then call AddLabel internally; a sketch follows.
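
A possible shape for that method, assuming MachineSet exposes the same generic AddLabel helper the review mentions:

// osStreamLabelKey is the label the boot image controller reads the stream from.
const osStreamLabelKey = "machineconfiguration.openshift.io/osstream"

// AddOSStreamLabel sets the osstream label on the MachineSet, so callers don't
// hardcode the label key or hand-build a JSON merge patch.
func (ms *MachineSet) AddOSStreamLabel(stream string) error {
	return ms.AddLabel(osStreamLabelKey, stream)
}

The call above would then become: err = duplicateMs.AddOSStreamLabel(OSImageStreamRHEL10)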

		// Reset log checkpoint before changing to supported stream to verify the new behavior
		exutil.By("Change osstream label to rhel-9 and verify no unsupported stream log is triggered")
		controller.IgnoreLogsBeforeNowOrFail()
		err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL9))
Contributor


Same as above.

Comment on lines +523 to +534
exutil.By("Enable CPMS for managed boot images")
cpms = NewControlPlaneMachineSet(oc.AsAdmin(), MachineAPINamespace, ControlPlaneMachineSetName)
o.Expect(cpms.Exists()).To(o.BeTrue(), "ControlPlaneMachineSet should exist for this test")

machineConfig = GetMachineConfiguration(oc.AsAdmin())
defer machineConfig.RemoveManagedBootImagesConfig()

o.Expect(
machineConfig.SetAllManagedBootImagesConfig(ControlPlaneMachineSetResource),
).To(o.Succeed(), "Error enabling CPMS for managed boot images")
logger.Infof("OK!\n")

Contributor


CPMS and MachineSets are two different functionalities.

We should cover the CPMS scenario in a separate test case.

Comment on lines +300 to +302
		err = NewMCOTemplate(oc.AsAdmin(), "custom-machine-config-pool.yaml").Create(
			"-p", fmt.Sprintf("NAME=%s", infraName),
		)
Contributor


Call CreateCustomMCP


exutil.By("Create custom MCP with rhel-9 configured")
defer DeleteCustomMCP(oc.AsAdmin(), mcp1Name)
mcp1, err := CreateCustomMCP(oc.AsAdmin(), mcp1Name, 1)
Contributor


CreateCustomMCPWithStream(oc.AsAdmin(), mcp2Name, OSImageStreamRHEL9, 1)
