[MCO-1997] Add test for osImageStream (#5881)
Conversation
Add OCP-88122 test case to validate that custom MachineConfigPools dynamically inherit osImageStream from the worker pool when not explicitly configured. The test validates three scenarios:

1. Custom MCPs inherit the worker pool's default osImageStream (rhel-9)
2. A custom MCP created before the worker pool's osImageStream change inherits the new value (rhel-10) dynamically
3. Custom MCPs can override with explicit osImageStream values

Also adds an AddNodeToPool helper method to simplify adding nodes to custom pools during testing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pipeline controller notification: this repository is configured in LGTM mode.
Walkthrough: Adds a test helper method for moving nodes into machine config pools and introduces four new disruptive test cases validating osImageStream behavior across MachineConfigPools, MachineOSBuild triggering, conflict detection, and boot-image-controller logging.
Estimated code review effort: 🎯 3 (Moderate), ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 9 passed, ❌ 3 failed (3 warnings)
✅ Passed checks (9 passed)
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: ptalgulk01. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files; approvers can indicate their approval by writing `/approve` in a comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
test/extended-priv/mco_osimagestream.go (2)
317-319: Redundant `waitForComplete` after `AddNodeToPool`.
`AddNodeToPool` already calls `WaitImmediateForUpdatedStatus` internally, so this explicit `waitForComplete` adds only additional wait time without additional verification. If the extra wait is intentional to guard against the TDZ-style race flagged in `AddNodeToPool`, prefer hardening the helper instead of layering waits at each call site.

♻️ Proposed trim

```diff
 exutil.By("Add a node to infra pool")
 err = infraMcp.AddNodeToPool(mcp.GetSortedNodesOrFail()[0])
 o.Expect(err).NotTo(o.HaveOccurred(), "Error adding node to infra pool")
 logger.Infof("OK!\n")

-exutil.By("Wait for infra pool to complete update with the new node")
-infraMcp.waitForComplete()
-logger.Infof("OK!\n")
-
 exutil.By("Verify node in infra pool inherited rhel-10 from worker pool")
 validateOsImageStreamInPool(infraMcp, osis, OSImageStreamRHEL10)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended-priv/mco_osimagestream.go` around lines 317 - 319, The call site uses infraMcp.waitForComplete() immediately after AddNodeToPool, which is redundant because AddNodeToPool already invokes WaitImmediateForUpdatedStatus; remove the extra infraMcp.waitForComplete() call (or the line invoking logger.Infof("OK!\n") should remain but without the redundant wait) and instead, if guarding against the TDZ-style race is required, harden AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool to perform any additional verification or retry logic around WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and AddNodeToPool) do not need to layer waits.
469-481: Tighten test robustness against controller log message changes.
The test hardcodes `"machineset tc-%s-dup has unsupported stream"` at line 475, while the actual controller log is `"machineset %s has unsupported stream: %v, skipping boot image update"` (per pkg/controller/bootimage/ms_helpers.go:126). Using `ContainSubstring` at lines 505/518 masks this coupling: the test currently passes because its substring is contained in the actual log, but if the controller reformats the message (capitalization, phrasing, additional fields), the test becomes brittle and may silently break.

For consistency with line 541, which already uses the looser `"has unsupported stream"` substring, consider updating lines 475/505/518 to match on a stable substring anchor (e.g., `duplicateMsName + " has unsupported stream"` or just `"has unsupported stream"`) rather than the full hardcoded message.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended-priv/mco_osimagestream.go` around lines 469 - 481, Replace the overly specific expected log string in the test and assertions with a stable substring anchor: update the variable unsupportedStreamLogMsg (and any assertions referencing it) to use a looser match such as duplicateMsName + " has unsupported stream" or simply "has unsupported stream", and ensure the ContainSubstring checks against controller log output (where NewController, duplicateMsName and unsupportedStreamLogMsg are used) use that substring to avoid brittle exact-format coupling with ms_helpers.go log text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended-priv/machineconfigpool.go`:
- Around line 1552-1577: In AddNodeToPool, avoid the race where a pre-labeled
node inflates currentNodes and causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 456-466: After asserting the MCP is degraded, remove the
conflicting MachineConfig (call the same removal used in cleanup, e.g. invoke
mc.DeleteWithWait() or DeleteCustomMCP()) and then explicitly assert recovery by
polling the customMcp: use o.Eventually(customMcp, "5m",
"20s").Should(HaveConditionField("RenderDegraded", "status", FalseString)) and
also check Updated condition true (e.g. HaveConditionField("Updated", "status",
TrueString)) to ensure the MCP transitions back to healthy; place these
assertions after deleting the MC and before relying on deferred cleanup to catch
regressions.
- Around line 517-521: The test currently uses Eventually(controller.GetLogs,
"2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) which can
pass immediately because controller.IgnoreLogsBeforeNowOrFail() clears logs;
change this to Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) so the assertion
verifies the unsupportedStreamLogMsg never appears for the entire duration;
update the call site where controller.GetLogs and unsupportedStreamLogMsg are
referenced and ensure the same polling interval arguments are used with
Consistently instead of Eventually.
- Around line 483-492: The code selects the first MachineSet from existingMsList
and doesn't handle delete errors; update the selection to filter only worker
MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.
---
Nitpick comments:
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 317-319: The call site uses infraMcp.waitForComplete() immediately
after AddNodeToPool, which is redundant because AddNodeToPool already invokes
WaitImmediateForUpdatedStatus; remove the extra infraMcp.waitForComplete() call
(or the line invoking logger.Infof("OK!\n") should remain but without the
redundant wait) and instead, if guarding against the TDZ-style race is required,
harden AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool
to perform any additional verification or retry logic around
WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and
AddNodeToPool) do not need to layer waits.
- Around line 469-481: Replace the overly specific expected log string in the
test and assertions with a stable substring anchor: update the variable
unsupportedStreamLogMsg (and any assertions referencing it) to use a looser
match such as duplicateMsName + " has unsupported stream" or simply "has
unsupported stream", and ensure the ContainSubstring checks against controller
log output (where NewController, duplicateMsName and unsupportedStreamLogMsg are
used) use that substring to avoid brittle exact-format coupling with
ms_helpers.go log text.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d3499c12-6325-4124-af9e-d28cce7052b6
📒 Files selected for processing (2)
test/extended-priv/machineconfigpool.go, test/extended-priv/mco_osimagestream.go
```go
// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
	currentNodes, err := mcp.GetNodes()
	if err != nil {
		return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err)
	}
	expectedCount := len(currentNodes) + 1

	err = node.AddLabel(fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName()), "")
	if err != nil {
		return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err)
	}
	logger.Infof("Node %s added to pool %s", node.GetName(), mcp.GetName())

	err = mcp.WaitForMachineCount(expectedCount, 5*time.Minute)
	if err != nil {
		return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err)
	}

	err = mcp.WaitImmediateForUpdatedStatus()
	if err != nil {
		return fmt.Errorf("the %s MCP is not updated: %w", mcp.GetName(), err)
	}

	return nil
}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
ast-grep --pattern $'func (n *Node) HasLabel($_, $_) $_'
rg -nP --type=go -C2 '\bfunc \(.*Node\) (Has|Get)Label\b'
Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Find the Node type definition and its methods
fd -e go test/extended-priv/
head -100 test/extended-priv/machineconfigpool.go
# Search for Node type definition
rg -A 20 "^type Node struct" test/extended-priv/
# Search for any label-related methods on Node
rg "func.*Node.*Label" test/extended-priv/
# Check what methods Node actually has
rg "func \(.*\*?Node\)" test/extended-priv/ | head -30
Repository: openshift/machine-config-operator
Length of output: 7984
🏁 Script executed:
# Also search for where AddLabel is implemented/used elsewhere
rg -B 5 -A 10 "AddLabel" test/extended-priv/ | head -80
Repository: openshift/machine-config-operator
Length of output: 5884
🏁 Script executed:
# Check how WaitImmediateForUpdatedStatus is implemented
rg -A 15 "func.*WaitImmediateForUpdatedStatus" test/extended-priv/
# Check WaitForMachineCount implementation
rg -A 15 "func.*WaitForMachineCount" test/extended-priv/
# Check GetNodes to understand what it returns
rg -A 10 "func.*GetNodes\b" test/extended-priv/
Repository: openshift/machine-config-operator
Length of output: 5386
Consider the race condition with immediate polling and pre-existing node membership.
Two minor concerns with this helper:
1. Pre-existing membership timeout: if the node already has the `node-role.kubernetes.io/<pool>` label, it will already be included in `currentNodes` and the expectation of `len(currentNodes) + 1` machines will never be satisfied, causing `WaitForMachineCount` to time out. A check using `GetLabel()` before labeling would avoid this unnecessary wait.

2. Race with `WaitImmediateForUpdatedStatus`: the immediate polling (immediate=true) means `WaitImmediateForUpdatedStatus` can observe the stale `Updated=True` status from a prior quiesced state, returning success before the MCP controller has actually begun reconciling the newly-added node. While `WaitForMachineCount` helps bridge this, consider explicitly waiting for `Updating=True` first as a handshake before checking for `Updated=True`, or use `updatedMachineCount == machineCount` as a synchronization signal.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/machineconfigpool.go` around lines 1552 - 1577, In
AddNodeToPool, avoid the race where a pre-labeled node inflates currentNodes and
causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).
```go
exutil.By("Verify MCP is degraded with error about osImageURL and osImageStream conflict")
o.Eventually(customMcp, "5m", "20s").Should(HaveConditionField("RenderDegraded", "status", TrueString),
	"MCP should be degraded when both osImageURL and osImageStream are set")

expectedErrorMsg := "cannot override MachineConfig osImageURL and set MachineConfigPool spec.osImageStream.name simultaneously"
o.Eventually(customMcp.Get, "2m", "10s").WithArguments(`{.status.conditions[?(@.type=="RenderDegraded")].message}`).Should(
	o.ContainSubstring(expectedErrorMsg),
	"MCP degraded message should mention osImageURL and osImageStream conflict")
logger.Infof("MCP correctly degraded with expected error message")
logger.Infof("OK!\n")
})
```
Missing explicit recovery assertion after removing the conflicting MC.
The PR summary for OCP-88365 states this test "tests recovery after removing the conflict", but the body only asserts the degraded state and relies on the deferred mc.DeleteWithWait() / DeleteCustomMCP() for cleanup — it doesn't Eventually assert that the MCP transitions back to RenderDegraded=False and Updated=True after the conflicting MC is removed. Without that assertion, a regression where the pool stays wedged in degraded would be silently masked (cleanup would likely time out or log-only error in defer).
Consider adding an explicit "remove the MC and verify recovery" step before the defers kick in.
🛡️ Suggested additional step

```diff
 logger.Infof("MCP correctly degraded with expected error message")
 logger.Infof("OK!\n")
+
+exutil.By("Remove the conflicting MachineConfig and verify MCP recovers")
+mc.DeleteWithWait()
+o.Eventually(customMcp, "10m", "20s").Should(HaveConditionField("RenderDegraded", "status", FalseString),
+	"MCP should recover from RenderDegraded after removing the conflicting MC")
+o.Expect(customMcp.WaitForUpdatedStatus()).To(o.Succeed(), "MCP should become Updated after recovery")
+logger.Infof("OK!\n")
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/mco_osimagestream.go` around lines 456 - 466, After
asserting the MCP is degraded, remove the conflicting MachineConfig (call the
same removal used in cleanup, e.g. invoke mc.DeleteWithWait() or
DeleteCustomMCP()) and then explicitly assert recovery by polling the customMcp:
use o.Eventually(customMcp, "5m",
"20s").Should(HaveConditionField("RenderDegraded", "status", FalseString)) and
also check Updated condition true (e.g. HaveConditionField("Updated", "status",
TrueString)) to ensure the MCP transitions back to healthy; place these
assertions after deleting the MC and before relying on deferred cleanup to catch
regressions.
```go
exutil.By(fmt.Sprintf("Create duplicate MachineSet %s from existing worker MachineSet", duplicateMsName))
existingMsList := NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail()
o.Expect(len(existingMsList)).To(o.BeNumerically(">", 0), "At least one MachineSet should exist")
duplicateMs, err = existingMsList[0].Duplicate(duplicateMsName)
o.Expect(err).NotTo(o.HaveOccurred(), "Error duplicating MachineSet")
defer func() {
	logger.Infof("Cleaning up duplicate MachineSet %s", duplicateMsName)
	duplicateMs.Delete()
}()
logger.Infof("OK!\n")
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find how MachineSet role is inspected elsewhere
rg -nP --type=go -C2 '(cluster-api-machine-role|machine\.openshift\.io/cluster-api-machine-role)'
ast-grep --pattern $'func ($_ *MachineSet) $_() $$$'Repository: openshift/machine-config-operator
Length of output: 6496
🏁 Script executed:
#!/bin/bash
# Search for MachineSet.Get() usages to understand field path format
rg -nP --type=go 'ms\.Get\(' -A2 -B2
# Search for worker role references
rg -nP --type=go '(worker|cluster-api-machine-role)' --glob='*test*' | head -30
Repository: openshift/machine-config-operator
Length of output: 11298
🏁 Script executed:
#!/bin/bash
# Find MachineSet.Duplicate() and Delete() methods
rg -nP --type=go 'func \(.*MachineSet.*\) (Duplicate|Delete)' -A3
# Search for existing worker filter patterns or helper functions
rg -nP --type=go 'NewMachineSetList|GetAllOrFail|ByLabel' -A2 -B2 | head -50
Repository: openshift/machine-config-operator
Length of output: 3954
🏁 Script executed:
#!/bin/bash
# Find the Delete() method for MachineSet
rg -nP --type=go 'func \(.*MachineSet.*\) Delete' -A3
# Search for ByLabel method on MachineSetList
rg -nP --type=go 'func.*MachineSetList.*ByLabel' -A3
# Check CloneResource to understand Duplicate's implementation
rg -nP --type=go 'func CloneResource' -A10
Repository: openshift/machine-config-operator
Length of output: 852
🏁 Script executed:
#!/bin/bash
# Find Delete method more broadly
rg -nP --type=go 'Delete.*\(\)' test/extended-priv/machineset.go -B2 -A5
# Also search in resource.go for Delete method
rg -nP --type=go 'func \(.*\) Delete' test/extended-priv/resource.go -B2 -A5
# Look for MachineSetList ByLabel
rg -nP --type=go '\.ByLabel' test/extended-priv/machineset.go -B2 -A2
# Check if Delete is called elsewhere in the codebase
rg -nP --type=go 'duplicateMs\.Delete\(\)|\.Delete\(\)' test/extended-priv/ -B1 -A1
Repository: openshift/machine-config-operator
Length of output: 16087
🏁 Script executed:
#!/bin/bash
# Find ByLabel implementation on ResourceList or MachineSetList
rg -nP --type=go 'func.*ByLabel' test/extended-priv/ -A3
# Check what labels are used to identify worker MachineSets
rg -nP --type=go 'worker.*MachineSet|MachineSet.*worker' test/extended-priv/ -B2 -A2
# Look for role-based filtering in MachineSet selection
rg -nP --type=go 'cluster-api-machine-role' test/extended-priv/ -B2 -A2
Repository: openshift/machine-config-operator
Length of output: 2623
🏁 Script executed:
#!/bin/bash
# Check if there are other instances of filtering worker MachineSets
rg -nP --type=go 'MachineSetList.*GetAll|NewMachineSetList' test/extended-priv/ -A5 | head -40
# Look for worker pool constants
rg -nP --type=go 'MachineConfigPoolWorker|MachineAPINamespace' test/extended-priv/const.go -B2 -A2
Repository: openshift/machine-config-operator
Length of output: 3399
Filter MachineSet selection to worker role and handle the cleanup deletion error.
Two issues in this block:
1. `existingMsList[0]` picks the first MachineSet in API order, not necessarily a worker. Clusters with infra or other MachineSets will cause subsequent assertions tied to osstream labels to fail or produce misleading results. The comment itself says "from existing worker MachineSet", so filter explicitly using:

   existingMsList.ByLabel("machine.openshift.io/cluster-api-machine-role=worker")

2. The deferred `duplicateMs.Delete()` silently discards errors. If deletion fails, the duplicate MachineSet and its provisioned machines leak into subsequent tests. Log the error:

   if derr := duplicateMs.Delete(); derr != nil {
       logger.Errorf("Failed to delete duplicate MachineSet %s: %v", duplicateMsName, derr)
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/mco_osimagestream.go` around lines 483 - 492, The code
selects the first MachineSet from existingMsList and doesn't handle delete
errors; update the selection to filter only worker MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.
```go
exutil.By("Verify controller no longer logs unsupported stream message for this MachineSet")
o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
	"Controller should not log unsupported stream message after changing to supported stream")
logger.Infof("Controller correctly stopped logging unsupported stream for %s", duplicateMsName)
logger.Infof("OK!\n")
```
🧩 Analysis chain
🌐 Web query:
Gomega Eventually ShouldNot vs Consistently ShouldNot semantics for asserting a condition never becomes true
💡 Result:
Gomega's Eventually.ShouldNot and Consistently.ShouldNot have distinct semantics for asserting a condition never becomes true. Eventually(func Condition).ShouldNot(Matcher) polls until the matcher FAILS (i.e., Condition does NOT match Matcher) or timeout. It succeeds if the matcher never succeeds within the timeout period. However, if the condition briefly matches the matcher at any poll, it succeeds immediately. This is rarely useful for "never becomes true" because any transient match causes success; it's more for expecting the negation to eventually hold stably. Consistently(func Condition).ShouldNot(Matcher) polls for the full duration, succeeding only if the matcher FAILS at EVERY poll (i.e., condition never matches). It fails immediately if the matcher ever succeeds during polling. This is the idiomatic way to assert a condition "never becomes true" over a time window. Example from official docs for Consistently.ShouldNot: Consistently(channel, "200ms").ShouldNot(Receive) // asserts nothing received for 200ms For Eventually.ShouldNot, no direct "never" example in docs; its use is niche since it succeeds on first non-match but requires non-match to persist until timeout without ever matching. To assert "never becomes true" indefinitely, use longer durations with Consistently or combine with other patterns like separate Eventually(finished) then Expect(successful).
Citations:
- 1: https://onsi.github.io/gomega/
- 2: https://github.com/onsi/gomega/blob/master/gomega_dsl.go
- 3: https://onsi.github.io/gomega
- 4: https://github.com/onsi/gomega/blob/3eef0d7813628f9e4bcfb93e6dd3e8cd12342cd6/gomega_dsl.go
🏁 Script executed:
cd test/extended-priv && wc -l mco_osimagestream.go
Repository: openshift/machine-config-operator
Length of output: 102
🏁 Script executed:
cd test/extended-priv && sed -n '515,545p' mco_osimagestream.go
Repository: openshift/machine-config-operator
Length of output: 1659
Replace Eventually(...).ShouldNot(...) with Consistently(...).ShouldNot(...) for log-absence verification.
After controller.IgnoreLogsBeforeNowOrFail() clears logs, GetLogs() returns an empty buffer that trivially does not contain the target substring. Eventually(...).ShouldNot(...) succeeds immediately on the first poll and never verifies the log stays absent over the full 2-minute window. A regression where the controller emits the unsupported-stream log after switching to a supported stream would not be caught.
Consistently(...).ShouldNot(...) enforces the matcher fails at every poll throughout the entire duration, which is the correct assertion for "this message must never appear."
Proposed fix
```diff
 exutil.By("Verify controller no longer logs unsupported stream message for this MachineSet")
-o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
+o.Consistently(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
 	"Controller should not log unsupported stream message after changing to supported stream")

 exutil.By("Verify boot image controller does not log unsupported stream for CPMS")
 controller.IgnoreLogsBeforeNowOrFail()
-o.Eventually(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring("has unsupported stream"),
+o.Consistently(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring("has unsupported stream"),
 	"Boot image controller should not log unsupported stream message for CPMS")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
exutil.By("Verify controller no longer logs unsupported stream message for this MachineSet")
o.Consistently(controller.GetLogs, "2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg),
	"Controller should not log unsupported stream message after changing to supported stream")
logger.Infof("Controller correctly stopped logging unsupported stream for %s", duplicateMsName)
logger.Infof("OK!\n")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/mco_osimagestream.go` around lines 517 - 521, The test
currently uses Eventually(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) which can pass
immediately because controller.IgnoreLogsBeforeNowOrFail() clears logs; change
this to Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) so the assertion
verifies the unsupportedStreamLogMsg never appears for the entire duration;
update the call site where controller.GetLogs and unsupportedStreamLogMsg are
referenced and ensure the same polling interval arguments are used with
Consistently instead of Eventually.
@ptalgulk01: The following test failed, say `/retest` to rerun all failed tests:
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
```go
}

// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
```
We can't add nodes to a master pool, for example, so we can't define this function as a general method on a MCP. It would be better to write it as an independent function accepting a MCP as a parameter, not as a method on the MCP struct.
This code is very similar to the one in CreateCustomMCPWithStreamByNodes. We can make the function add several nodes, not only one, to the pool and reuse it in CreateCustomMCPWithStreamByNodes, avoiding duplicated code.
```go
	g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
```
Let's use the SkipIfCompactOrSNO function that is used in the private repo. It seems it hasn't been migrated yet, but it should be migrated anyway.
```go
// SkipIfCompactOrSNO skips the test case if the cluster is a compact or SNO cluster
func SkipIfCompactOrSNO(oc *exutil.CLI) {
	if IsCompactOrSNOCluster(oc) {
		g.Skip("The test is not supported in Compact or SNO clusters")
	}
}
```
```go
if IsCompactOrSNOCluster(oc.AsAdmin()) {
	g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
```
Same, let's migrate and use the SkipIfCompactOrSNO function
```go
o.Eventually(func() string {
	currentMOSB, err := mosc.GetCurrentMachineOSBuild()
	if err != nil {
		return ""
	}
	return currentMOSB.GetName()
}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName), "A new MachineOSBuild should be triggered after osImageStream change")
```
It is cleaner if we don't swallow the error silently
```go
o.Eventually(func() (string, error) {
	mosb, err := mosc.GetCurrentMachineOSBuild()
	if err != nil {
		return "", err
	}
	return mosb.GetName(), nil
}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName),
	"A new MachineOSBuild should be triggered after osImageStream change")
```
```go
if IsCompactOrSNOCluster(oc.AsAdmin()) {
	g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
```
Same let's use the SkipIfCompactOrSNO function
```go
logger.Infof("OK!\n")

exutil.By(fmt.Sprintf("Set osstream label to %s on MachineSet", OSImageStreamRHEL10))
err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL10))
```
We need a method for this, so that it can be reused in future tests.
We could use AddLabel directly, but I think it would be better to use an even more specific method so that we don't have to care about the name of the label:
duplicateMs.AddOSStreamLabel(...)
This function can then call AddLabel internally.
```go
// Reset log checkpoint before changing to supported stream to verify the new behavior
exutil.By("Change osstream label to rhel-9 and verify no unsupported stream log is triggered")
controller.IgnoreLogsBeforeNowOrFail()
err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL9))
```
```go
exutil.By("Enable CPMS for managed boot images")
cpms = NewControlPlaneMachineSet(oc.AsAdmin(), MachineAPINamespace, ControlPlaneMachineSetName)
o.Expect(cpms.Exists()).To(o.BeTrue(), "ControlPlaneMachineSet should exist for this test")

machineConfig = GetMachineConfiguration(oc.AsAdmin())
defer machineConfig.RemoveManagedBootImagesConfig()

o.Expect(
	machineConfig.SetAllManagedBootImagesConfig(ControlPlaneMachineSetResource),
).To(o.Succeed(), "Error enabling CPMS for managed boot images")
logger.Infof("OK!\n")
```
CPMS and MachineSets are two different functionalities. We should use a different test case for the CPMS.
```go
err = NewMCOTemplate(oc.AsAdmin(), "custom-machine-config-pool.yaml").Create(
	"-p", fmt.Sprintf("NAME=%s", infraName),
)

exutil.By("Create custom MCP with rhel-9 configured")
defer DeleteCustomMCP(oc.AsAdmin(), mcp1Name)
mcp1, err := CreateCustomMCP(oc.AsAdmin(), mcp1Name, 1)
```
CreateCustomMCPWithStream(oc.AsAdmin(), mcp2Name, OSImageStreamRHEL9, 1)
Test Cases Added
OCP-88122: Validate osImageStream inheritance for custom MCPs
Validates dynamic inheritance behavior in custom MachineConfigPools:
OCP-88708: Verify osstream logs for boot image controller
Validates boot image controller logging for unsupported osImageStream values:
OCP-88203: Verify MOSB triggered when osImageStream is patched
Validates MachineOSBuild resource creation when osImageStream changes:
OCP-88365: Verify MCP degraded when osstream and osImageURL MC applied
Validates error handling for conflicting osstream and osImageURL configurations:
Implementation Details
- `custom-machine-config-pool.yaml` template for MCPs without an osImageStream field (dynamic inheritance)
- `AddNodeToPool()` helper method to simplify node addition to custom pools
- `NewController().GetLogs()` for boot image controller log verification

Summary by CodeRabbit