OCPNODE-4383: prom rules: add alert for nodes using runc #5874
haircommander wants to merge 2 commits into openshift:main
@haircommander: This pull request references OCPNODE-4446 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: haircommander.
Force-pushed from 7e20fb5 to 83a3aea.
Actionable comments posted: 1
Inline comments:
In `@install/0000_90_machine-config_01_prometheus-rules.yaml`:
- Around line 29-30: Fix the grammar typo in the Prometheus alert rule
description: update the description field value (the YAML key "description" in
the rule) to change “support will removed” to “support will be removed” so the
user-facing message reads correctly.
@haircommander: This pull request references OCPNODE-4382 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.
@haircommander: This pull request references OCPNODE-4383 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
Force-pushed from f3b80b8 to ee72fc1.
/retest
relies on a cri-o metric that's emitted when the cluster is using runc Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Force-pushed from ee72fc1 to 6c48b96.
@haircommander: all tests passed!
yuqi-zhang left a comment:
Overall seems fine, just some questions inline
    The runc OCI runtime has been deprecated and support will be removed in a future release. Migrate to crun before upgrading to a future OpenShift release.
    See https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/machine_configuration/machine-configs-custom#config-container-runtime_machine-configs-custom
    for migration steps.
  runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/machine-config-operator/RuncDeprecated.md
Is this still in progress? I don't see the file or a PR yet.
annotations:
  summary: "This cluster is using the deprecated runc container runtime"
  description: >-
    The runc OCI runtime has been deprecated and support will be removed in a future release. Migrate to crun before upgrading to a future OpenShift release.
Sounds like this should be an upgradeable=false condition instead?
we won't be blocking the upgrade, especially 4.21->4.22. in the 5.0 cycle we will block RHCOS upgrade from 9 to 10 if runc is in use. this is just the warning
Well, this is merging into 5.0 branch right now. Are you looking to backport this?
I'm fine with doing the blocking edge as a followup but https://redhat.atlassian.net/browse/OCPNODE-4446 targets 4.22, so maybe it'd be better as a bug?
(also I'm not sure if we have a conditional block for 9->10 specifically yet, since the expectation is that you'd be switching at the pool level whenever you want, and not as part of an OCP upgrade, so we'd have to come up with a new mechanism there)
rules:
- alert: RuncDeprecated
  expr: |
    count(container_runtime_crio_default_runtime{runtime="runc"}) > 0
I guess there's no scenario to have different default runtimes on different nodes (can you even do that?) so this is probably a fine way to evaluate it.
there could be (different MCPs can have different ones) but it's reported once on each node, and all we care about is no node has runc as default
hmm, ok, I thought you had to have it homogeneous across the cluster, I guess now?
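For readers following this thread, here is a minimal Python sketch of what `count(container_runtime_crio_default_runtime{runtime="runc"}) > 0` effectively checks. It assumes, per the reply above, that each node reports the metric once and that the node's default runtime appears in the `runtime` label. The node names, the sample layout, and the `runc_alert_fires` helper are hypothetical and only illustrate the logic; this is not how Prometheus evaluates rules internally.

```python
from typing import Dict, List

# One sample per node, modelled as a plain label set (hypothetical layout).
Sample = Dict[str, str]


def runc_alert_fires(samples: List[Sample]) -> bool:
    """Return True if at least one node reports runc as its default runtime."""
    # Equivalent of the label matcher {runtime="runc"}.
    runc_series = [s for s in samples if s.get("runtime") == "runc"]
    # Equivalent of count(...) > 0; an empty selection never fires.
    return len(runc_series) > 0


# Different pools may use different default runtimes (assumed node names).
mixed_pools = [
    {"node": "master-0", "runtime": "crun"},
    {"node": "worker-0", "runtime": "runc"},
]
all_crun = [
    {"node": "master-0", "runtime": "crun"},
    {"node": "worker-0", "runtime": "crun"},
]

print(runc_alert_fires(mixed_pools))  # True
print(runc_alert_fires(all_crun))     # False
```

Under these assumptions a single runc node is enough to fire the alert, which matches the stated intent that no node should have runc as its default.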
relies on a cri-o metric that's emitted when the cluster is using runc
generated with claude