Remove Kepler power monitoring agent#916

Open
jlarriba wants to merge 1 commit into
openstack-k8s-operators:mainfrom
jlarriba:remove-kepler-agent

@jlarriba

Collaborator

Kepler does not work in our deployment. Remove all Kepler code, configuration, dashboards, and scrape configs. Power monitoring now consists exclusively of ceilometer-ipmi. Also simplifies createComputeScrapeConfig by removing the suppressTLS workaround that was only needed for Kepler.

@openshift-ci openshift-ci Bot requested review from abays and vyzigold June 17, 2026 12:19
@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlarriba

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@centosinfra-prod-github-app

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/4a6596cd79904a48875213c0e5ad5de9

telemetry-openstack-meta-content-provider-master FAILURE in 34m 03s
⚠️ telemetry-operator-multinode-cloudkitty SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master
✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 35s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 32m 59s
telemetry-operator-multinode-audit-logging FAILURE in 1h 11m 04s
⚠️ functional-tests-osp18 SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master

@vyzigold

Contributor

What about updates, e.g. from FR5 to FR6? Shouldn't we have some sort of cleanup to remove the Kepler scrapeconfig and dashboard?

@jlarriba

Collaborator Author

recheck

@jlarriba

Collaborator Author

What about updates, e.g. from FR5 to FR6? Shouldn't we have some sort of cleanup to remove the Kepler scrapeconfig and dashboard?

Yeah, that's a good catch.

I was assuming that the upgraded telemetry-operator would run a reconcile and, as part of it, automatically remove the previously created resources, since they no longer exist from its point of view. But you've made me realize that, because it no longer knows about them, it might just ignore them.
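
To make the concern concrete, here is a minimal sketch in plain Go. It does not use the operator's real code or the Kubernetes client: the `cluster` type, the `reconcile` and `cleanupLegacy` helpers, and all resource names are hypothetical stand-ins. It shows that a reconcile loop which only applies its desired resources leaves older ones untouched, so they have to be deleted explicitly by their known names, treating "not found" as success.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound mimics the "not found" error an API client would return.
var errNotFound = errors.New("not found")

// cluster is a hypothetical in-memory stand-in for the API server,
// keyed by resource name.
type cluster map[string]bool

// apply creates or updates a resource.
func (c cluster) apply(name string) { c[name] = true }

// remove deletes a resource and reports errNotFound if it is absent.
func (c cluster) remove(name string) error {
	if !c[name] {
		return errNotFound
	}
	delete(c, name)
	return nil
}

// reconcile only touches the resources the current operator version
// knows about, so anything created by an older version is left alone.
func reconcile(c cluster, desired []string) {
	for _, name := range desired {
		c.apply(name)
	}
}

// cleanupLegacy deletes resources by their old, hard-coded names.
// A missing resource is not an error, which keeps the call idempotent.
func cleanupLegacy(c cluster, legacy []string) error {
	for _, name := range legacy {
		if err := c.remove(name); err != nil && !errors.Is(err, errNotFound) {
			return fmt.Errorf("deleting %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	// State left behind by the older operator version (names are made up).
	c := cluster{
		"ipmi-scrapeconfig":   true,
		"kepler-scrapeconfig": true,
		"kepler-dashboard":    true,
	}

	// The upgraded operator no longer lists the Kepler resources.
	reconcile(c, []string{"ipmi-scrapeconfig"})
	fmt.Println("resources after reconcile:", len(c))

	// Explicit cleanup removes the leftovers.
	legacy := []string{"kepler-scrapeconfig", "kepler-dashboard"}
	if err := cleanupLegacy(c, legacy); err != nil {
		panic(err)
	}
	fmt.Println("resources after cleanup:", len(c))
}
```

Running the cleanup on every reconcile is safe in this sketch because a second call finds nothing to delete and returns no error. Owner references and garbage collection would be another way to handle orphans, but whether that applies here depends on how the original resources were created.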

@jlarriba jlarriba force-pushed the remove-kepler-agent branch from 9bc9eae to a270b1a Compare June 18, 2026 11:03
Kepler does not work in our deployment. Remove all Kepler code,
configuration, dashboards, and scrape configs. Power monitoring
now consists exclusively of ceilometer-ipmi. Also simplifies
createComputeScrapeConfig by removing the suppressTLS workaround
that was only needed for Kepler. Adds cleanup in reconcileUpdate
to remove leftover Kepler resources from existing environments.
@jlarriba jlarriba force-pushed the remove-kepler-agent branch from a270b1a to 64a89e0 Compare June 18, 2026 11:15
@centosinfra-prod-github-app

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/b56ff352284f48a59c124bd52cccb9fd

✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 3h 12m 56s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 32m 22s
✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 51m 36s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 34m 28s
✔️ telemetry-operator-multinode-audit-logging SUCCESS in 1h 10m 24s
functional-tests-osp18 FAILURE in 2h 29m 17s

@jlarriba

Collaborator Author

/retest

@jlarriba

Collaborator Author

recheck

@openshift-ci

openshift-ci Bot commented Jun 19, 2026

Contributor

@jlarriba: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/telemetry-operator-build-deploy | 64a89e0 | link | false | /test telemetry-operator-build-deploy |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@centosinfra-prod-github-app

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/b73a0e74939c4b5ead3efb75638429bc

✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 3h 26m 32s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 39m 43s
✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 01m 12s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 34m 44s
✔️ telemetry-operator-multinode-audit-logging SUCCESS in 1h 23m 51s
functional-tests-osp18 FAILURE in 2h 40m 06s

@elfiesmelfie

Contributor

github-check is failing because the feature-verification-tests still reference Kepler.
The failures indicate that Kepler is not available as expected [1].

You need to:

  • Create a PR in infrawatch/feature-verification-tests to remove the Kepler test configuration.
  • Add a Depends-On: <FVT PR> line to the description of this PR to fix the tests.
  • Merge the FVT PR.
  • Merge this PR.

[1] https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/logs//47a/rdoproject.org/47a473d6ade04ea98a2e29481196e748/controller/ci-framework-data/tests/feature-verification-tests/run_functional_tests-1781867726.1988885.xml

@vyzigold

Contributor

Also, be aware (I learned this from my recent blunder with Depends-On): on telemetry-operator we use Tide to merge PRs, and Tide ignores Depends-On. So Tide would happily merge this PR once CI passes, even if the FVT PR it depends on hasn't merged yet, which would then block all other telemetry-operator PRs until the FVT PR merges as well.

3 participants