Fix CNPG cluster spread alert #1133

Open: TheBigLee wants to merge 1 commit into develop from fix/cnpg_cluster_zone_spread

TheBigLee (Member) commented Apr 7, 2026

The alert considers every pod whose name starts with `postgresql-`. This includes all the backup pods as well. Since the backup pods are not cleaned up immediately, the alert fires all the time.
Restricting the alert to only the cluster pods solves the issue.
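
To illustrate the difference, here is a minimal Python sketch. It is not the actual alert rule: the pod names, the zones, the `cnpg.io/podRole` label value, and the "more than one selected pod in a zone" condition are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical pod records (names, labels, and zones are invented).
PODS = [
    {"name": "postgresql-1", "labels": {"cnpg.io/podRole": "instance"}, "zone": "a"},
    {"name": "postgresql-2", "labels": {"cnpg.io/podRole": "instance"}, "zone": "b"},
    # Leftover backup pods: same name prefix, but no instance label.
    {"name": "postgresql-backup-abc", "labels": {}, "zone": "a"},
    {"name": "postgresql-backup-def", "labels": {}, "zone": "a"},
]


def by_prefix(pods, prefix="postgresql-"):
    """Old behaviour: select every pod whose name starts with the prefix."""
    return [p for p in pods if p["name"].startswith(prefix)]


def by_label(pods, key="cnpg.io/podRole", value="instance"):
    """New behaviour: select only pods carrying the (assumed) instance label."""
    return [p for p in pods if p["labels"].get(key) == value]


def zones_with_multiple_pods(pods):
    """Return the zones that host more than one of the selected pods."""
    counts = Counter(p["zone"] for p in pods)
    return sorted(zone for zone, n in counts.items() if n > 1)


print(zones_with_multiple_pods(by_prefix(PODS)))  # backup pods are counted too
print(zones_with_multiple_pods(by_label(PODS)))   # only cluster instances are counted
```

With the prefix match, the lingering backup pods pile up in one zone and trip the condition even though the two database instances are spread correctly. With the label filter, only the instances are counted.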

Checklist

  • The PR has a meaningful title. It will be used to auto-generate the
    changelog.
  • The PR has a meaningful description that sums up the change. It will be
    linked in the changelog.
  • The PR contains a single logical change (to build a better changelog).
  • I ran make e2e-test against local kindev and all checks passed.
  • If no changes have been made you can self-approve; otherwise at least two reviews are required.
  • Link this PR to related issues or PRs.

TheBigLee requested review from a team, Kidswiss, mdnix, mikeshootzz and zugao, and removed the request for a team, on April 7, 2026 13:56
TheBigLee added the bug label (Something isn't working) on Apr 7, 2026
TheBigLee force-pushed the fix/cnpg_cluster_zone_spread branch from 9b6e9bc to 851336f on April 7, 2026 14:01

mikeshootzz (Contributor) commented

@TheBigLee does this actually work? CNPG does not use an STS, since the pods are managed directly by the operator.

Kidswiss (Contributor) commented Apr 7, 2026

Guess I missed one alert here: https://github.com/vshn/component-appcat/pull/1101/changes

Also, as @mikeshootzz said, your solution might not work, as CNPG doesn't use an STS. My fix actually looks at the labels and filters via them.

TheBigLee force-pushed the fix/cnpg_cluster_zone_spread branch from 851336f to 687fc9d on April 7, 2026 14:10

TheBigLee (Member, Author) commented

> @TheBigLee does this actually work? CNPG does not use an STS, since the pods are managed directly by the operator.

You are right. The correct label is not StatefulSet but Cluster.
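
As a rough illustration of why selecting on a StatefulSet owner cannot match CNPG pods, here is a small Python sketch with made-up pod records. The `owner_kind` field and the `Job` owner of the backup pod are assumptions for this example and are not taken from the alert itself.

```python
# Hypothetical pod records; "owner_kind" stands in for the kind of the
# pod's owning resource. All names and values are invented.
OWNED_PODS = [
    {"name": "postgresql-1", "owner_kind": "Cluster"},
    {"name": "postgresql-2", "owner_kind": "Cluster"},
    {"name": "postgresql-backup-abc", "owner_kind": "Job"},  # assumed owner
]


def select_by_owner_kind(pods, kind):
    """Return the names of pods whose owner is of the given kind."""
    return [p["name"] for p in pods if p["owner_kind"] == kind]


print(select_by_owner_kind(OWNED_PODS, "StatefulSet"))  # matches nothing
print(select_by_owner_kind(OWNED_PODS, "Cluster"))      # matches the instances
```

A selector that matches nothing would silence the alert entirely, not just for the backup pods, which is why the owner kind matters here.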

TheBigLee (Member, Author) commented

> Guess I missed one alert here: https://github.com/vshn/component-appcat/pull/1101/changes
>
> Also, as @mikeshootzz said, your solution might not work, as CNPG doesn't use an STS. My fix actually looks at the labels and filters via them.

Hm. Not sure if I should use your solution or stick with mine. WDYT?

TheBigLee force-pushed the fix/cnpg_cluster_zone_spread branch from 687fc9d to a425745 on April 7, 2026 14:17

mikeshootzz (Contributor) commented

> > Guess I missed one alert here: https://github.com/vshn/component-appcat/pull/1101/changes
> > Also, as @mikeshootzz said, your solution might not work, as CNPG doesn't use an STS. My fix actually looks at the labels and filters via them.
>
> Hm. Not sure if I should use your solution or stick with mine. WDYT?

For consistency’s sake, I would stick with the solution we already have.

Kidswiss (Contributor) commented Apr 7, 2026

> > Guess I missed one alert here: https://github.com/vshn/component-appcat/pull/1101/changes
> > Also, as @mikeshootzz said, your solution might not work, as CNPG doesn't use an STS. My fix actually looks at the labels and filters via them.
>
> Hm. Not sure if I should use your solution or stick with mine. WDYT?

I'd use the same solution, because otherwise it gets more complicated.

The alert considers every pod whose name starts with `postgresql-`. This
includes all the backup pods as well. Since the backup pods are not
cleaned up immediately, this alert fires all the time.
Restricting the alert to only the cluster pods solves the issue.

Signed-off-by: Nicolas Bigler <nicolas.bigler@vshn.ch>
TheBigLee force-pushed the fix/cnpg_cluster_zone_spread branch from a425745 to c05c9cf on April 7, 2026 14:23

TheBigLee (Member, Author) commented

> > > Guess I missed one alert here: https://github.com/vshn/component-appcat/pull/1101/changes
> > > Also, as @mikeshootzz said, your solution might not work, as CNPG doesn't use an STS. My fix actually looks at the labels and filters via them.
> >
> > Hm. Not sure if I should use your solution or stick with mine. WDYT?
>
> I'd use the same solution, because otherwise it gets more complicated.

OK. Changed it to use the same approach as you did earlier.

Labels

bug Something isn't working

5 participants