
feat!: common chart v2.0.0 #242

Open
Glenn-Terjesen wants to merge 57 commits into main from v2


Glenn-Terjesen commented Mar 25, 2026

Common Chart v2.0.0

Upgrade guide

See UPGRADE.md for the full migration guide.

If you use an AI coding agent (Claude Code, Copilot, Cursor, etc.), you can paste the following prompt to have it perform the migration for you:

```
Upgrade the Entur common Helm chart dependency from v1 to v2.

First, read the migration guide:
  https://raw.githubusercontent.com/entur/helm-charts/main/UPGRADE.md

Then apply the "Quick Migration Checklist" to all values files in this repository.
Run `helm dependency update` and `helm lint` to verify.
```

Breaking Changes

| v1 | v2 | Notes |
|----|----|-------|
| `shortname` | `appId` | Matches GoogleCloudApplication `metadata.id` |
| `container.replicas` | `deployment.minReplicas` | HPA controls pod count; Helm never resets it |
| `deployment.replicas` | `deployment.minReplicas` | Same — renamed for clarity |
| `container.maxReplicas` | `deployment.maxReplicas` | Moved to `deployment` |
| `container.forceReplicas` | `deployment.forceReplicas` | Moved to `deployment` |
| `container.minAvailable` | `deployment.minAvailable` | Moved to `deployment` |
| `container.memoryLimit` | removed | Memory limit always equals memory request |
| `pdb.minAvailable` | `deployment.minAvailable` | Single place to configure |
| `postgres.connectionConfig` | `postgres.instances` | External Secrets from Secret Manager |
| `postgres.memoryLimit` | removed | Use `postgres.memory` |
| `kubernetes.io/ingress.class` annotation | `ingress.ingressClassName` | K8s standard field |
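As a sketch of the renames in the table, a before/after values file (field names from the table above; the concrete values are illustrative):

```yaml
# v1 (before)
shortname: my-app
container:
  replicas: 2
  maxReplicas: 6
pdb:
  minAvailable: 50%
```

```yaml
# v2 (after)
appId: my-app
deployment:
  minReplicas: 2
  maxReplicas: 6
  minAvailable: 50%
# memory limit now always equals the memory request; no separate limit field
```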

What's New

HPA always enabled — HPA runs in all environments (not just prd). Default minReplicas: 2 everywhere. Deployment spec never emits replicas, so helm upgrade can't reset HPA-managed pod counts. Use forceReplicas to opt out.
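A minimal sketch of the two modes described above (illustrative values):

```yaml
# HPA on (the default): configure only the bounds, never a fixed replica count
deployment:
  minReplicas: 2
  maxReplicas: 6

# ...or opt out of HPA entirely and pin the pod count:
# deployment:
#   forceReplicas: 3
```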

PDB fixes — unhealthyPodEvictionPolicy: AlwaysAllow prevents unhealthy pods from blocking cluster upgrades. forceReplicas > 1 now correctly gets PDB protection.

Cloud SQL Proxy v2 — Upgraded to v2 (2.21.2). Connection config sourced via External Secrets from Secret Manager (postgres.instances). Supports multiple databases. Prometheus metrics on port 9801.
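A minimal values sketch, using the `PGINSTANCES` Secret Manager key from the commit message further down (key name and list shape taken from that example; treat anything else as an assumption):

```yaml
postgres:
  # Secret Manager key(s) resolved via External Secrets into indexed
  # CSQL_PROXY_INSTANCE_CONNECTION_NAME_N env vars for the proxy sidecar;
  # add one key per database for multi-database setups
  instances: [PGINSTANCES]
```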

GKE Startup CPU Boost — Alpha, optional (deployment.startupCPUBoost.enabled). Temporarily increases CPU during startup and reverts when the pod is Ready. Auto-sets the CPU limit to 1.3x the request. NB: not ready yet!

Native gRPC probes — grpc: true now uses K8s native gRPC probes with service.internalPort. No need for the /bin/grpc_health_probe binary or manual port config.
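A hedged sketch of the gRPC probe setup; the exact placement of the `grpc` flag in the values file is an assumption (check the chart's values.schema.json), the port is illustrative:

```yaml
grpc: true            # assumed key placement for the flag described above
service:
  internalPort: 9090  # illustrative; native gRPC probes target this port
```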

Custom HPA metrics — hpa.metrics list for Pods/External/Object metrics alongside the default CPU metric. ScaleUp stabilization window (120s default) when CPU boost is disabled.
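Assuming hpa.metrics entries follow the standard Kubernetes autoscaling/v2 metric spec (the commit message mentions Pub/Sub queue depth as a supported case), a sketch with an External metric; the metric name is the usual Cloud Monitoring identifier and the subscription label is illustrative:

```yaml
hpa:
  metrics:
    - type: External
      external:
        metric:
          name: pubsub.googleapis.com|subscription|num_undelivered_messages
          selector:
            matchLabels:
              resource.labels.subscription_id: my-subscription  # illustrative
        target:
          type: AverageValue
          averageValue: "100"  # scale so each pod handles ~100 undelivered messages
```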

JSON Schema validation — values.schema.json catches typos and unknown properties on helm lint. IDE autocompletion in VS Code and JetBrains.

Helm 3 + 4 — CI tests all run against both Helm v3.20.0 and v4.1.3.

Other Improvements

  • Startup probe supports path for httpGet (Allow path for startup probe #237)
  • deployment.cpuUtilization (default 70%) replaces top-level cpuUtilization (HPA averageUtilization should be placed under deployment #221)
  • Per-ingress annotations and ingressClassName support
  • Cron: added seccompProfile, fixed postgres proxy placement
  • Rolling update defaults: maxSurge: 1, maxUnavailable: 1
  • Memory limit always equals memory request (no more 1.2x multiplier)
  • Fixed memoryLimt typo in postgres proxy helper
  • Removed dead grpcexecprobes helper
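The cpuUtilization move from the list above, as a values sketch (70 is the new default per the changelog):

```yaml
deployment:
  cpuUtilization: 70  # HPA target averageUtilization in percent; was top-level cpuUtilization in v1
```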

CI

  • Unit tests + install tests + example validation on both Helm 3 and Helm 4
  • Examples validated against local chart (not published repo)
  • helm lint added to example validation
  • kube-startup-cpu-boost operator installed in kind cluster

Closes

Closes #101, closes #126, closes #195, closes #221, closes #225, closes #235, closes #237

Glenn-Terjesen and others added 16 commits March 25, 2026 11:49
…s eviction of unhealthy pods

- Add unhealthyPodEvictionPolicy: AlwaysAllow to prevent unhealthy pods from blocking node drains during cluster upgrades
- Fix forceReplicas > 1 getting minAvailable 0% (all pods evictable)
- Fix replicas=1 with HPA getting minAvailable 0% despite 2+ pods running
- PDB logic now checks effective replicas: forceReplicas, HPA min replicas, or configured replicas

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nnotations

- Add missing seccompProfile to cron pod securityContext (matches deployment)
- Move postgres proxy outside container loop in cron to prevent duplicate sidecars
- Support per-ingress annotations when using ingresses list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…l Secrets

- Upgrade cloud-sql-proxy from v1 (1.33.16) to v2 (2.21.2)
- Update image, executable name, and CLI flags for v2
- Fix memoryLimt typo in helpers, deprecate memoryLimit (limit = request)
- Replace configmap-based connection config with External Secrets
- Add postgres.instances to configure Secret Manager keys for instance connection names
- Support multiple SQL databases via indexed CSQL_PROXY_INSTANCE_CONNECTION_NAME_N env vars
- Deprecate postgres.connectionConfig with fail message
- Fix deployment.prometheus.path not falling back to default (#225)

BREAKING CHANGE: postgres.connectionConfig is removed in favor of postgres.instances.
Users must migrate their connection config from Kubernetes ConfigMaps to
Secret Manager keys via External Secrets (e.g. postgres.instances: [PGINSTANCES]).
Add deployment.cpuUtilization as the preferred location for HPA CPU
target utilization. Falls back to top-level cpuUtilization for
backwards compatibility, then to default 100%.
- Add StartupCPUBoost CRD resource (enabled by default, 50% increase)
- Boost targets pods by app label, reverts when pod becomes Ready
- Lower default HPA cpuUtilization from 100% to 70% (best practice
  when startup CPU spikes are handled by the boost operator)
- Requires kube-startup-cpu-boost operator installed in the cluster
When container.probes.startup.path is set, the startup probe uses
httpGet instead of tcpSocket. This enables custom startup health
checks for apps with long-running startup tasks like cache warming.
Setting grpc: true now uses native K8s gRPC probes with internalPort
by default. No need to manually set probe ports for each probe.
Removes fallback to exec-based grpc_health_probe which required the
binary in the container image.
…ssName (#126)

Replace kubernetes.io/ingress.class annotation (deprecated since K8s
1.18) with spec.ingressClassName. Defaults to "traefik", configurable
via ingress.ingressClassName or per-ingress ingressClassName field.
Add appId field matching GoogleCloudApplication metadata.id. Falls
back to shortname for backwards compatibility. Adds new "appId" label
to all resources alongside existing "shortname" label.
Add hpa.metrics list for appending custom metrics (Pods, External,
Object) alongside the default CPU utilization metric. Supports
Prometheus/GMP gauges, Pub/Sub queue depth, and any Cloud Monitoring
metric. The existing hpa.spec override for full control is preserved.
- Remove unused grpcexecprobes helper (native K8s gRPC probes are now default)
- Remove shortname value entirely — appId is now required with fail message for migration
- Remove top-level cpuUtilization fallback, use only deployment.cpuUtilization
- Fix readiness probe comment typo (said "liveness")
- Rename shortname to appId in all fixtures and test values

BREAKING CHANGE: shortname is removed. Use appId instead.
…theus metrics

- Remove container.replicas, container.maxReplicas, container.forceReplicas,
  container.minAvailable, container.terminationGracePeriodSeconds — these are
  now only under deployment.* where they belong
- Remove container.* fallbacks from deployment.yaml, hpa.yaml, pdb.yaml
- Enable prometheus metrics on Cloud SQL proxy v2 (--http-port=9801 --prometheus)
  exposing metrics at :9801/metrics for monitoring proxy health and connections
- Update all tests and fixtures to use deployment.* for scaling fields

BREAKING CHANGE: container.replicas, container.maxReplicas, container.forceReplicas,
container.minAvailable, and container.terminationGracePeriodSeconds are removed.
Use deployment.replicas, deployment.maxReplicas, etc. instead.
When postgres is enabled, adds prometheus.io/scrape-sql-proxy,
prometheus.io/sql-proxy-port (9801), and prometheus.io/sql-proxy-path
(/metrics) annotations to pods. Allows configuring Prometheus to scrape
the SQL proxy sidecar alongside the main application.
Glenn-Terjesen requested a review from a team as a code owner March 25, 2026 13:26
Tests were still asserting livenessProbe.exec.command from the removed
grpcexecprobes helper. Updated to assert livenessProbe.grpc which is
the native K8s gRPC probe now used by default.
Add the StartupCPUBoost Helm chart to the CI kind cluster setup so
the StartupCPUBoost CRD is available during helm install tests.
…repo

Copy the local common chart into each example's charts/ directory
before running helm template. This tests examples against the current
branch's chart instead of the published version.
…-backend

- Add values-kub-ent-tst.yaml and values-kub-ent-prd.yaml for grpc-app
  example so CI can validate all environments
- Add postgres.instances to typical-backend example (required by v2 proxy)
Add values.schema.json matching v2 values structure. Validates values
on helm install/upgrade/template/lint to catch typos and unknown
properties early. Updated from PR #222 for v2 changes: appId instead
of shortname, scaling fields under deployment only, postgres.instances,
hpa.metrics, startupCPUBoost, ingressClassName, startup probe path.
Run unit tests, kind cluster install tests, and example validation
against both Helm 3 and Helm 4 using matrix strategy.
…isabled

When CPU boost is off, Java startup CPU spikes can trigger unnecessary
HPA scale-ups. A 120s stabilization window gives pods time to finish
startup before HPA acts on the elevated CPU. When CPU boost is enabled
the window is not needed since startup spikes are handled by the boost.
…ionWindowSeconds

Defaults to 120s when startupCPUBoost is disabled. Tune to match your
application's typical startup time (e.g. 60s for a fast app, 300s for
a heavy Spring Boot app with cache warming).
When CPU boost is enabled and no explicit cpuLimit is set, the CPU
limit is automatically set to 130% of the CPU request. This gives
the boost operator a ceiling to work within. Explicit cpuLimit
always takes precedence.
Memory limit is now always equal to memory request. The previous 1.2x
multiplier and memoryLimit override are removed. container.memoryLimit
is deprecated with a note to use container.memory instead.

BREAKING CHANGE: container.memoryLimit is removed. Memory limit now
always equals memory request. Set container.memory to the value you need.
HPA is now always enabled (unless forceReplicas is set). The Deployment
spec never emits replicas — HPA controls pod count in all environments.

Default minReplicas by environment:
- sbx/dev/tst: 1 (scales down to single pod in low traffic)
- prd: 2 (HA by default)

deployment.replicas overrides the default minReplicas for any env.
PDB protection follows minReplicas: 0% when minReplicas=1, 50% when >=2.

This prevents the v1 bug where helm upgrade would reset HPA-managed
replica counts back to the configured value.

BREAKING CHANGE: HPA is now enabled by default in all environments.
Use deployment.forceReplicas to opt out of HPA.
deployment.replicas is removed. Use deployment.minReplicas to set the
HPA minimum replica count, or deployment.forceReplicas to disable HPA.

Since HPA is always enabled, the Deployment spec never emits replicas
— HPA controls the pod count. This prevents helm upgrade from resetting
HPA-managed replica counts.

BREAKING CHANGE: deployment.replicas is removed. Use deployment.minReplicas
(sets HPA minimum) or deployment.forceReplicas (disables HPA).
…db.minAvailable

- Default minReplicas to 2 in all environments (no more env-aware branching)
- Remove pdb.minAvailable — use deployment.minAvailable instead (one place to configure)
- Change maxSurge and maxUnavailable defaults from 25% to 1 (works correctly with 2 replicas)
- PDB automatically 50% when minReplicas >= 2, 0% when minReplicas = 1

BREAKING CHANGE: pdb.minAvailable is removed. Use deployment.minAvailable instead.
Default minReplicas is now 2 in all environments (was 1 in dev/tst).
Default maxSurge and maxUnavailable changed from 25% to 1.
Glenn-Terjesen changed the title from "feat!: common chart v2" to "feat!: common chart v2.0.0" on Mar 26, 2026