This guide helps you manually test the DeployGuard operator using a local Kind cluster and a test application.
Create a Kind cluster:
kind create cluster --name demoInstall Prometheus using Helm:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/prometheus --namespace monitoring --create-namespaceWait for Prometheus to be ready:
kubectl wait --for=condition=available deployment/prometheus-server -n monitoring --timeout=120smake docker-build IMG=controller:latest
kind load docker-image controller:latest --name democd test/app
docker build -t testapp:latest .
kind load docker-image testapp:latest --name demo
cd ../..Install CRDs and deploy the operator:
make install
make deploy IMG=controller:latestWait for the controller to be ready:
kubectl rollout status deployment/deployguard-controller-manager -n deployguard-system --timeout=60sDeploy the sample application and RolloutGuard:
kubectl apply -f config/samples/sample_deployment.yaml
kubectl apply -f config/samples/guard_v1_rolloutguard.yamlVerify both are running:
kubectl get deployment sample-deployment
kubectl get rolloutguard rolloutguard-sampleThe test app exposes Prometheus metrics. You need to generate traffic to populate metrics:
# Port forward to the service
kubectl port-forward svc/sample-service 8080:80 &
# Generate continuous traffic (run in background or separate terminal)
while true; do curl -s localhost:8080 > /dev/null; sleep 0.1; donekubectl logs -n deployguard-system deploy/deployguard-controller-manager -f-
Trigger a normal rollout:
kubectl rollout restart deployment/sample-deployment
-
Watch the controller logs. You should see:
- "New rollout detected, initializing monitoring"
- "Within grace period, waiting for pods to warm up"
- "Metrics check" with violation=false
- Eventually "Monitoring window completed successfully"
-
Check RolloutGuard status:
kubectl get rolloutguard rolloutguard-sample -o yaml
-
Trigger a rollout that simulates high latency:
kubectl set env deployment/sample-deployment MODE=slow -
Generate traffic to the slow endpoint:
while true; do curl -s localhost:8080 > /dev/null; sleep 0.1; done
-
Watch the logs. You should see:
- "Metrics check" with latency exceeding threshold (>500ms)
- "Violation detected, accumulating" (counts up to 3)
- "Sustained violation detected, taking action"
- "Rolling back to last healthy revision"
- "Rollback applied successfully"
-
Verify the rollback:
# Check the deployment was rolled back (MODE should be gone or reset) kubectl get deployment sample-deployment -o yaml | grep -A2 "env:" # Check RolloutGuard status shows RolledBack kubectl get rolloutguard rolloutguard-sample -o jsonpath='{.status.phase}'
-
Reset the deployment to normal first:
kubectl set env deployment/sample-deployment MODE=normal # Wait for healthy state sleep 60
-
Trigger a rollout that simulates errors:
kubectl set env deployment/sample-deployment MODE=error -
Generate traffic:
while true; do curl -s localhost:8080 > /dev/null; sleep 0.1; done
-
Watch logs. The deployment should be rolled back due to error rate > 1%.
To test the Pause strategy instead of Rollback:
-
Update the RolloutGuard to use Pause:
kubectl patch rolloutguard rolloutguard-sample --type=merge -p '{"spec":{"violationStrategy":"Pause"}}' -
Trigger a violation (slow or error mode)
-
The deployment will be paused instead of rolled back:
kubectl get deployment sample-deployment -o jsonpath='{.spec.paused}' # Should output: true
-
Resume when ready:
kubectl rollout resume deployment/sample-deployment
View RolloutGuard status at any time:
kubectl get rolloutguard rolloutguard-sample -o yamlKey status fields:
phase: Current state (Idle, Monitoring, RolledBack, Paused, Healthy)observedLatency: Last observed P99 latency in millisecondsobservedErrorRate: Last observed error rate as percentageconsecutiveViolations: Number of consecutive threshold breacheslastHealthyRevision: ReplicaSet revision used for rollback
# Delete the test resources
kubectl delete -f config/samples/
# Undeploy the operator
make undeploy
# Delete the Kind cluster
kind delete cluster --name demo