Objective
Flagger automates progressive delivery by gradually shifting traffic to a canary deployment while monitoring real-time metrics. If the canary's error rate exceeds the threshold, Flagger rolls back automatically. This exercise wires Flagger to Prometheus, defines a custom metric template for 5xx rates, and demonstrates both a successful promotion and an automated rollback scenario.
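The promotion/rollback decision can be sketched as a small control loop. This is a simplified, illustrative model (not Flagger's actual code); the step values mirror the Canary spec used later in this exercise:

```python
# Simplified model of Flagger's canary analysis loop (illustrative only).
# Each healthy interval shifts step_weight percent of traffic toward the canary,
# up to max_weight; each unhealthy interval increments a failure counter, and
# reaching the threshold triggers an automatic rollback.

def run_analysis(checks, step_weight=10, max_weight=50, threshold=5):
    """checks: list of booleans, True = metrics within thresholds this interval."""
    weight, failed = 0, 0
    for healthy in checks:
        if not healthy:
            failed += 1
            if failed >= threshold:
                return "Failed", 0        # rollback: all traffic back to primary
            continue                      # hold the current weight, retry next interval
        if weight < max_weight:
            weight += step_weight         # shift more traffic to the canary
        else:
            return "Succeeded", 0         # max weight held with healthy metrics: promote
    return "Progressing", weight

print(run_analysis([True] * 6))                     # healthy run -> promoted
print(run_analysis([True] + [False] * 5))           # error spike -> rolled back
```

With healthy metrics the analysis needs six 30-second intervals (five weight increments plus a final healthy check at max weight) before promotion; five failed checks at any point aborts the release.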
Prerequisites
- Kubernetes cluster with Istio or NGINX Ingress Controller installed
- Prometheus + kube-prometheus-stack deployed
- Helm installed
- A test application that returns HTTP status codes you can control
Steps
Install Flagger with Prometheus support
# Add Flagger Helm repo
helm repo add flagger https://flagger.app
helm repo update

# Install Flagger with NGINX mesh provider
helm install flagger flagger/flagger \
  --namespace flagger-system \
  --create-namespace \
  --set meshProvider=nginx \
  --set metricsServer=http://prometheus-kube-prometheus-prometheus.monitoring:9090 \
  --wait

# Install Flagger's load tester for traffic generation
helm install flagger-loadtester flagger/loadtester \
  --namespace flagger-system

kubectl get pods -n flagger-system
Deploy the primary workload that Flagger will manage
# Create the demo namespace
kubectl create namespace canary-demo

# Deploy the primary deployment (Flagger will manage this as the canary target)
cat << 'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  namespace: canary-demo
spec:
  replicas: 2
  selector:
    matchLabels: {app: podinfo}
  template:
    metadata:
      labels: {app: podinfo}
    spec:
      containers:
      - name: podinfo
        image: stefanprodan/podinfo:6.5.0
        ports:
        - containerPort: 9898
        readinessProbe:
          httpGet: {path: /readyz, port: 9898}
        resources:
          requests: {cpu: 50m, memory: 64Mi}
---
apiVersion: v1
kind: Service
metadata:
  name: podinfo
  namespace: canary-demo
spec:
  selector: {app: podinfo}
  ports:
  - port: 9898
    targetPort: 9898
EOF
Create a MetricTemplate for 5xx error rate
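The query below computes 100 minus the non-5xx share of requests, which is exactly the 5xx percentage. A quick arithmetic check of that identity, using made-up request rates rather than real Prometheus data:

```python
# Sanity-check the MetricTemplate arithmetic with made-up request rates.
# The PromQL evaluates: 100 - (non_5xx_rate / total_rate) * 100, i.e. the 5xx share.

def error_rate_5xx(non_5xx_rate, total_rate):
    """Percentage of requests that returned a 5xx status."""
    return 100 - (non_5xx_rate / total_rate) * 100

# 950 of 1000 req/s are non-5xx -> 5% error rate, exactly the Canary's max threshold
print(round(error_rate_5xx(950, 1000), 6))   # 5.0
print(round(error_rate_5xx(1000, 1000), 6))  # 0.0 -- healthy canary
```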
# metric-template-5xx.yaml
cat << 'EOF' | kubectl apply -f -
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate-5xx
  namespace: flagger-system
spec:
  provider:
    type: prometheus
    address: http://prometheus-kube-prometheus-prometheus.monitoring:9090
  query: |
    100 - sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)",
          status!~"5.."
        }[{{ interval }}]
      )
    ) / sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
        }[{{ interval }}]
      )
    ) * 100
EOF

Create the Canary resource
The Canary resource tells Flagger how to shift traffic, which metrics to check, and when to roll back. Flagger automatically creates the podinfo-primary deployment and the podinfo-primary/podinfo-canary services; the original podinfo deployment serves as the canary during analysis.
# canary.yaml
cat << 'EOF' | kubectl apply -f -
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: canary-demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  service:
    port: 9898
    targetPort: 9898
  analysis:
    interval: 30s      # Check metrics every 30 seconds
    threshold: 5       # Max 5 failed checks before rollback
    maxWeight: 50      # Max 50% traffic to canary
    stepWeight: 10     # Increment by 10% each interval
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99        # 99% success rate required
      interval: 1m
    - name: error-rate-5xx
      templateRef:
        name: error-rate-5xx
        namespace: flagger-system
      thresholdRange:
        max: 5         # Max 5% 5xx error rate
      interval: 30s
    webhooks:
    - name: load-test
      url: http://flagger-loadtester.flagger-system/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://podinfo.canary-demo:9898/"
EOF

Trigger a canary release
Update the deployment image to trigger Flagger's canary analysis. Flagger detects the change and starts incrementally routing traffic.
# Trigger the canary by updating the image
kubectl set image deployment/podinfo \
  podinfo=stefanprodan/podinfo:6.5.1 \
  -n canary-demo

# Watch Flagger progress the canary
watch -n 5 'kubectl describe canary podinfo -n canary-demo | \
  grep -A20 "Status:"'

# Watch traffic weights shift
kubectl get canary podinfo -n canary-demo -w
# STATUS        WEIGHT   FAILEDCHECKS
# Progressing   10       0
# Progressing   20       0
# Progressing   30       0
# ...
# Succeeded     0        0   (promoted to primary)
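Given the analysis settings above (interval 30s, stepWeight 10, maxWeight 50), you can estimate the minimum analysis duration before promotion. A back-of-the-envelope calculation, ignoring Flagger's extra confirmation and promotion phases:

```python
# Minimum analysis time before Flagger can promote, given the Canary settings.
# Each healthy interval shifts stepWeight percent of traffic until maxWeight.

interval_s, step_weight, max_weight = 30, 10, 50

steps_to_max = max_weight // step_weight     # 5 weight increments: 10, 20, ..., 50
min_analysis_s = steps_to_max * interval_s   # one interval per increment

print(steps_to_max, min_analysis_s)          # 5 150
```

So even a perfectly healthy canary spends at least 150 seconds (about 2.5 minutes) under analysis before promotion; lowering stepWeight or raising maxWeight lengthens that window and gives the metrics more time to catch regressions.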
Simulate a bad release and trigger automatic rollback
The podinfo binary supports a --random-error flag that makes a share of requests return 5xx errors. Roll out a new version with the flag enabled and watch Flagger detect the elevated error rate and roll back.
# Update to a "bad" version that returns errors
kubectl set image deployment/podinfo \
  podinfo=stefanprodan/podinfo:6.5.2 \
  -n canary-demo

# Inject errors by enabling podinfo's random-error flag
kubectl patch deployment podinfo -n canary-demo --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/command","value":["./podinfo","--random-error=true"]}]'

# Watch Flagger detect failures and roll back
watch -n 3 'kubectl get canary podinfo -n canary-demo && \
  kubectl describe canary podinfo -n canary-demo | \
  grep -E "Failed|rollback|Canary weight"'

# Expected progression:
# Progressing   10   1
# Progressing   10   2
# Progressing   10   5   ← threshold reached
# Failed        0    5   ← rolled back!
Verify rollback and check events
# Verify rollback — the primary deployment still runs the last good image
# (the target podinfo deployment keeps the bad spec but is scaled to zero)
kubectl get deployment podinfo-primary -n canary-demo \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
# Expected: stefanprodan/podinfo:6.5.1 (the previously promoted version)

# View Flagger events for the full timeline
kubectl get events -n canary-demo \
  --field-selector involvedObject.name=podinfo \
  --sort-by='.lastTimestamp'

# Check Flagger logs for analysis details
kubectl logs -n flagger-system \
  -l app.kubernetes.io/name=flagger \
  --tail=50 | grep podinfo
Success Criteria
- The 6.5.1 canary is promoted to primary with zero failed checks
- The error-injecting release is rolled back automatically once the threshold of 5 failed checks is reached
- After the rollback, the primary deployment still serves the last good image and the Flagger events show the full analysis timeline
Further Reading
- Flagger documentation — docs.flagger.app
- Flagger Canary analysis — docs.flagger.app/usage/how-it-works
- Flagger MetricTemplate — docs.flagger.app/usage/metrics
- Progressive Delivery whitepaper — weave.works/technologies/progressive-delivery