Operations L2 · PRACTICAL ~60 min

Application Deployment Best Practices Advisory

Review a deliberately flawed Deployment manifest, identify every production-readiness gap, and produce a corrected version with an advisory report explaining each change and its operational impact.

Objective

Platform engineers are often asked to review application team manifests before they reach production. This exercise trains you to read a Deployment YAML and spot reliability, security, and operability gaps — then communicate the findings in an advisory format that helps developers understand the why, not just the what.

Prerequisites

A Kubernetes cluster you can experiment against (kind or minikube is fine) with kubectl configured. kube-score is installed in step 02 for the optional automated check; the Kyverno admission controller is needed only for the final step.

Steps

01

Review the flawed "before" manifest

This Deployment represents a real pattern submitted by application teams that have not yet gone through platform onboarding. Count the issues before reading the advisory table.

## ── BEFORE: app-team submission (do not deploy to production) ──
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
  namespace: default       # ISSUE: deploying to default namespace
spec:
  replicas: 1              # ISSUE: single replica = no HA
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      containers:
      - name: payment-api
        image: myregistry.io/payment-api:latest   # ISSUE: mutable :latest tag
        ports:
        - containerPort: 8080
        env:
        - name: DB_PASSWORD
          value: "superSecretPassword123"            # ISSUE: plaintext secret in manifest
        - name: API_KEY
          value: "sk_live_abc123xyz"                 # ISSUE: plaintext secret in manifest
        # ISSUE: no resource requests or limits
        # ISSUE: no readiness probe — traffic sent before app is ready
        # ISSUE: no liveness probe — stuck pods never restarted
        # ISSUE: no securityContext — runs as root
  # ISSUE: no pod disruption budget
  # ISSUE: no update strategy configured (defaults to 25% maxUnavailable)
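Before reaching for tooling, a plain grep pass over a submission catches several of these findings mechanically. A minimal sketch, assuming the flag names and the inlined manifest excerpt shown here are illustrative; kube-score in the next step does this properly:

```shell
# Hypothetical grep-based triage pass over a manifest excerpt.
manifest=$(mktemp)
cat > "$manifest" <<'YAML'
spec:
  replicas: 1
  template:
    spec:
      containers:
      - name: payment-api
        image: myregistry.io/payment-api:latest
        env:
        - name: DB_PASSWORD
          value: "superSecretPassword123"
YAML

flags=""
grep -q ':latest'      "$manifest" && flags="$flags latest-tag"
grep -q 'replicas: 1$' "$manifest" && flags="$flags single-replica"
grep -A1 '_PASSWORD'   "$manifest" | grep -q 'value:' && flags="$flags inline-secret"
grep -q 'resources:'   "$manifest" || flags="$flags no-resources"
echo "Red flags:$flags"
rm -f "$manifest"
```

This is deliberately crude (it would miss a secret named `DB_PASS`, for example), which is why the automated scoring in the next step is worth running even for a quick review.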
02

Score the manifest with kube-score (optional automated check)

# Install kube-score
curl -L https://github.com/zegl/kube-score/releases/download/v1.18.0/kube-score_1.18.0_linux_amd64.tar.gz \
  | tar xz kube-score
chmod +x kube-score && sudo mv kube-score /usr/local/bin/

# Save the before manifest and score it
kubectl get deployment payment-api -o yaml > payment-api-before.yaml  # if already applied
# or save the before YAML above as payment-api-before.yaml

kube-score score payment-api-before.yaml

## Expected output (excerpt):
## [CRITICAL] Container Security Context
##   · payment-api → container securityContext.runAsNonRoot is not set
## [CRITICAL] Container Resources
##   · payment-api → CPU limit is not set
##   · payment-api → Memory limit is not set
## [CRITICAL] Pod Probes
##   · payment-api → no readiness probe is configured
## [WARNING] Deployment Strategy
##   · payment-api → maxSurge is not set
03

Advisory findings table

Document each finding with severity, impact, and recommendation before writing the fix. This is the format to use when communicating with application teams.

| Finding | Severity | Operational Impact | Recommendation |
| --- | --- | --- | --- |
| default namespace | MEDIUM | No RBAC isolation, no resource quotas, collides with other teams | Deploy to a dedicated namespace with a ResourceQuota and LimitRange |
| replicas: 1 | HIGH | Single point of failure; a node drain or pod restart means downtime | Set replicas: 3 with topologySpreadConstraints across zones |
| image: :latest | HIGH | Rollouts are not reproducible; rollback is impossible without registry tag history | Pin to an immutable SHA digest or semver tag (e.g. v1.4.2) |
| Plaintext secrets in env | HIGH | Secrets visible in etcd, kubectl get pod -o json, kubectl describe, and CI logs | Use secretKeyRef pointing to a Secret, or the External Secrets Operator |
| No resource requests/limits | HIGH | BestEffort QoS, first to be evicted under node pressure; the scheduler cannot place the pod optimally | Set requests and limits based on p95 observed usage (Goldilocks can help) |
| No readiness probe | HIGH | Traffic reaches the pod before the app finishes starting; 502s during rollouts | Add an httpGet readiness probe on /healthz or /ready with initialDelaySeconds |
| No liveness probe | MEDIUM | Deadlocked pods are never restarted; the pod shows Running but stops serving requests | Add an httpGet liveness probe with failureThreshold: 3 and periodSeconds: 15 |
| No securityContext | HIGH | Container runs as root (UID 0); a container escape grants root on the host | Set runAsNonRoot: true and readOnlyRootFilesystem: true, and drop ALL capabilities |
| No PodDisruptionBudget | MEDIUM | A node drain can evict all replicas simultaneously, causing a full outage | Create a PDB with minAvailable: 1 or maxUnavailable: 1 |
| Default update strategy | LOW | The default maxUnavailable: 25% can take a replica offline during a rolling update | Set maxUnavailable: 0 and maxSurge: 1 for zero-downtime rollouts |
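The liveness-probe values trade detection speed against spurious restarts, and the trade-off is worth making explicit when an app team asks why their deadlocked pod took nearly a minute to recover. A quick sketch of the arithmetic:

```shell
# With the recommended liveness settings, the kubelet needs
# failureThreshold consecutive failed probes before it restarts the
# container, so worst-case detection is roughly threshold * period
# (plus up to one probe interval and the probe timeout).
period=15      # periodSeconds
threshold=3    # failureThreshold
detect=$((period * threshold))
echo "worst-case liveness detection: ${detect}s"
```

Shortening the period detects hangs faster but restarts the pod sooner on transient slowness; the 15s/3 combination is a common middle ground, not a universal answer.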
04

Apply the corrected "after" manifest

## ── AFTER: production-ready corrected manifest ──
# First: create the namespace and secret
kubectl create namespace payments

kubectl create secret generic payment-api-secrets \
  -n payments \
  --from-literal=DB_PASSWORD='superSecretPassword123' \
  --from-literal=API_KEY='sk_live_abc123xyz'

# Apply the corrected Deployment
cat << 'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
  namespace: payments            # dedicated namespace
  labels:
    app: payment-api
    version: "1.4.2"
spec:
  replicas: 3                    # HA: 3 replicas across zones
  selector:
    matchLabels:
      app: payment-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0           # zero downtime rollout
      maxSurge: 1
  template:
    metadata:
      labels:
        app: payment-api
        version: "1.4.2"
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: payment-api
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
      - name: payment-api
        image: myregistry.io/payment-api:v1.4.2   # pinned tag
        ports:
        - containerPort: 8080
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: payment-api-secrets   # reference secret, not inline value
              key: DB_PASSWORD
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: payment-api-secrets
              key: API_KEY
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 15
          failureThreshold: 3
        securityContext:
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop: [ALL]
        volumeMounts:
        - name: tmp
          mountPath: /tmp              # writable tmp if app needs it
      volumes:
      - name: tmp
        emptyDir: {}
EOF

# Create the PodDisruptionBudget
cat << 'EOF' | kubectl apply -f -
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-api-pdb
  namespace: payments
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: payment-api
EOF
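The namespace finding recommended a ResourceQuota and LimitRange alongside the dedicated namespace, which the commands above do not create. A sketch of what they might look like; the names and numbers here are illustrative and should be sized from observed usage, not copied as-is:

```
# Illustrative quota and default limits for the payments namespace
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: payments-quota
  namespace: payments
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "20"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: payments-limits
  namespace: payments
spec:
  limits:
  - type: Container
    default:              # limit applied when a container sets none
      cpu: 500m
      memory: 256Mi
    defaultRequest:       # request applied when a container sets none
      cpu: 100m
      memory: 128Mi
EOF
```

The LimitRange also acts as a safety net: a future submission that omits resources gets the defaults injected instead of landing in BestEffort QoS.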
05

Validate the corrected manifest

# Check pods are spreading across zones
kubectl get pods -n payments -o wide \
  -l app=payment-api

## NAME                           READY   STATUS    NODE       
## payment-api-7d9f4b8c6-4xkzp   1/1     Running   node-az1   
## payment-api-7d9f4b8c6-9mnrq   1/1     Running   node-az2   
## payment-api-7d9f4b8c6-vxpqr   1/1     Running   node-az3   

# Confirm QoS class is Burstable (requests != limits)
kubectl get pod -n payments -l app=payment-api \
  -o jsonpath='{.items[0].status.qosClass}'
## Burstable

# Confirm the pod is running as non-root
kubectl exec -n payments \
  $(kubectl get pod -n payments -l app=payment-api -o name | head -1) \
  -- id
## uid=1000 gid=0(root) groups=0(root),2000

# Re-score with kube-score to verify findings are resolved
kubectl get deployment payment-api -n payments -o yaml \
  | kube-score score -
## All checks passed (or only informational warnings remain)

# Verify PDB is protecting the deployment
kubectl get pdb -n payments
## NAME              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS
## payment-api-pdb   2               N/A               1

# Simulate a rolling update (change image tag)
kubectl set image deployment/payment-api \
  payment-api=myregistry.io/payment-api:v1.4.3 \
  -n payments

# Watch zero-downtime rollout (maxUnavailable:0 means always 3 ready)
kubectl rollout status deployment/payment-api -n payments
## Waiting for deployment "payment-api" rollout to finish: 1 out of 3 new replicas have been updated...
## Waiting for deployment "payment-api" rollout to finish: 2 out of 3 new replicas have been updated...
## deployment "payment-api" successfully rolled out
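If v1.4.3 misbehaves, the pinned-tag revision history makes rollback a one-liner; this is the operational payoff of dropping :latest. Both commands are standard kubectl:

```
# Inspect recorded revisions, then roll back to the previous one
kubectl rollout history deployment/payment-api -n payments
kubectl rollout undo deployment/payment-api -n payments
```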
In a real advisory workflow, the corrected manifest and findings table are delivered as a pull request comment or a platform team review document. The application team fixes their own manifest — you provide the knowledge transfer, not just the fix.
06

Automate advisory checks as a CI gate

Prevent future regressions by running automated policy checks in the CI pipeline. This converts advisory findings into enforced policy.

# Option 1: kube-score in CI (GitHub Actions)
# .github/workflows/manifest-review.yaml
- name: Score Kubernetes manifests
  run: |
    kube-score score deploy/*.yaml \
      --ignore-test container-image-tag \
      --output-format ci
  continue-on-error: false

# Option 2: Kyverno policy to block at admission time
# (combines several findings into cluster-enforced rules)
cat << 'EOF' | kubectl apply -f -
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: deployment-best-practices
spec:
  validationFailureAction: Enforce
  rules:
  - name: require-non-root
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "Containers must not run as root"
      pattern:
        spec:
          containers:
          - securityContext:
              runAsNonRoot: "true"
  - name: require-resource-requests
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "Resource requests must be set"
      pattern:
        spec:
          containers:
          - resources:
              requests:
                memory: "?*"
                cpu: "?*"
EOF

# Test: submitting the original bad manifest should now be rejected
kubectl apply -f payment-api-before.yaml
## Error from server: admission webhook "validate.kyverno.svc" denied the request:
## require-non-root: Containers must not run as root
## require-resource-requests: Resource requests must be set
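The same checks can also run before anything touches the cluster: the Kyverno CLI evaluates a policy against a manifest offline. A sketch, assuming the ClusterPolicy above was saved as deployment-best-practices.yaml (the file name is an assumption):

```
# Dry-run the policy against the flawed manifest locally or in CI
kyverno apply deployment-best-practices.yaml --resource payment-api-before.yaml
```

This gives application teams the admission-time feedback at pull-request time, which keeps the advisory loop fast.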

Success Criteria

- The corrected manifest passes kube-score with no CRITICAL findings
- Three replicas run as non-root, spread across zones, and complete a rolling update with zero downtime
- The original flawed manifest is rejected at admission by the Kyverno policy

Further Reading