Automation & IaC L2 · PRACTICAL ~45 min

Python Script — Find Single-Replica Deployments

Use the official Kubernetes Python client to enumerate every Deployment across all namespaces, filter for single-point-of-failure workloads (replicas == 1), and produce a grouped report annotated with team ownership labels.

Objective

Platform engineering frequently involves scanning the cluster for risk patterns that policies alone can't prevent, such as teams deploying with a single replica during off-hours. This exercise builds a standalone Python script using the kubernetes client library that finds Deployments running fewer than two replicas, groups the results by namespace, and enriches the output with the owning team label. The script is the foundation for a recurring CronJob-based report or a Slack notification workflow.

Prerequisites

Steps

01

Install the Kubernetes Python client

# Install into a virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

pip install kubernetes

# Verify
python3 -c "import kubernetes; print(kubernetes.__version__)"
## 28.1.0
The kubernetes Python client version tracks Kubernetes releases. Version 28.x supports K8s 1.28 API groups. Always pin the version in requirements.txt for reproducibility.
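To make the pinning advice checkable, here is a minimal sketch that compares a `name==version` line from requirements.txt with what is installed in the active environment. The helper names `parse_pin` and `check_pin` are hypothetical and not part of the kubernetes client; only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata


def parse_pin(line):
    """Split a 'name==version' requirement line; return None for anything else."""
    line = line.split("#", 1)[0].strip()   # drop trailing comments
    if "==" not in line:
        return None
    name, version = line.split("==", 1)
    return name.strip().lower(), version.strip()


def check_pin(line):
    """Compare a pinned requirement against the installed distribution."""
    pin = parse_pin(line)
    if pin is None:
        return "not pinned"
    name, wanted = pin
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"
    return "ok" if installed == wanted else f"mismatch: installed {installed}"
```

Calling `check_pin("kubernetes==28.1.0")` inside the virtual environment from this step should report `ok` when the installed client matches the pin.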
02

Create seed workloads for testing

# Create two test namespaces with single-replica Deployments
kubectl create namespace team-alpha 2>/dev/null || true
kubectl create namespace team-beta  2>/dev/null || true

kubectl -n team-alpha create deployment api-server --image=nginx:1.25 --replicas=1
kubectl -n team-alpha label deployment api-server \
  app.kubernetes.io/team=alpha-engineers

kubectl -n team-beta create deployment worker --image=nginx:1.25 --replicas=1
kubectl -n team-beta label deployment worker \
  app.kubernetes.io/team=beta-platform

# Multi-replica deployment should NOT appear in the report
kubectl -n team-alpha create deployment frontend --image=nginx:1.25 --replicas=3
kubectl -n team-alpha label deployment frontend \
  app.kubernetes.io/team=alpha-engineers
03

Write the detection script

find_single_replicas.py
#!/usr/bin/env python3
"""
Find all Deployments with replicas == 1 (single-point-of-failure).
Groups output by namespace and annotates with owning team label.
"""
import sys
from collections import defaultdict
from kubernetes import client, config

TEAM_LABEL = "app.kubernetes.io/team"
SPOF_THRESHOLD = 2   # flag deployments with fewer than this many replicas

def load_config():
    try:
        config.load_incluster_config()   # running inside a pod
    except config.ConfigException:
        config.load_kube_config()        # local ~/.kube/config

def get_single_replica_deployments():
    v1 = client.AppsV1Api()
    # list_deployment_for_all_namespaces fetches across every namespace
    deployments = v1.list_deployment_for_all_namespaces(watch=False)

    findings = defaultdict(list)

    for deploy in deployments.items:
        ns   = deploy.metadata.namespace
        name = deploy.metadata.name
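        # spec.replicas is None only when omitted; the API server defaults it to 1
        replicas = deploy.spec.replicas if deploy.spec.replicas is not None else 1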
        labels   = deploy.metadata.labels or {}
        team     = labels.get(TEAM_LABEL, "<unlabelled>")

        # Skip system namespaces
        if ns.startswith(("kube-", "kyverno", "cert-manager")):
            continue

        if replicas < SPOF_THRESHOLD:
            findings[ns].append({
                "name":     name,
                "replicas": replicas,
                "team":     team,
            })

    return findings

def print_report(findings):
    if not findings:
        print("✓ No single-replica Deployments found.")
        return

    total = sum(len(v) for v in findings.values())
    print(f"\n⚠  Single-Replica Deployment Report ({total} found)\n")
    print(f"{'NAMESPACE':<20} {'DEPLOYMENT':<30} {'REPLICAS':<10} {'TEAM'}")
    print("-" * 80)

    for ns in sorted(findings):
        for item in sorted(findings[ns], key=lambda x: x["name"]):
            print(
                f"{ns:<20} {item['name']:<30} {item['replicas']:<10} {item['team']}"
            )
        print()   # blank line between namespaces

if __name__ == "__main__":
    load_config()
    findings = get_single_replica_deployments()
    print_report(findings)
    sys.exit(1 if findings else 0)  # non-zero exit for CI integration
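The filtering rules can be exercised without a cluster if they are separated from the API call. The sketch below is one possible refactoring, not the script's actual structure: `filter_findings` and `fake_deployment` are hypothetical names, and the stand-in objects expose only the attributes the filter reads. It treats an omitted `replicas` field as 1, on the assumption that this mirrors the API server default.

```python
from collections import defaultdict
from types import SimpleNamespace

TEAM_LABEL = "app.kubernetes.io/team"
SKIPPED_PREFIXES = ("kube-", "kyverno", "cert-manager")


def filter_findings(items, threshold=2):
    """Group deployments with fewer than `threshold` replicas by namespace."""
    findings = defaultdict(list)
    for deploy in items:
        ns = deploy.metadata.namespace
        if ns.startswith(SKIPPED_PREFIXES):
            continue  # skip system namespaces
        replicas = deploy.spec.replicas
        if replicas is None:
            replicas = 1  # assumption: an omitted field means the default of 1
        if replicas < threshold:
            labels = deploy.metadata.labels or {}
            findings[ns].append({
                "name": deploy.metadata.name,
                "replicas": replicas,
                "team": labels.get(TEAM_LABEL, "<unlabelled>"),
            })
    return findings


def fake_deployment(namespace, name, replicas, labels=None):
    """Hypothetical stand-in for a Deployment object, for offline tests only."""
    return SimpleNamespace(
        metadata=SimpleNamespace(namespace=namespace, name=name, labels=labels),
        spec=SimpleNamespace(replicas=replicas),
    )


sample = [
    fake_deployment("team-alpha", "api-server", 1, {TEAM_LABEL: "alpha-engineers"}),
    fake_deployment("team-alpha", "frontend", 3, {TEAM_LABEL: "alpha-engineers"}),
    fake_deployment("team-beta", "worker", 1),
    fake_deployment("kube-system", "coredns", 1),
]
result = filter_findings(sample)
```

With this split, the function that talks to the cluster would only need to pass `deployments.items` to `filter_findings`, and the rules themselves can run in an ordinary unit test.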
04

Run the script and review output

python3 find_single_replicas.py

## ⚠  Single-Replica Deployment Report (2 found)
##
## NAMESPACE            DEPLOYMENT                     REPLICAS   TEAM
## --------------------------------------------------------------------------------
## team-alpha           api-server                     1          alpha-engineers
##
## team-beta            worker                         1          beta-platform
##

# frontend (3 replicas) should NOT appear above

# Exit code is 1 when violations found — useful in CI
echo "Exit code: $?"
## Exit code: 1
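One caveat when relying on the exit code in CI: an unhandled Python exception also exits with status 1, so a failed scan (for example, an unreachable API server) looks the same as a scan that found violations. A minimal sketch of one way to separate the two cases follows. The `run_scan` wrapper and the value 2 for errors are assumptions made for illustration, not part of the script above.

```python
import sys

EXIT_CLEAN = 0      # scan completed, nothing flagged
EXIT_FINDINGS = 1   # scan completed, at least one Deployment flagged
EXIT_ERROR = 2      # hypothetical: scan did not complete


def run_scan(scan):
    """Call `scan()` and map the outcome to an exit code."""
    try:
        findings = scan()
    except Exception as exc:  # broad on purpose: any failure means "no result"
        print(f"scan failed: {exc}", file=sys.stderr)
        return EXIT_ERROR
    return EXIT_FINDINGS if findings else EXIT_CLEAN
```

In the script this could be wired up as `sys.exit(run_scan(get_single_replica_deployments))`, leaving report printing as a separate step.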
05

Export as JSON for downstream consumption

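# Add a JSON output mode to the script.
# 1. Add this line to the imports at the top of the file (sys is already imported):

import json

# 2. In the __main__ block, replace the print_report(findings) call with the
#    following, indented to match the rest of the block: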

if "--json" in sys.argv:
    output = []
    for ns, items in findings.items():
        for item in items:
            output.append({"namespace": ns, **item})
    print(json.dumps(output, indent=2))
else:
    print_report(findings)

# Run with JSON output
python3 find_single_replicas.py --json
## [
##   {
##     "namespace": "team-alpha",
##     "name": "api-server",
##     "replicas": 1,
##     "team": "alpha-engineers"
##   },
##   {
##     "namespace": "team-beta",
##     "name": "worker",
##     "replicas": 1,
##     "team": "beta-platform"
##   }
## ]

# Pipe to jq for further filtering (e.g. only unlabelled deployments)
python3 find_single_replicas.py --json | jq '[.[] | select(.team == "<unlabelled>")]'
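Checking for `"--json"` in `sys.argv` works, but it silently accepts typos and gives no `--help`. As a hedged alternative, the sketch below uses the standard-library `argparse` module and flattens the findings in a fixed order so that repeated runs produce diff-friendly output. The function names `parse_args` and `to_records` are hypothetical.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse command-line flags; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(description="Report single-replica Deployments.")
    parser.add_argument("--json", action="store_true",
                        help="emit JSON instead of the text table")
    return parser.parse_args(argv)


def to_records(findings):
    """Flatten {namespace: [items]} into a list sorted by namespace, then name."""
    return [
        {"namespace": ns, **item}
        for ns in sorted(findings)
        for item in sorted(findings[ns], key=lambda i: i["name"])
    ]
```

The `__main__` block would then call `args = parse_args()` and print `json.dumps(to_records(findings), indent=2)` when `args.json` is set.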
06

Package as a Kubernetes CronJob

To run this check automatically, build a container image containing the script and schedule it as a CronJob. The ServiceAccount needs read access to Deployments cluster-wide.

rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spof-scanner
  namespace: platform-ops
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deployment-reader
rules:
  - apiGroups: [apps]
    resources: [deployments]
    verbs: [get, list]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spof-scanner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: deployment-reader
subjects:
  - kind: ServiceAccount
    name: spof-scanner
    namespace: platform-ops
cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: spof-scanner
  namespace: platform-ops
spec:
  schedule: "0 8 * * 1-5"   # weekdays at 08:00
  jobTemplate:
    spec:
      backoffLimit: 0   # exit code 1 means "findings", not a crash; do not retry
      template:
        spec:
          serviceAccountName: spof-scanner
          restartPolicy: Never
          containers:
            - name: scanner
              image: your-registry/spof-scanner:latest
              command: [python3, /app/find_single_replicas.py, --json]
              resources:
                requests: { cpu: "50m", memory: "64Mi" }
                limits:   { cpu: "100m", memory: "128Mi" }
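As scheduled here, the JSON report only lands in the pod logs. For the Slack notification workflow mentioned in the Objective, a minimal sketch of turning the `--json` records into a message follows. It assumes a Slack incoming webhook that accepts a JSON body with a `text` field. The function names and the `webhook_url` argument are hypothetical, and the URL would need to come from a Secret rather than the image.

```python
import json
import urllib.request


def build_slack_payload(records):
    """Build a webhook body from the list of records produced by --json."""
    if not records:
        return {"text": "No single-replica Deployments found."}
    lines = [f"{len(records)} single-replica Deployment(s) found:"]
    for rec in records:
        lines.append(f"- {rec['namespace']}/{rec['name']} (team: {rec['team']})")
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url, payload, timeout=10):
    """POST the payload as JSON; webhook_url is supplied by the caller."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping payload construction separate from the HTTP call means the message format can be tested without network access.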

Success Criteria

Further Reading