Observability L1 · INTRO ~45 min

Deploy kube-prometheus-stack and Write PromQL Queries

Deploy the complete kube-prometheus-stack via Helm and write five essential PromQL queries for cluster health monitoring: node memory pressure, pending pods, API server request rate, PVC binding failures, and container restart rate per namespace.

Objective

kube-prometheus-stack is the de facto standard monitoring stack for Kubernetes, bundling Prometheus, Alertmanager, Grafana, and a comprehensive set of pre-built dashboards and alerts. This exercise installs the stack and builds hands-on PromQL skills through five real-world queries that form the foundation of cluster health monitoring.

Prerequisites

A running Kubernetes cluster with a default StorageClass (the values below request about 30Gi of persistent storage)
kubectl configured against the cluster
Helm 3 installed

Steps

01

Install kube-prometheus-stack

# Add the Prometheus community Helm chart repository
helm repo add prometheus-community \
  https://prometheus-community.github.io/helm-charts
helm repo update

# Create a values.yaml for minimal but functional setup
cat > prom-values.yaml << 'EOF'
prometheus:
  prometheusSpec:
    retention: 7d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi
grafana:
  enabled: true
  adminPassword: "admin123"  # demo only; change for any shared cluster
  persistence:
    enabled: true
    size: 5Gi
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
EOF

helm install kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --values prom-values.yaml \
  --wait --timeout 10m

kubectl get pods -n monitoring
02

Access Prometheus and Grafana

# Port-forward Prometheus
kubectl port-forward svc/kube-prometheus-stack-prometheus \
  -n monitoring 9090:9090 &

# Port-forward Grafana
kubectl port-forward svc/kube-prometheus-stack-grafana \
  -n monitoring 3000:80 &

# Access URLs:
# Prometheus: http://localhost:9090
# Grafana:    http://localhost:3000  (admin / admin123)

# Verify targets are being scraped
# In Prometheus UI: Status → Targets → should show all UP
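The UI is good for exploration, but the same queries can be scripted against Prometheus' HTTP API. A minimal sketch, assuming the port-forward above is still running; the helper script name `promql.sh` is ours:

```shell
# Write a tiny query helper; curl's --data-urlencode handles the
# PromQL escaping for us.
cat > promql.sh << 'EOF'
#!/bin/sh
# Usage: ./promql.sh '<promql expression>'
# Assumes `kubectl port-forward ... 9090:9090` is active.
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode "query=$1"
EOF
chmod +x promql.sh

# Example (requires the live port-forward):
# ./promql.sh 'count(up == 1)'
# The result value sits at .data.result[0].value[1] in the JSON reply.
```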
03

Query 1: Node memory pressure

Detects nodes under memory pressure. Values above 85% indicate risk of pod evictions.

# Node memory utilisation percentage (used / total)
(
  node_memory_MemTotal_bytes
  - node_memory_MemFree_bytes
  - node_memory_Buffers_bytes
  - node_memory_Cached_bytes
)
/ node_memory_MemTotal_bytes * 100

# Alert threshold: fire when any node exceeds 85%
# In Prometheus: Status → Rules to see pre-built node alerts

# Alternative: use kube_node_status_condition for pressure state
kube_node_status_condition{condition="MemoryPressure",status="true"}
# Value of 1 = node IS under memory pressure
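To turn the 85% threshold into a firing alert, the Prometheus Operator loads PrometheusRule objects. A sketch reusing the expression above; the file, rule, and alert names are ours, and the `release` label assumes the chart's default rule selector:

```shell
cat > node-memory-alert.yaml << 'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-memory-pressure
  namespace: monitoring
  labels:
    release: kube-prometheus-stack  # matched by the chart's default ruleSelector
spec:
  groups:
    - name: node-memory
      rules:
        - alert: NodeMemoryAbove85Percent
          expr: |
            (
              node_memory_MemTotal_bytes
              - node_memory_MemFree_bytes
              - node_memory_Buffers_bytes
              - node_memory_Cached_bytes
            )
            / node_memory_MemTotal_bytes * 100 > 85
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Node {{ $labels.instance }} memory usage above 85%"
EOF
# kubectl apply -f node-memory-alert.yaml
```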
04

Query 2: Pending pods

Pending pods indicate scheduling failures — insufficient resources, node selectors not matching, or PVCs not binding.

# Number of pods currently Pending, per namespace
# (sum, not count: the metric is a 0/1 series per pod and phase,
#  so count would also tally pods whose value is 0)
sum by (namespace) (
  kube_pod_status_phase{phase="Pending"}
)

# Pods the scheduler has explicitly marked unschedulable
kube_pod_status_unschedulable == 1

# Pending pods that have been pending for over 5 minutes
# (potential stuck scheduling)
(
  kube_pod_status_phase{phase="Pending"} == 1
  and on(pod, namespace)
  (time() - kube_pod_created) > 300
)
05

Query 3: API server request rate

The API server request rate shows cluster activity. Sudden spikes can indicate runaway controllers or misconfigured clients hammering the API.

# API server request rate by verb and resource (5-minute window)
sum by (verb, resource) (
  rate(apiserver_request_total[5m])
)

# API server error rate (4xx and 5xx)
sum by (code, verb) (
  rate(apiserver_request_total{code=~"[45].."}[5m])
)

# API server request latency P99 by verb
histogram_quantile(0.99,
  sum by (verb, le) (
    rate(apiserver_request_duration_seconds_bucket[5m])
  )
)
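The latency query is expensive to recompute on every dashboard refresh; a recording rule precomputes the inner rate so the quantile is cheap at query time. A sketch (the recorded name follows the level:metric:operations convention; the file name and `release` label are our assumptions about the chart's default rule selector):

```shell
cat > apiserver-recording.yaml << 'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: apiserver-recording
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: apiserver.rules
      rules:
        # Precompute the per-verb bucket rate once per evaluation cycle
        - record: verb_le:apiserver_request_duration_seconds_bucket:rate5m
          expr: |
            sum by (verb, le) (
              rate(apiserver_request_duration_seconds_bucket[5m])
            )
EOF
# kubectl apply -f apiserver-recording.yaml
# The P99 query then becomes:
# histogram_quantile(0.99, verb_le:apiserver_request_duration_seconds_bucket:rate5m)
```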
06

Query 4: PVC binding failures

Detects PersistentVolumeClaims stuck in Pending state — common when storage classes have issues or zones don't have available capacity.

# PVCs not in Bound state
# (== 1 filters out the 0-valued series the metric emits per phase)
kube_persistentvolumeclaim_status_phase{phase!="Bound"} == 1

# PVCs that have been Pending for over 10 minutes
(
  kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
  and on(persistentvolumeclaim, namespace)
  (time() - kube_persistentvolumeclaim_created) > 600
)

# Count of unbound PVCs by namespace
# (each PVC contributes 1 in exactly one phase, so sum counts PVCs)
sum by (namespace) (
  kube_persistentvolumeclaim_status_phase{phase!="Bound"}
)
07

Query 5: Container restart rate per namespace

High restart rates indicate OOMKilled containers, crash loops, or readiness probe failures. This query groups by namespace to identify which teams have unhealthy workloads.

# Container restart rate in the last hour by namespace
sum by (namespace) (
  increase(kube_pod_container_status_restarts_total[1h])
)

# Top 10 most restarting containers
topk(10,
  sum by (namespace, pod, container) (
    increase(kube_pod_container_status_restarts_total[1h])
  )
)

# Alert expression: namespace with >10 restarts in last hour
sum by (namespace) (
  increase(kube_pod_container_status_restarts_total[1h])
) > 10
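The alert expression above can be packaged as a PrometheusRule the Operator will load. A sketch; the file, rule, and alert names are ours, and the `release` label assumes the chart's default rule selector:

```shell
cat > restart-alert.yaml << 'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: namespace-restart-rate
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: restarts
      rules:
        - alert: NamespaceRestartsHigh
          expr: |
            sum by (namespace) (
              increase(kube_pod_container_status_restarts_total[1h])
            ) > 10
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Namespace {{ $labels.namespace }} had over 10 container restarts in the last hour"
EOF
# kubectl apply -f restart-alert.yaml
```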
08

Explore pre-built Grafana dashboards

# Navigate to Grafana at http://localhost:3000
# The stack ships its own dashboards; the IDs below are popular
# community dashboards you can import separately from grafana.com:

# Dashboard ID 315: Kubernetes cluster monitoring
# Dashboard ID 6417: Kubernetes pods
# Dashboard ID 1860: Node exporter full

# Bundled dashboards: Dashboards → Browse → Kubernetes
# Try: "Kubernetes / Compute Resources / Cluster"
# Several of the metrics queried above appear here as panels

# View currently firing alerts
kubectl port-forward svc/kube-prometheus-stack-alertmanager \
  -n monitoring 9093:9093 &
# Open http://localhost:9093

Success Criteria

All pods in the monitoring namespace reach Running
Prometheus Status → Targets shows every target UP
Each of the five PromQL queries returns data for your cluster
Grafana loads with the bundled Kubernetes dashboards
Alertmanager is reachable on port 9093

Further Reading

Prometheus documentation: Querying basics (prometheus.io)
kube-prometheus-stack chart README (github.com/prometheus-community/helm-charts)
kube-state-metrics metric documentation (github.com/kubernetes/kube-state-metrics)