Objective
Kubernetes audit logging records requests made to the kube-apiserver. Without a policy file (the --audit-policy-file flag), the API server logs no audit events at all, so critical operations leave no trail. This exercise writes a targeted audit policy, enables it on the API server, and builds a parsing script that produces a human-readable access report covering the events most relevant to compliance and incident investigation.
Prerequisites
- Access to the API server configuration (kubeadm cluster, or managed cloud audit logs)
- kubectl with cluster-admin permissions
- jq and Python 3.9+ for log parsing
- For managed clusters: AKS Diagnostic Settings, EKS control plane logging (CloudWatch Logs), or GKE Cloud Audit Logs enabled
Steps
Write the audit policy file
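Rules are evaluated top to bottom, and the first rule that matches a request sets its audit level. Rules can match on users, groups, verbs, resources (including subresources), namespaces, and non-resource URLs. The level controls how much data is captured: None (nothing is logged), Metadata (user, timestamp, resource, and verb, but no bodies), Request (metadata plus the request body), and RequestResponse (metadata plus request and response bodies).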
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log secret reads and writes at Metadata level (who, when, which secret).
  # WARNING: Request or RequestResponse would write secret values into the log.
  # Use RequestResponse only if compliance requires it, and ensure log storage is encrypted.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Log all RBAC changes at RequestResponse level (full object body)
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources:
          - roles
          - rolebindings
          - clusterroles
          - clusterrolebindings
    verbs: ["create", "update", "patch", "delete"]

  # Log pod creation/deletion (catch privileged pods, unusual images)
  - level: Request
    resources:
      - group: ""
        resources: ["pods"]
    verbs: ["create", "delete", "patch"]

  # Log namespace changes
  - level: Request
    resources:
      - group: ""
        resources: ["namespaces"]
    verbs: ["create", "delete", "update"]

  # Log exec and port-forward (interactive access; high value for forensics)
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward"]

  # Log token creation (detect service account token abuse)
  - level: Metadata
    resources:
      - group: ""
        resources: ["serviceaccounts/token"]

  # Skip frequent, low-value reads to control log volume
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: ""
        resources: ["endpoints", "services", "nodes"]

  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs: ["/api", "/api/*", "/apis", "/apis/*", "/healthz", "/metrics"]

  # Default: log metadata for everything else
  - level: Metadata
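Because the first matching rule wins, rule order changes what gets logged. The following is a minimal Python sketch of that evaluation, not the API server's implementation. It models only user, verb, and resource matching; the RULES list and the audit_level helper are illustrative names, and the rule set is a simplified copy of the policy above.

```python
# Simplified stand-in for the policy: each rule is a dict, checked in order.
# Assumption: only users, verbs, and resources are modeled (no groups,
# namespaces, API groups, or non-resource URLs).
RULES = [
    {"level": "Metadata", "resources": {"secrets"}},
    {"level": "RequestResponse",
     "resources": {"roles", "rolebindings", "clusterroles", "clusterrolebindings"},
     "verbs": {"create", "update", "patch", "delete"}},
    {"level": "Request", "resources": {"pods"}, "verbs": {"create", "delete", "patch"}},
    {"level": "Metadata", "resources": {"pods/exec", "pods/portforward"}},
    {"level": "None", "users": {"system:kube-proxy"}, "verbs": {"watch"},
     "resources": {"endpoints", "services", "nodes"}},
    {"level": "Metadata"},  # catch-all default
]


def audit_level(user: str, verb: str, resource: str, rules=RULES) -> str:
    """Return the level of the first rule whose specified fields all match."""
    for rule in rules:
        if "users" in rule and user not in rule["users"]:
            continue
        if "verbs" in rule and verb not in rule["verbs"]:
            continue
        if "resources" in rule and resource not in rule["resources"]:
            continue
        return rule["level"]
    return "None"  # no rule matched, so nothing is logged


print(audit_level("alice", "get", "secrets"))                 # Metadata
print(audit_level("alice", "create", "rolebindings"))         # RequestResponse
print(audit_level("alice", "get", "rolebindings"))            # Metadata (default rule)
print(audit_level("system:kube-proxy", "watch", "services"))  # None
```

Moving the catch-all rule to the top of the list would make every request resolve to Metadata, which is why the specific rules come first and the default comes last.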
Enable audit logging on the API server
# /etc/kubernetes/manifests/kube-apiserver.yaml (static pod manifest on a kubeadm control-plane node)
spec:
  containers:
  - command:
    - kube-apiserver
    # ... existing flags ...
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30      # retain 30 days
    - --audit-log-maxbackup=10   # keep 10 rotated files
    - --audit-log-maxsize=100    # rotate at 100 MB
    volumeMounts:
    - name: audit-policy
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
    - name: audit-logs
      mountPath: /var/log/kubernetes/audit
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
  - name: audit-logs
    hostPath:
      path: /var/log/kubernetes/audit
      type: DirectoryOrCreate

# The kubelet restarts the API server pod automatically when the manifest changes.
# Watch for it to come back (quote the pipeline so watch reruns all of it)
watch 'kubectl get pods -n kube-system | grep apiserver'

# Verify audit logging is active
ls -la /var/log/kubernetes/audit/audit.log

# Trigger a test event
kubectl get secret -n kube-system
kubectl -n kube-system get secret default-token-xxxxx 2>/dev/null | head -2

# Confirm the event appears in the log.
# Filter for secrets first: the last line of a busy log is rarely the test event.
jq -c 'select(.objectRef.resource == "secrets") | {verb, objectRef, user: .user.username}' /var/log/kubernetes/audit/audit.log | tail -1
## { "verb": "get", "objectRef": { "resource": "secrets", "namespace": "kube-system" }, "user": "kubernetes-admin" }
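The three rotation flags interact: on a busy cluster the size limits, not --audit-log-maxage, often decide how far back the logs reach. The sketch below estimates this under two stated assumptions: rotated files are kept uncompressed, and at most maxbackup old files exist alongside the active one. The function names and the 500 MB/day volume are hypothetical; measure your own cluster's daily volume before relying on the result.

```python
def max_audit_disk_mb(maxsize_mb: int, maxbackup: int) -> int:
    """Upper bound on disk use: the active file plus every retained rotated file."""
    return maxsize_mb * (maxbackup + 1)


def effective_retention_days(maxsize_mb: int, maxbackup: int,
                             maxage_days: int, daily_volume_mb: float) -> float:
    """Days of history actually kept: whichever limit (age or size) is hit first."""
    capacity_mb = max_audit_disk_mb(maxsize_mb, maxbackup)
    return min(maxage_days, capacity_mb / daily_volume_mb)


# Values from the manifest above; 500 MB/day is an illustrative volume, not a measurement.
print(max_audit_disk_mb(100, 10))                   # 1100
print(effective_retention_days(100, 10, 30, 500))   # 2.2
```

If the estimate falls short of the retention a compliance regime requires, raise --audit-log-maxbackup or --audit-log-maxsize, or ship the log to external storage.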
Enable on managed Kubernetes (AKS / EKS / GKE)
## AKS: enable diagnostic logs to Log Analytics
az monitor diagnostic-settings create \
  --resource "$(az aks show -g my-rg -n my-aks --query id -o tsv)" \
  --name aks-audit-logs \
  --workspace "$(az monitor log-analytics workspace show -g my-rg -n my-law --query id -o tsv)" \
  --logs '[{"category":"kube-audit","enabled":true},{"category":"kube-audit-admin","enabled":true}]'

## EKS: enable control plane logging
aws eks update-cluster-config \
  --region us-east-1 \
  --name my-eks-cluster \
  --logging '{"clusterLogging":[{"types":["audit","authenticator"],"enabled":true}]}'

## GKE: audit logging is enabled by default; export to BigQuery:
gcloud logging sinks create k8s-audit-sink \
  bigquery.googleapis.com/projects/PROJECT/datasets/k8s_audit \
  --log-filter='resource.type="k8s_cluster" AND logName=~"cloudaudit"'
Parse the audit log into a weekly security report
import json
import sys
from collections import defaultdict

LOG_PATH = sys.argv[1] if len(sys.argv) > 1 else "audit.log"

RBAC_RESOURCES = {"roles", "rolebindings", "clusterroles", "clusterrolebindings"}
READ_VERBS = {"get", "list", "watch"}

# Categories to surface in the report
CATEGORIES = {
    "secret_access": lambda e: e.get("objectRef", {}).get("resource") == "secrets",
    "rbac_mutations": lambda e: (
        e.get("objectRef", {}).get("resource") in RBAC_RESOURCES
        and e.get("verb") not in READ_VERBS
    ),
    "exec_access": lambda e: e.get("objectRef", {}).get("subresource") == "exec",
    # Exclude subresources: pods/exec requests also arrive as "create" on "pods"
    "pod_creation": lambda e: (
        e.get("objectRef", {}).get("resource") == "pods"
        and not e.get("objectRef", {}).get("subresource")
        and e.get("verb") == "create"
    ),
}

events_by_category = defaultdict(list)
seen_audit_ids = set()  # a request is logged once per stage; report it once

with open(LOG_PATH) as f:
    for line in f:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or truncated lines
        if not isinstance(event, dict):
            continue
        audit_id = event.get("auditID")
        if audit_id is not None:
            if audit_id in seen_audit_ids:
                continue
            seen_audit_ids.add(audit_id)
        user = event.get("user", {}).get("username", "unknown")
        ts = event.get("requestReceivedTimestamp", "")[:19]
        obj = event.get("objectRef", {})
        verb = event.get("verb", "unknown")
        for cat, match_fn in CATEGORIES.items():
            if match_fn(event):
                events_by_category[cat].append({
                    "ts": ts,
                    "user": user,
                    "verb": verb,
                    "ns": obj.get("namespace", "-"),
                    "name": obj.get("name", "-"),
                })

print("=== Kubernetes Audit Log — Weekly Security Report ===")
for cat, events in events_by_category.items():
    print(f"\n── {cat.replace('_', ' ').title()} ({len(events)} events)")
    print(f" {'TIMESTAMP':<20} {'USER':<30} {'VERB':<10} {'NS':<20} NAME")
    for e in events[:20]:  # limit to 20 per category
        print(f" {e['ts']:<20} {e['user']:<30} {e['verb']:<10} {e['ns']:<20} {e['name']}")
    if len(events) > 20:
        print(f" ... and {len(events) - 20} more")
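The report is titled weekly, but the script reads the whole file, so its window is whatever the rotated log happens to contain. One way to restrict it to the last seven days is sketched below. It assumes requestReceivedTimestamp is a UTC RFC 3339 string ending in "Z"; parse_audit_ts and in_window are illustrative helper names, not part of the original script.

```python
from datetime import datetime, timedelta, timezone


def parse_audit_ts(ts: str) -> datetime:
    """Parse a timestamp such as 2026-03-15T08:11:23.123456Z as UTC."""
    # Use only the seconds-resolution prefix so this also works on Python 3.9,
    # where datetime.fromisoformat rejects a trailing "Z".
    return datetime.strptime(ts[:19], "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


def in_window(event: dict, now: datetime, days: int = 7) -> bool:
    """True if the event was received within the last `days` days before `now`."""
    ts = event.get("requestReceivedTimestamp") or ""
    try:
        when = parse_audit_ts(ts)
    except ValueError:
        return False  # missing or malformed timestamp: leave it out of the report
    return now - timedelta(days=days) <= when <= now


# Fixed "now" so the example is reproducible.
now = datetime(2026, 3, 16, tzinfo=timezone.utc)
recent = {"requestReceivedTimestamp": "2026-03-15T08:11:23.000000Z"}
old = {"requestReceivedTimestamp": "2026-03-01T00:00:00.000000Z"}
print(in_window(recent, now), in_window(old, now))  # True False
```

In the parsing loop, a check such as `if not in_window(event, datetime.now(timezone.utc)): continue` placed right after the JSON is decoded would drop older events before categorization.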
python3 parse_audit_log.py /var/log/kubernetes/audit/audit.log
## === Kubernetes Audit Log — Weekly Security Report ===
##
## ── Secret Access (47 events)
##  TIMESTAMP            USER                           VERB       NS                   NAME
##  2026-03-15T08:11:23  kubernetes-admin               get        kube-system          kube-proxy-token
##  2026-03-15T09:44:01  system:serviceaccount:ci:flux  list       production           -
##  ...
##
## ── Rbac Mutations (3 events)
##  2026-03-14T14:22:07  kubernetes-admin               create     default              admin-binding
Success Criteria
Further Reading
- Kubernetes audit logging: kubernetes.io/docs/tasks/debug/debug-cluster/audit
- Audit policy reference: kubernetes.io/docs/reference/config-api/apiserver-audit.v1
- AKS audit logs: learn.microsoft.com/azure/aks/monitor-aks — diagnostic categories
- Falco + audit logs: falco.org/docs — use Falco rules against the audit event stream