CNCF · devops

Certified Kubernetes Application Developer (CKAD)

Build, deploy, and operate cloud-native applications on Kubernetes. The developer-focused performance-based exam — kubectl drills for Pod design patterns, Deployments, ConfigMaps, probes, Services, NetworkPolicy, Helm, and PersistentVolumeClaims, all on the clock.

Modules: 8
Course duration: 30 hours
Level: advanced
Exam code: CKAD
Exam duration: 2 hours
Passing score: 66%
Exam fee (USD): $445
Validity: 2 years
Format: Performance-based

Why earn the CKAD?

CKAD is the CNCF credential for developers who ship to Kubernetes. It is hands-on, kubectl-fluent, and bound to a strict 2-hour timer — the exam that proves you can deploy, debug, and configure cloud-native apps on a live cluster.

  • Hands-on performance-based exam — you type real kubectl commands in a live terminal, not multiple choice
  • CNCF-recognized and vendor-neutral — same value at AWS EKS, GKE, AKS, on-prem, or any managed-K8s shop
  • The default developer-side Kubernetes credential: many cloud-native backend postings ask for it or favour it
  • Lighter cluster-admin scope than CKA — focused on workloads and developer ergonomics, not etcd or kubeadm
  • Gateway to platform engineering and senior backend roles ($110-150k US, €70-100k EU for K8s-fluent developers)
  • Pairs naturally with CKA (operate) or CKS (secure) — same exam style, same kubectl-fluency expectation
Exam strategy: the 2024+ blueprint weighs Application Environment, Configuration & Security 25%, Application Design & Build 20%, Application Deployment 20%, Services & Networking 20%, Observability & Maintenance 15%. There is no Troubleshooting domain — you debug to fix design / config / deployment problems. The killer.sh practice environment is bundled with your exam fee — use both attempts. Universal manifest scaffolder: kubectl <create|run> ... --dry-run=client -o yaml | tee file.yaml. Never hand-write YAML on the clock.

CKAD exam domains

Five domains. Configuration + Security is the heaviest at 25% — ConfigMaps, Secrets, SecurityContext, ServiceAccounts, and resource limits drive a quarter of the score. The other four domains sit close together (15–20%), so no module is a low-impact skip.

Domain 1 — Application Environment, Configuration & Security 25%
Domain 2 — Application Design & Build 20%
Domain 3 — Application Deployment 20%
Domain 4 — Services & Networking 20%
Domain 5 — Application Observability & Maintenance 15%

8 modules · ~30 hours

Each module maps to one or more exam domains. Work through them in order: Core Concepts and Configuration set up the mental model every later module assumes. Modules 5–8 (Pod Design, Networking, State, Helm) are where most exam points are won or lost.

01

Core Concepts · 3 lessons

The mental model every later module depends on. Master the control-plane / worker components, the Pod lifecycle, and the imperative-then-edit kubectl workflow that wins exam time. CKAD does not test you on installing a cluster — but if you don't know which component owns the symptom, you waste minutes debugging the wrong thing.

control-plane kube-apiserver etcd kubelet pod-lifecycle kubectl namespaces dry-run
~4h
Lesson 1.1 Kubernetes architecture for developers

CKAD doesn't ask you to install a cluster, but it does expect you to know which component to blame when something goes wrong. A Pod stuck Pending is a scheduler symptom; a Pod CrashLooping is a kubelet symptom; an unreachable Service is a kube-proxy symptom. The mental model is your fastest debugger.

Key concepts
  • kube-apiserver: the front door for every kubectl call and controller action; it serves HTTPS on port 6443 by default, authenticates, authorizes, and validates each request, and persists desired state to etcd.
  • etcd: the consistent key-value store that holds all cluster state — the source of truth. Backed up on real clusters; on CKAD you treat it as opaque storage.
  • kube-scheduler: watches Pods with no nodeName and picks a node based on requests, taints/tolerations, and affinity. A Pod stuck in Pending almost always means the scheduler can't satisfy the spec.
  • kube-controller-manager: runs control loops (Deployment, ReplicaSet, Job, EndpointSlice) that continuously reconcile desired vs actual state — this is what makes a Deployment self-heal.
  • kubelet + kube-proxy + container runtime: per-node agents. Kubelet starts and watches containers, kube-proxy maintains iptables/IPVS rules for Services, and the OCI-compliant runtime (containerd, CRI-O) actually runs the containers.
  • Diagnosis reflex: Pending → scheduler can't place it; ImagePullBackOff → kubelet can't fetch the image; CrashLoopBackOff → kubelet keeps restarting a failing container; cannot reach a Service → kube-proxy or NetworkPolicy.
Concrete example

Task: a Pod is stuck Pending. Inspect: kubectl describe pod webapp. The Events tail says "0/3 nodes are available: 3 Insufficient memory". That is the scheduler talking. Verify: kubectl describe nodes | grep -A 5 "Allocated resources" shows nodes already at 95% memory commitment. Fix: lower resources.requests.memory from 2Gi to 512Mi in the manifest and recreate the Pod (or edit the owning Deployment), because a running Pod's resource requests generally cannot be edited in place. The scheduler then places the new Pod. Same script, different opening event message, and you know which component to chase: that is the architecture model paying off on the clock.

Key takeaway: apiserver = gateway, etcd = data store, scheduler = Pod placement, controller-manager = reconciliation loops, kubelet/kube-proxy/runtime = per-node execution. Match the symptom to the component before you start typing kubectl commands at random.
Lesson 1.2 Pods & the Pod lifecycle

The Pod is the smallest deployable unit and every CKAD question lives or dies on knowing its anatomy. The five phases (Pending, Running, Succeeded, Failed, Unknown), the three restart policies, and the imperative shortcut to scaffold one are non-negotiable muscle memory.

Key concepts
  • Pod phases: Pending (accepted but not yet running — image pulling or unschedulable), Running (at least one container is up), Succeeded (all containers terminated with code 0), Failed (at least one container exited non-zero and won't restart), Unknown (kubelet lost contact).
  • Single vs multi-container: one Pod shares one network namespace, one IP, and any defined volumes. Multi-container Pods communicate via localhost and shared emptyDir — covered in depth in Module 3.
  • Pod YAML skeleton: apiVersion: v1, kind: Pod, metadata: { name, namespace, labels }, spec: { containers: [{ name, image, ports, env, resources, volumeMounts }] }. Memorize the shape.
  • Imperative scaffolder: kubectl run nginx --image=nginx --dry-run=client -o yaml > pod.yaml generates a valid manifest you then edit. Faster than typing YAML from scratch.
  • restartPolicy: Always (default, used by Deployments; restarts on any exit), OnFailure (used by Jobs; restarts only on non-zero exit), Never (never restarts — the container's exit is final).
  • Restart backoff: kubelet uses exponential backoff between restart attempts — 10s, 20s, 40s, capped at 5 minutes. CrashLoopBackOff is just the wait state, not a fatal error.
Concrete example

Task: deploy a single nginx Pod with the label tier=frontend and verify it reaches Running. Scaffold: kubectl run nginx --image=nginx --labels=tier=frontend --dry-run=client -o yaml > pod.yaml. Apply: kubectl apply -f pod.yaml. Watch: kubectl get pod nginx -w until it shows Running. Verify the label: kubectl get pod -l tier=frontend. If it stays Pending, kubectl describe pod nginx and read the Events tail — that's where the scheduler explains itself.
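
For reference, the scaffold from the task above looks roughly like this once the empty fields are trimmed. Treat it as a sketch: the exact output of kubectl run --dry-run=client -o yaml varies by kubectl version, and the comments are annotations added here, not generated text.

```yaml
# Trimmed sketch of pod.yaml as scaffolded by kubectl run (fields may differ by version)
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    tier: frontend        # comes from --labels=tier=frontend
spec:
  containers:
    - name: nginx         # kubectl run names the container after the Pod
      image: nginx
  restartPolicy: Always   # the default for a bare Pod
```

Editing this file and running kubectl apply -f pod.yaml is usually quicker than typing the skeleton from memory.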

Key takeaway: never hand-write a Pod YAML. kubectl run --dry-run=client -o yaml is the universal scaffolder. The five phases + three restart policies are the Pod's whole behaviour model.
Lesson 1.3 kubectl, namespaces & API primitives

CKAD is a typing exam. Your timer drains while you reach for documentation. Every minute spent hunting for a field name is one less minute spent solving the next question. kubectl explain, the imperative-then-edit reflex, and a sensible default namespace are how you reclaim those minutes.

Key concepts
  • Namespaces: logical partitions for namespaced resources. Defaults: default, kube-system, kube-public, kube-node-lease. Create with kubectl create namespace dev.
  • Resource scope: Pods, Deployments, Services, ConfigMaps, Secrets are namespace-scoped. Nodes, PersistentVolumes, ClusterRoles, Namespaces are cluster-scoped. List with kubectl api-resources --namespaced=true|false.
  • Set default namespace: kubectl config set-context --current --namespace=dev removes the need to repeat -n dev on every command. Always set it at the start of a multi-step exam task.
  • Core kubectl verbs: get (list), describe (deep dive + events), create (imperative create), apply -f (declarative create-or-update), edit (in-place edit), delete, explain (inline API docs).
  • Output formats: -o yaml (full spec), -o wide (extra columns: node, IP), -o jsonpath='{.status.containerStatuses[0].state}' (field-precise extraction).
  • API groups: core v1 (Pods, Services, ConfigMaps), apps/v1 (Deployments, StatefulSets, DaemonSets), batch/v1 (Jobs, CronJobs), networking.k8s.io/v1 (Ingress, NetworkPolicy). List with kubectl api-versions.
Concrete example

Task: scaffold a Deployment web with 3 replicas of nginx in a new namespace dev, then expose it as a ClusterIP. Setup: kubectl create namespace dev && kubectl config set-context --current --namespace=dev. Deployment: kubectl create deployment web --image=nginx --replicas=3. Service: kubectl expose deployment web --port=80 --target-port=80. Verify: kubectl get deploy,svc,pods -l app=web. Need a field you forgot? kubectl explain service.spec — inline docs, no browser.
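
If you prefer to see what those two imperative commands create, the declarative equivalent is sketched below. It assumes the labels kubectl normally applies (app=web) and a container named nginx; the generated objects carry a few extra defaulted fields that are omitted here.

```yaml
# Approximate declarative equivalent of "kubectl create deployment" + "kubectl expose"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: dev
  labels:
    app: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web            # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: dev
  labels:
    app: web
spec:
  type: ClusterIP         # the default Service type
  selector:
    app: web              # routes to the Deployment's Pods
  ports:
    - port: 80
      targetPort: 80
```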

Key takeaway: set the default namespace as your first action on any task. kubectl explain is your offline manual. The verbs get/describe/create/apply/edit/delete/explain cover 90% of the exam keystrokes.
02

Configuration · 3 lessons

The 25%-weight Configuration & Security domain — the single biggest scoring area. ConfigMaps and Secrets injected as env vars or volume mounts, resource requests + limits + QoS classes, SecurityContext at Pod + container scope, ServiceAccount tokens, and the immutability flag that production environments lean on. Most points on the exam pass through this module.

configmap secret requests-limits qos-classes security-context serviceaccount capabilities immutable
~5h
Lesson 2.1 ConfigMaps & Secrets

ConfigMaps and Secrets are how you decouple configuration from container images. CKAD tasks routinely ask you to create one and then consume it in one of three ways: as individual env vars (valueFrom), as a whole-map import (envFrom), or as a volume. You need the imperative shortcuts in your fingers.

Key concepts
  • ConfigMap creation: kubectl create configmap app-config --from-literal=LOG_LEVEL=debug --from-literal=ENV=staging, or --from-file=app.properties, or --from-env-file=.env.
  • Secret creation: kubectl create secret generic db-creds --from-literal=user=admin --from-literal=password=s3cr3t. Types: Opaque (default), kubernetes.io/tls, kubernetes.io/dockerconfigjson.
  • Consume as env vars (per key): env: [{ name: LOG_LEVEL, valueFrom: { configMapKeyRef: { name: app-config, key: LOG_LEVEL } } }]. Secret variant uses secretKeyRef.
  • Consume as env vars (whole map): envFrom: [{ configMapRef: { name: app-config } }] — exposes every key as an env var with the same name.
  • Consume as volume: mount the ConfigMap/Secret at a path; each key becomes a file. Volume-mounted ConfigMaps auto-update on change (with a propagation delay of up to a minute); env-var ConfigMaps do not.
  • Immutable flag: immutable: true is a top-level field of the ConfigMap/Secret (a sibling of data, not nested under a spec). It prevents changes (the object must be deleted and recreated to modify) and reduces apiserver watch load. Use for production configs that should never silently drift.
Concrete example

Task: create a ConfigMap with two keys, inject one as an env var and the other as a file. Create: kubectl create configmap web-config --from-literal=GREETING=hello --from-literal=index.html='<h1>Hi</h1>'. Pod manifest: scaffold with kubectl run web --image=nginx --dry-run=client -o yaml > web.yaml, then edit to add an env entry pulling GREETING from the ConfigMap, and a volumes + volumeMounts pair that maps index.html into /usr/share/nginx/html/. Verify: kubectl exec web -- env | grep GREETING and kubectl exec web -- cat /usr/share/nginx/html/index.html.
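
A minimal sketch of the finished manifests for this task follows. The volume name html is an arbitrary choice made for illustration; everything else mirrors the names used in the task.

```yaml
# ConfigMap with two keys: one consumed as an env var, one as a file
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-config
data:
  GREETING: hello
  index.html: "<h1>Hi</h1>"
---
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx
      env:
        - name: GREETING
          valueFrom:
            configMapKeyRef:
              name: web-config
              key: GREETING
      volumeMounts:
        - name: html                  # hypothetical volume name
          mountPath: /usr/share/nginx/html
  volumes:
    - name: html
      configMap:
        name: web-config
        items:                        # project only index.html, not GREETING
          - key: index.html
            path: index.html
```

Note that mounting a volume at /usr/share/nginx/html hides whatever the image shipped in that directory, so only index.html from the ConfigMap is visible there.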

Key takeaway: Secrets are only base64-encoded by default, not encrypted. Prefer volume mounts over env vars for sensitive data: env vars are visible to every process in the container and easily end up in logs and crash dumps. Volume-mounted ConfigMaps hot-update; env-var ones do not.
Lesson 2.2 Resource requests, limits & QoS classes

Requests drive scheduling, limits drive throttling and OOM. The QoS class your Pod lands in (Guaranteed, Burstable, BestEffort) decides who gets evicted first when a node runs out of memory. On the exam, a Pending Pod or an OOMKilled container traces back here.

Key concepts
  • requests: the floor the scheduler reserves on the chosen node. limits: the ceiling the kubelet enforces — CPU over-limit = throttle, memory over-limit = OOMKill (exit 137).
  • YAML shape: resources: { requests: { cpu: "250m", memory: "128Mi" }, limits: { cpu: "500m", memory: "256Mi" } }. CPU 1000m = 1 core; memory units Ki / Mi / Gi (binary, preferred) or K / M / G (decimal).
  • QoS classes: Guaranteed (every container has requests = limits for both CPU + memory), Burstable (at least one container sets a request or limit, but the Pod does not meet the Guaranteed rule), BestEffort (no requests/limits anywhere).
  • Eviction order: under MemoryPressure / DiskPressure, kubelet evicts BestEffort first, then Burstable (highest memory usage relative to request first), and only touches Guaranteed Pods as a last resort.
  • LimitRange: namespace-scoped policy that supplies default requests/limits to Pods that omit them, plus min/max constraints. Without it, a Pod with no resources falls into BestEffort.
  • ResourceQuota: caps the namespace's total requests.cpu, limits.memory, Pod count, etc. When a Quota is in effect for CPU/memory, every Pod must explicitly set requests + limits or it is rejected.
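
To make the LimitRange bullet concrete, here is a small sketch. The object name, the dev namespace, and all the numbers are illustrative assumptions, not values the exam prescribes.

```yaml
# LimitRange sketch: defaults and a ceiling for containers in one namespace
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults    # hypothetical name
  namespace: dev
spec:
  limits:
    - type: Container
      defaultRequest:         # applied when a container omits requests
        cpu: 250m
        memory: 128Mi
      default:                # applied when a container omits limits
        cpu: 500m
        memory: 256Mi
      max:                    # containers asking for more are rejected
        cpu: "1"
        memory: 512Mi
```

With this in place, a Pod created in dev without a resources block would land in Burstable rather than BestEffort, because the defaults give it requests below its limits.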
Concrete example

Task: a Pod is OOMKilled repeatedly. Diagnose: kubectl describe pod app shows Last State: Terminated, Reason: OOMKilled, Exit Code: 137. kubectl top pod app shows it's hitting 256Mi, which is exactly its limit. Fix: edit the Deployment to raise the limit (kubectl set resources deployment/app --limits=memory=512Mi), or shrink the application's working set. Verify QoS: kubectl get pod app -o jsonpath='{.status.qosClass}'. Burstable becomes Guaranteed only when requests equal limits for both CPU and memory in every container.
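
As a sketch of the Guaranteed end state, the resources block below sets requests equal to limits for both CPU and memory. The CPU figure and the nginx image are placeholders chosen for illustration; only the 512Mi memory limit comes from the task.

```yaml
# Pod that qualifies for the Guaranteed QoS class
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      resources:
        requests:
          cpu: 500m           # illustrative value
          memory: 512Mi
        limits:
          cpu: 500m           # equal to the request
          memory: 512Mi       # equal to the request
```

Dropping or lowering either request while keeping the limits would move the Pod back to Burstable.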

Key takeaway: requests = what the scheduler reserves, limits = what kubelet enforces. Guaranteed (requests = limits) gives the strongest eviction protection. OOMKilled (exit 137) = memory limit hit; raise it or trim the app.
Lesson 2.3 SecurityContext & ServiceAccounts

SecurityContext is where the exam tests the "run as non-root, read-only filesystem, drop all caps" pattern that production K8s deployments live by. ServiceAccount is how a Pod authenticates to the apiserver — and getting the auto-mount flag wrong silently widens the blast radius of any workload compromise.

Key concepts
  • Pod-level SecurityContext: spec.securityContext: { runAsUser: 1000, runAsGroup: 3000, fsGroup: 2000, runAsNonRoot: true }. fsGroup chowns mounted volumes to the group so the app can write to them.
  • Container-level SecurityContext: overrides Pod-level. Key fields: runAsNonRoot: true (refuse to start if image specifies root), readOnlyRootFilesystem: true (block writes to /), allowPrivilegeEscalation: false, privileged: false.
  • Linux capabilities: drop everything then re-add what you need — capabilities: { drop: ["ALL"], add: ["NET_BIND_SERVICE"] }. Common adds: NET_BIND_SERVICE (port < 1024), SYS_TIME, CHOWN.
  • ServiceAccount basics: each namespace has a default SA. Pods authenticate to the apiserver using the SA's token. Create dedicated SAs for workloads that need API access: kubectl create serviceaccount deployer.
  • Assign SA to a Pod: spec.serviceAccountName: deployer. Pair with a RoleBinding granting the SA the exact API verbs it needs — never reuse default.
  • automountServiceAccountToken: set to false on the Pod (or the SA) for workloads that don't talk to the apiserver. Reduces attack surface — a compromised container with no token can't enumerate the API.
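
The ServiceAccount bullets above can be sketched as one small set of manifests. The Role name, binding name, Pod name, dev namespace, and the chosen verbs are all hypothetical; only the deployer ServiceAccount name comes from the text.

```yaml
# Dedicated ServiceAccount with a narrowly scoped Role, used by one Pod
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deployer
  namespace: dev
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-editor     # hypothetical name
  namespace: dev
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding      # hypothetical name
  namespace: dev
subjects:
  - kind: ServiceAccount
    name: deployer
    namespace: dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-editor
---
apiVersion: v1
kind: Pod
metadata:
  name: deploy-bot            # hypothetical name
  namespace: dev
spec:
  serviceAccountName: deployer
  containers:
    - name: deploy-bot
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
```

For a workload that never calls the API, the opposite move applies: leave serviceAccountName alone and set automountServiceAccountToken: false under the Pod spec.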
Concrete example

Task: harden an nginx Pod so it cannot run as root, cannot write to its root filesystem, and cannot escalate privileges. Scaffold: kubectl run web --image=nginx --dry-run=client -o yaml > web.yaml. Edit: add a container-level securityContext with runAsNonRoot: true, runAsUser: 101 (nginx's UID inside the image), readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, capabilities.drop: ["ALL"] + add: ["NET_BIND_SERVICE"]. Add emptyDir volumes mounted at /var/cache/nginx and /var/run, since the root FS is now read-only and nginx still needs to write there. Verify: kubectl exec web -- id returns UID 101; kubectl exec web -- touch /tmp.lock fails.
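
The same hardening fields are sketched below on a busybox Pod rather than nginx, so the example does not depend on how a particular nginx image handles non-root users or low ports. The Pod name, the UID/GID values, and the scratch volume are assumptions made for illustration.

```yaml
# Hardened Pod sketch: non-root, read-only root filesystem, no capabilities
apiVersion: v1
kind: Pod
metadata:
  name: hardened              # hypothetical name
spec:
  securityContext:            # Pod-level: applies to every container
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      securityContext:        # container-level: overrides or extends Pod-level
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: scratch       # the only writable path in the container
          mountPath: /tmp
  volumes:
    - name: scratch
      emptyDir: {}
```

After applying it, kubectl exec hardened -- id should report UID 1000, a write under /tmp should succeed, and a write anywhere else on the root filesystem should be refused.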

Key takeaway: Pod-level securityContext applies to all containers + volumes; container-level overrides for a specific container. Always pair runAsNonRoot with an explicit numeric runAsUser so the kubelet can verify the container is not root before starting it. Disable automountServiceAccountToken on Pods that don't need the API.
03

Multi-Container Pods · 3 lessons

Multi-container Pods drive a chunk of Application Design (20%). The four patterns — Sidecar, Init, Ambassador, Adapter — each have a canonical use case the exam can dress up with different application names. Sidecar (log shipper, proxy), Init (wait-for + migration), Ambassador (smart proxy to external services), Adapter (output reshaper) — knowing which to reach for is half the battle.

sidecar init-container ambassador adapter shared-volume empty-dir
~3h
Lesson 3.1 Sidecar pattern — log shippers, proxies, sync agents

A sidecar runs alongside the main container in the same Pod, sharing network and volumes. It is the workhorse of the multi-container patterns — Istio, Linkerd, Fluentd, and git-sync all use it. CKAD often asks you to "add a logging sidecar that reads /var/log/app.log and forwards it" — and that wording maps to a single canonical YAML shape.

Key concepts
  • Sidecar = same-Pod helper: two (or more) containers in spec.containers share the network namespace (localhost), the IPC namespace, and any volumes you define at Pod scope.
  • Log-shipper pattern: main container writes logs to a shared emptyDir at /var/log; sidecar (Fluentd, Fluent Bit, Vector) tails the files and forwards to Elasticsearch / Loki / S3.
  • Proxy sidecar: the sidecar (Envoy, HAProxy) intercepts inbound or outbound traffic for the main container — TLS termination, mTLS, retries, circuit breaking. Service meshes inject this automatically.
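  • Sync sidecar: pulls data from an external source (Git repo, S3 bucket) into a shared volume the main container reads. git-sync is the canonical example (originally published as k8s.gcr.io/git-sync/git-sync:v3; the k8s.gcr.io registry has since been superseded by registry.k8s.io).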
  • Shared emptyDir: the universal communication channel — declared once under spec.volumes, mounted at the right path in each container under volumeMounts. Lives for the Pod's lifetime, dies with it.
  • Lifecycle coupling: the Pod is "Running" while any container is up. To terminate the Pod cleanly, all containers must stop — design sidecars to exit on SIGTERM, not retry forever.
Concrete example

Task: add a busybox sidecar to an existing Pod so that the sidecar continuously prints the main container's log file. Edit the manifest: define an emptyDir volume shared-logs; mount it at /var/log in the main container; add a second container log-tailer with image busybox, command ['sh', '-c', 'tail -f /var/log/app.log'], and the same volumeMount. Apply with kubectl apply -f pod.yaml. Verify: kubectl logs <pod> -c log-tailer streams the main container's log. Two containers, one volume, zero application changes.
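
A sketch of the resulting manifest is shown below. The main container here is a busybox stand-in that appends a timestamp every few seconds, and the Pod name is made up; in the real task you would keep the existing main container and add only the volume, the mount, and the log-tailer container.

```yaml
# Sidecar sketch: two containers sharing one emptyDir
apiVersion: v1
kind: Pod
metadata:
  name: app-with-tailer       # hypothetical name
spec:
  containers:
    - name: app               # stand-in for the real application
      image: busybox
      command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 5; done"]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log
    - name: log-tailer
      image: busybox
      # touch guards against the sidecar starting before the app creates the file
      command: ["sh", "-c", "touch /var/log/app.log && tail -f /var/log/app.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log
  volumes:
    - name: shared-logs
      emptyDir: {}
```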

Key takeaway: the sidecar shape is always the same — declare the shared volume at Pod scope, mount it in both containers, design the sidecar to share state through the filesystem or localhost. emptyDir is the default channel; use medium: Memory for tmpfs when you care about speed.
Lesson 3.2 Init containers — wait-for, migrate, prepare

Init containers run one after another before the main containers start. They are the canonical answer to "wait for the database to be ready" and "run the schema migration before the app boots". On the exam, "the Pod must not start its main container until X is true" almost always means "use an init container".

Key concepts
  • Location: defined under spec.initContainers (not spec.containers). They run sequentially, in order, and each must exit 0 before the next starts.
  • Failure handling: if an init container fails, the kubelet restarts it per the Pod's restartPolicy. Main containers never start until all init containers succeed once.
  • Wait-for pattern: command: ['sh', '-c', 'until nslookup mysql; do echo waiting for db; sleep 2; done'] — blocks the Pod until the named Service resolves.
  • Migration pattern: run flyway migrate, rake db:migrate, or alembic upgrade head in an init container so the app boots into a fully-migrated schema. No race conditions, no startup-time migration in app code.
  • File-prep pattern: generate config from a template, set filesystem permissions, fetch a TLS cert — anything that must exist before the app reads from a shared volume.
  • Debugging: kubectl logs <pod> -c <init-container-name> to view init output; kubectl describe pod shows the init container's status (Init:0/2, Init:1/2, etc.).
Concrete example

Task: a web app must not start until the mysql Service is reachable. Edit the manifest: under spec.initContainers, add a container named wait-for-db with image busybox:1.28 and command ['sh', '-c', 'until nslookup mysql.default.svc.cluster.local; do echo waiting; sleep 2; done']. Apply. Until the mysql Service name resolves, kubectl get pod shows Init:0/1. As soon as it resolves, the init container exits 0 and the main container starts. Note that nslookup only proves the Service exists in DNS, not that MySQL is accepting connections; a stricter check would probe the port. Verify: kubectl logs <pod> -c wait-for-db shows the polling output.
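
Put together, the manifest for this task might look like the sketch below. The Pod name web and the nginx main container are placeholders; the init container matches the one described above.

```yaml
# Init container sketch: block the main container until the Service name resolves
apiVersion: v1
kind: Pod
metadata:
  name: web                   # placeholder name
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.28
      command:
        - sh
        - -c
        - until nslookup mysql.default.svc.cluster.local; do echo waiting; sleep 2; done
  containers:
    - name: web               # starts only after wait-for-db exits 0
      image: nginx            # placeholder image
```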

Key takeaway: init containers run sequentially and must each succeed before the main containers start. Perfect for wait-for-service, database migrations, config templating, and permission setup. Status format Init:N/M in kubectl get pod tells you which init is running.
Lesson 3.3 Ambassador & adapter patterns

Ambassador and adapter are sidecar specializations the exam tests conceptually — "which pattern decouples the application from external service discovery" or "which pattern reshapes the application's output for a standardised consumer". Knowing the wording mapping is fast points.

Key concepts
  • Ambassador (outbound proxy): the sidecar proxies the main container's outbound calls to external services. The main container connects to localhost; the ambassador handles discovery, sharding, connection pooling, retry, protocol translation.
  • Ambassador example: a Redis client that always talks to localhost:6379; the ambassador sidecar routes to the correct Redis shard based on the key — the application stays simple.
  • Adapter (output reshaper): the sidecar transforms the main container's output into a standardized format. The main container is free to emit anything; the adapter exposes the canonical format.
  • Adapter example: a legacy app emits custom metrics on a TCP socket; the adapter sidecar consumes them and exposes /metrics in Prometheus format so the rest of the platform's tooling Just Works.
  • Pattern wording cheat-sheet: "shared volume + log forwarder" → Sidecar. "wait for X / set up before app starts" → Init. "main container talks to localhost, sidecar handles external" → Ambassador. "sidecar exposes a standard interface over the main app's output" → Adapter.
  • All three patterns share the sidecar mechanics — multiple containers in one Pod, shared volume or localhost. The pattern name describes the intent, not a different YAML shape.
Concrete example

Task: a legacy app exposes metrics on a custom TCP socket; the cluster monitors everything via Prometheus pull. Solution: add an adapter sidecar (e.g. a small Python container) that reads the TCP stream and exposes a /metrics HTTP endpoint in Prometheus exposition format. The application stays unchanged. A Service points Prometheus at the adapter's port. Identify it on the exam: the keyword is "reshape", "translate", "standardise" — that's adapter. If the keyword were "wait for", that's init; "log forwarder", that's sidecar; "proxy out to external service", that's ambassador.

Key takeaway: Sidecar, Init, Ambassador, Adapter are four names for the same shared-Pod mechanic. The intent matters more than the YAML — match the question's verb (wait, forward, proxy, reshape) to the right pattern.
04

Observability & Maintenance · 3 lessons

The 15%-weight Observability & Maintenance domain. Liveness / Readiness / Startup probes (a trio that, when misconfigured, breaks more deployments than almost anything else), kubectl logs with the right flags, and the describe → logs → exec → debug ladder you climb when a container refuses to behave. Light on weight, dense in points.

liveness-probe readiness-probe startup-probe kubectl-logs kubectl-top kubectl-debug events
~4h
Lesson 4.1 Liveness, Readiness & Startup probes

Probes decide three things — when to restart your container (liveness), when to send it traffic (readiness), and when to wait longer for it to start (startup). Wire them wrong and your slow-booting app gets killed before it warms up, or your dead app keeps receiving requests. The exam tests the YAML shape and the failure-threshold math.

Key concepts
  • Liveness probe: if it fails failureThreshold times in a row, kubelet kills the container and restarts it. Use for processes that can hang or deadlock without crashing.
  • Readiness probe: if it fails, the Pod is removed from all Service endpoint sets (no traffic) but the container is not restarted. Use for warm-up time, dependency checks, or shedding load.
  • Startup probe: blocks liveness + readiness until it succeeds once. Use for slow boot times so liveness's short interval doesn't kill the container during initialization.
  • Probe types: httpGet { path, port } (2xx/3xx = success), tcpSocket { port } (connect = success), exec { command: ['cat', '/tmp/ready'] } (exit 0 = success), grpc { port } on Kubernetes 1.24+.
  • Tuning fields: initialDelaySeconds, periodSeconds (default 10), timeoutSeconds (default 1), successThreshold (default 1), failureThreshold (default 3). Total grace = periodSeconds × failureThreshold.
  • Anti-pattern: using the same endpoint for liveness and readiness — a dependency outage now restart-loops the app instead of just removing it from traffic. Keep them distinct.
Concrete example

Task: add a readiness probe that hits /healthz on port 8080, fails fast, and a liveness probe with a 30-second tolerance. Edit the container spec: readinessProbe: { httpGet: { path: /healthz, port: 8080 }, periodSeconds: 5, failureThreshold: 2 } (10s to remove from Service); livenessProbe: { httpGet: { path: /alive, port: 8080 }, initialDelaySeconds: 20, periodSeconds: 10, failureThreshold: 3 } (30s grace after warm-up). Verify: kubectl describe pod shows the probe definitions; kubectl get pod shows the Pod going Ready only after /healthz returns 200.
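
Written out in block style, the two probes from this task look like the sketch below. The Pod name and image reference are hypothetical; the probe values are the ones given above.

```yaml
# Probe sketch: fast readiness gate, more tolerant liveness check
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
        failureThreshold: 2       # 5s x 2 = about 10s to leave the Service
      livenessProbe:
        httpGet:
          path: /alive
          port: 8080
        initialDelaySeconds: 20
        periodSeconds: 10
        failureThreshold: 3       # 10s x 3 = about 30s before a restart
```

The two probes deliberately hit different paths, in line with the anti-pattern noted in the key concepts.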

Key takeaway: readiness gates traffic (no restart), liveness gates restart, startup gates the other two. For slow-boot apps, prefer a startup probe over inflating liveness initialDelaySeconds — cleaner separation of concerns.
Lesson 4.2 Container logging — stdout, multi-container, sidecar

Kubernetes captures everything containers write to stdout / stderr. kubectl logs is the first tool you reach for, and the candidates who pass know its flags cold. The exam loves a CrashLoopBackOff scenario where the answer is in the previous instance's logs.

Key concepts
  • Basic forms: kubectl logs <pod> (current instance); -f follow; --previous see the previous (crashed) instance's logs; --tail=N last N lines; --since=10m time-windowed.
  • Multi-container: kubectl logs <pod> -c <container-name> required when more than one container exists. --all-containers=true shows every container's logs interleaved.
  • Init container logs: same syntax — kubectl logs <pod> -c <init-container-name>. Essential for diagnosing why Init:0/1 is stuck.
  • Stdout convention: 12-factor apps log to stdout/stderr only. Apps that log to files need a logging sidecar that tails the file and re-streams it, otherwise kubectl logs shows nothing.
  • Node-level storage: container logs live as files under /var/log/pods/ on the node (symlinked from /var/log/containers/); kubelet rotates them by size. Cluster-level aggregation needs a DaemonSet (Fluentd / Fluent Bit) or a sidecar per Pod.
  • Selector-based logs: kubectl logs -l app=web --tail=20 --all-containers=true grabs logs from every Pod matching the selector. Faster than looping over kubectl get pods.
Concrete example

Task: a Pod is in CrashLoopBackOff and the current instance has nothing in its logs. Diagnose: kubectl get pod app -o wide shows 5 restarts. kubectl logs app is empty — the current container has just started and not yet emitted anything. Crucial flag: kubectl logs app --previous shows "FATAL: cannot connect to db: connection refused" from the last crashed instance. Fix: that's an init-container-wait-for-db problem (Module 3). Add the init container, redeploy, the loop stops.

Key takeaway: kubectl logs --previous is the single most under-used flag on the CKAD. CrashLoopBackOff hides the diagnostic message in the crashed instance — always check --previous. Multi-container Pods need -c.
Lesson 4.3 Monitoring & debugging — top, describe, debug

When logs aren't enough, you climb the debug ladder: get → describe → logs → top → exec → debug. Each rung shows you something the last one doesn't. Memorise the order, because under exam pressure you will skip a rung and waste five minutes.

Key concepts
  • kubectl top: kubectl top pod + kubectl top node show live CPU/memory. Needs Metrics Server installed (it is on every CKAD exam env). --containers breaks down per-container.
  • kubectl describe: dumps spec + status + conditions + Events. The Events tail at the bottom is where the scheduler, kubelet, and controllers explain themselves — always read it.
  • Events: chronological log of state changes, with reasons such as Scheduled, Pulled, Created, Started, Killing, Unhealthy (a failed probe), and BackOff. Cluster-wide: kubectl get events --sort-by=.metadata.creationTimestamp. Events expire after ~1 hour.
  • kubectl exec: kubectl exec -it <pod> -- /bin/sh drops you into the running container. With multi-container Pods add -c <container-name>. Useful for runtime inspection — env vars, mounted files, listening ports.
  • kubectl debug — ephemeral containers: for distroless / scratch images that lack a shell, kubectl debug -it <pod> --image=busybox --target=<container-name> injects a temp container into the running Pod and shares its process namespace.
  • kubectl debug (node): kubectl debug node/<node> -it --image=busybox creates a Pod in the node's host namespaces with the node's root filesystem mounted at /host. The Pod is not privileged by default. Use it for last-resort node-level diagnosis.
Concrete example

Task: a Pod from a distroless image keeps restarting with no logs. Climb the ladder: kubectl get pod shows CrashLoopBackOff. kubectl describe pod app Events tail says "Liveness probe failed: ... connect: connection refused". kubectl logs app --previous is empty (the app wrote nothing to stdout before it was killed). Last rung: kubectl debug -it app --image=busybox --target=app, then inside the debug container ps shows the main process is alive and netstat -tln shows it listening on a different port from the one the probe targets. Fix the probe's port field, redeploy.
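
The probe fix at the end of that ladder might look like the fragment below. This is a sketch under stated assumptions: port 8080, the /healthz path, and the image name are hypothetical stand-ins for whatever the debug session actually reveals.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: example/app:1.0      # hypothetical distroless image
      ports:
        - containerPort: 8080     # assumed: the port the process really listens on
      livenessProbe:
        httpGet:
          path: /healthz          # assumed health endpoint
          port: 8080              # must match the listening port found during debugging
        initialDelaySeconds: 5
        periodSeconds: 10
```

Probe fields on a bare Pod cannot be edited in place, so for a Pod you delete and recreate it; for a Deployment, editing the template triggers a rollout.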

Key takeaway: the debug ladder is get → describe → logs (--previous) → top → exec → debug. For distroless images, kubectl debug --target shares the target container's process namespace and gives you a real shell. The Events tail of kubectl describe is where scheduling, image-pull, and probe failures reveal themselves.
05

Pod Design & Workloads · 3 lessons

Application Design & Build (20%) and Application Deployment (20%) intersect here. Labels and selectors drive everything from Services to Deployments; Deployments + rolling updates + rollback are the default workload; Jobs and CronJobs cover one-shot and scheduled work. Pick the wrong workload kind and the rest of your manifest can be perfect but the test still fails.

labels selectors annotations deployment rolling-update rollback job cronjob
~5h
Lesson 5.1 Labels, selectors & annotations

Labels are the connective tissue of Kubernetes. Services find Pods by label, Deployments own ReplicaSets by label, NetworkPolicies scope by label. The exam frequently asks you to add a label to a running resource, then verify a selector picks it up.

Key concepts
  • Labels: key-value pairs on any object — app: web, tier: frontend, env: prod. Add live: kubectl label pod nginx env=prod; remove: kubectl label pod nginx env-.
  • Equality selectors: =, ==, !=. CLI: kubectl get pods -l env=prod,tier=frontend (multiple are ANDed).
  • Set-based selectors: in, notin, and exists (a bare key). CLI: kubectl get pods -l 'env in (prod,staging)', or kubectl get pods -l env for "has the key at all"; YAML: matchExpressions: [{ key: env, operator: In, values: [prod, staging] }], with operators In, NotIn, Exists, DoesNotExist.
  • matchLabels vs matchExpressions: Deployments and ReplicaSets use spec.selector.matchLabels (simple) or matchExpressions (richer). Whatever the selector says, the Pod template's labels must satisfy it. The API server rejects a Deployment whose selector does not match its template labels, and the selector is immutable once the Deployment exists.
  • Recommended labels: app.kubernetes.io/name, /instance, /version, /component, /managed-by. Charts scaffolded with helm create set most of these for you; it is good hygiene to follow them on your own resources.
  • Annotations: non-identifying metadata — build SHA, git commit, ingress controller hints (nginx.ingress.kubernetes.io/rewrite-target: /). Can be larger and richer than labels; not selectable.
Concrete example

Task: label every nginx Pod with env=prod, then list only the Pods that match. Bulk-label: kubectl label pods -l app=nginx env=prod. Verify: kubectl get pods -l env=prod -L env,tier shows the new label column. Selector test on a Service: kubectl get endpoints my-svc — only Pods matching the Service's spec.selector appear as endpoints. Change a Pod's label so the selector no longer matches and watch its IP drop from the endpoints — that is exactly how selectors gate traffic.
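
As an illustration of how a Service selector gates traffic, here is a minimal sketch of my-svc selecting on both labels from the task. The annotation key and value are hypothetical, included only to show that annotations sit beside labels but never take part in selection.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
  annotations:
    example.com/git-commit: "abc1234"   # hypothetical; annotations are not selectable
spec:
  selector:            # both labels must be present on a Pod (ANDed)
    app: nginx
    env: prod
  ports:
    - port: 80
      targetPort: 80
```

With this selector, running kubectl label pod <pod> env- on one Pod should remove that Pod's IP from kubectl get endpoints my-svc.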

Key takeaway: Services + Deployments + NetworkPolicies all key off labels. A Service selector that matches nothing is a silent bug: the Service is created but has no endpoints. A Deployment selector that doesn't match its template labels is rejected outright. kubectl get endpoints is the fast way to verify a Service is actually finding its Pods.
Lesson 5.2 Deployments — rolling updates, rollback, scaling

The Deployment is the default workload type on the CKAD. You need fluency in kubectl set image, kubectl rollout, and the rolling-update knobs (maxSurge + maxUnavailable) — they are tested on every attempt.

Key concepts
  • Imperative create: kubectl create deployment web --image=nginx:1.24 --replicas=3. Scaffold-then-edit: --dry-run=client -o yaml > web.yaml.
  • RollingUpdate strategy (default): maxSurge (extra new Pods beyond replicas), maxUnavailable (how many old Pods can be down). Defaults: 25% / 25%. Set to 0 / 25% for surge-free updates.
  • Recreate strategy: kills all old Pods before starting new — causes downtime. Use only when the app can't run two versions side by side (schema-incompatible migrations).
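  • Update an image: kubectl set image deployment/web nginx=nginx:1.25 kicks off a rolling update (the nginx before the = is the container name). Watch with kubectl rollout status deployment/web. The --record flag is deprecated; to describe a revision, set the kubernetes.io/change-cause annotation with kubectl annotate.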
  • Rollback: kubectl rollout history deployment/web lists revisions; kubectl rollout undo deployment/web reverts to the previous; --to-revision=2 jumps to a specific one.
  • Pause / resume / scale: kubectl rollout pause deployment/web freezes mid-rollout for inspection; resume continues. Scale: kubectl scale deployment web --replicas=5. Autoscale: kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=70.
Concrete example

Task: deploy nginx with 4 replicas, update to a new image with zero unavailable Pods, then roll back. Create: kubectl create deployment web --image=nginx:1.24 --replicas=4. Edit the strategy: kubectl edit deployment web, set spec.strategy.rollingUpdate.maxSurge: 1 and maxUnavailable: 0. Update: kubectl set image deployment/web nginx=nginx:1.25. Watch: kubectl rollout status deployment/web. Roll back (one-liner): kubectl rollout undo deployment/web reverts to the previous revision with no manifest editing.
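
For reference, the Deployment after the strategy edit would look roughly like this. The app: web label and the nginx container name are what kubectl create deployment web --image=nginx:1.24 generates; everything else comes from the task.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 5 Pods exist during the rollout
      maxUnavailable: 0    # never drop below 4 Ready Pods
  template:
    metadata:
      labels:
        app: web           # must satisfy spec.selector
    spec:
      containers:
        - name: nginx      # the name used in `kubectl set image ... nginx=...`
          image: nginx:1.24
```

maxSurge and maxUnavailable cannot both be 0, since the rollout would have no room to make progress.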

Key takeaway: kubectl set image + kubectl rollout undo is your two-command rollout toolkit. maxSurge/maxUnavailable let you choose between fast (more surge) and resource-light (more unavailable). Always verify with kubectl rollout status.
Lesson 5.3 Jobs & CronJobs — one-shot and scheduled work

Deployments are for long-running services. Jobs are for one-shot work that runs to completion (data migration, batch job). CronJobs schedule Jobs on cron syntax. The exam always asks for parallel, retry-on-failure, or history-limit variants — knowing the YAML fields cold is the difference between 30 seconds and 5 minutes per question.

Key concepts
  • Job basics: apiVersion: batch/v1, kind: Job. Imperative: kubectl create job migrate --image=migrator:1.0. Pod's restartPolicy must be OnFailure or Never.
  • Completions + parallelism: spec.completions: 5 + parallelism: 2 = run until 5 successful exits, with up to 2 Pods active at a time. Default: 1/1.
  • backoffLimit: total number of failed retries before the Job is marked Failed. Default 6. Set to 0 for "fail fast, don't retry".
  • activeDeadlineSeconds: kills the Job after N seconds regardless of progress. Belt-and-suspenders against runaway batches.
  • CronJob basics: kind: CronJob, spec.schedule: "*/5 * * * *" (every 5 minutes). Imperative: kubectl create cronjob hello --image=busybox --schedule="*/5 * * * *" -- echo hi.
  • Concurrency + history: concurrencyPolicy: Forbid|Allow|Replace, successfulJobsHistoryLimit: 3, failedJobsHistoryLimit: 1. Forbid prevents overlap when a Job overruns the cron interval.
Concrete example

Task: run a Job that prints "hello" exactly 4 times, with up to 2 Pods at once, and fails fast (no retries). Scaffold: kubectl create job greet --image=busybox --dry-run=client -o yaml -- echo hello > job.yaml. Edit: add spec.completions: 4, parallelism: 2, backoffLimit: 0. Apply + watch: kubectl apply -f job.yaml && kubectl get pods -w -l job-name=greet — 2 Pods at a time, 4 successes, Job marked Complete. Now wrap it on a 5-minute cron: kubectl create cronjob greet-cron --image=busybox --schedule="*/5 * * * *" -- echo hello.
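
The edited job.yaml and the CronJob equivalent could look like the sketch below. Field values come from the task; the concurrencyPolicy and history limits on the CronJob are illustrative additions, not part of the original ask.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: greet
spec:
  completions: 4        # run until 4 Pods exit successfully
  parallelism: 2        # at most 2 Pods active at once
  backoffLimit: 0       # fail fast, no retries
  template:
    spec:
      restartPolicy: Never   # Jobs accept only Never or OnFailure
      containers:
        - name: greet
          image: busybox
          command: ["echo", "hello"]
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: greet-cron
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid          # illustrative: skip a run if the last one is still going
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: greet-cron
              image: busybox
              command: ["echo", "hello"]
```

Note the extra nesting on the CronJob: the Job spec lives under spec.jobTemplate.spec, and the Pod spec one level further under template.spec.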

Key takeaway: Job = run-to-completion, CronJob = scheduled Job. Pod restartPolicy must be OnFailure or Never. completions/parallelism/backoffLimit/activeDeadlineSeconds are the four knobs you'll be asked to set.
06

Services & Networking · 3 lessons

The 20%-weight Services & Networking domain. Services (ClusterIP / NodePort / LoadBalancer / ExternalName) glue your Pods to the rest of the world; NetworkPolicy is how you firewall them off; Ingress is the HTTP front door. CKAD doesn't test you on building a CNI — it tests you on picking the right Service type and writing an ingress + policy YAML.

clusterip nodeport loadbalancer endpoints network-policy ingress dns
~4h
Lesson 6.1 Services — ClusterIP, NodePort, LoadBalancer, ExternalName

A Service gives a stable IP + DNS name to a set of Pods selected by label. Pick the wrong type and your app is either unreachable or accidentally public — both lose exam points.

Key concepts
  • ClusterIP (default): stable virtual IP reachable only from inside the cluster; backed by kube-proxy iptables/IPVS rules. DNS name: my-svc.<ns>.svc.cluster.local.
  • NodePort: exposes the Service on a static port (30000–32767) on every node's IP. Reach with http://<any-node-ip>:<nodePort>. Implies a ClusterIP underneath.
  • LoadBalancer: provisions a cloud load balancer (AWS NLB, GCP LB, Azure LB) and routes external traffic to the Service. Implies NodePort + ClusterIP. On bare metal needs MetalLB or similar.
  • ExternalName: no selector, no endpoints — just a DNS CNAME from inside the cluster to an external hostname (spec.externalName: api.example.com). Useful for migrating off external dependencies.
  • Imperative expose: kubectl expose deployment web --port=80 --target-port=8080 --type=ClusterIP. --type=NodePort or LoadBalancer to switch.
  • Endpoints: kubectl get endpoints my-svc shows the Pod IPs the Service is routing to. Empty = your spec.selector doesn't match any Pod labels. First thing to check when a Service appears broken.
Concrete example

Task: expose a 3-replica nginx Deployment externally on port 30080 of every node. Create: kubectl create deployment web --image=nginx --replicas=3. Expose: kubectl expose deployment web --port=80 --target-port=80 --type=NodePort, then kubectl edit svc web and set nodePort: 30080. Verify: kubectl get svc web -o wide shows the nodePort; kubectl get endpoints web shows 3 Pod IPs; curl http://<any-node-ip>:30080 returns the nginx welcome page.
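
The Service after the nodePort edit would look roughly like this. The app: web selector is what kubectl expose copies from a Deployment created with kubectl create deployment web.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80          # ClusterIP port inside the cluster
      targetPort: 80    # container port on the Pods
      nodePort: 30080   # must fall inside the 30000-32767 range
```

Writing the manifest directly with kubectl expose ... --dry-run=client -o yaml, adding nodePort, and applying it avoids the separate kubectl edit step.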

Key takeaway: ClusterIP = internal only, NodePort = port on every node, LoadBalancer = cloud LB, ExternalName = DNS alias. Always verify with kubectl get endpoints — empty endpoints means a selector mismatch.
Lesson 6.2 NetworkPolicy — micro-segmentation

Pods can talk to each other by default. NetworkPolicy is how you flip that — implement zero-trust networking by declaring which Pods can talk to which. The exam loves "allow only frontend → backend on port 8080" style asks.

Key concepts
  • Default behaviour: with no NetworkPolicy in the namespace, all Pods can reach all Pods. Apply a single Policy with podSelector: {} + policyTypes: [Ingress] and no rules — that denies all ingress to all Pods.
  • Targeting: spec.podSelector picks the Pods this policy applies to (in this namespace). {} = all Pods in the namespace.
  • Ingress rules: spec.ingress[].from + .ports. Each from entry can be a podSelector (same ns), a namespaceSelector (other ns), or an ipBlock (CIDR). A podSelector and namespaceSelector inside the same entry are ANDed; separate entries are ORed.
  • Egress rules: mirror — spec.egress[].to + .ports. Don't forget DNS — egress to kube-system's CoreDNS on TCP/UDP 53 or your Pods can't resolve names.
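  • policyTypes: list Egress whenever you define egress rules. If policyTypes is set to [Ingress] only, the egress block is ignored. If policyTypes is omitted, Ingress is always assumed and Egress is added only when the policy contains egress rules. Add Ingress alongside Egress only if you also want to restrict inbound traffic, because listing Ingress with no ingress rules denies all inbound traffic to the selected Pods.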
  • CNI must support it: NetworkPolicy enforcement requires a CNI that implements it (Calico, Cilium, Antrea). On a CNI without enforcement (flannel default), policies are silently ignored — verify by trying a blocked connection.
Concrete example

Task: in namespace app, allow only Pods labelled tier=frontend to reach Pods labelled tier=backend on TCP 8080. Deny everything else. Manifest: NetworkPolicy with podSelector: { matchLabels: { tier: backend } }, policyTypes: [Ingress], and one ingress rule allowing from: [{ podSelector: { matchLabels: { tier: frontend } } }] on ports: [{ protocol: TCP, port: 8080 }]. Verify: kubectl exec -it frontend-pod -- curl backend-svc:8080 succeeds; kubectl exec -it other-pod -- curl backend-svc:8080 times out.
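
Written out in full, the manifest described above could look like this. The policy name is a hypothetical choice; every other field comes from the task.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend   # hypothetical name
  namespace: app
spec:
  podSelector:                   # the Pods this policy protects
    matchLabels:
      tier: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:           # same-namespace Pods allowed in
            matchLabels:
              tier: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Because the policy selects the backend Pods and lists Ingress, any inbound traffic not matched by the single rule is dropped, provided the CNI enforces NetworkPolicy.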

Key takeaway: NetworkPolicy is implicit-deny — the moment any policy selects a Pod, only the explicit allow rules apply. Always remember egress to DNS or your Pods go silent. The CNI must support enforcement.
Lesson 6.3 Ingress — HTTP routing & TLS

An Ingress is an L7 routing resource that fronts multiple Services with hostname / path rules and TLS termination. Without an Ingress controller running (nginx, Traefik, HAProxy), Ingress objects are inert — the controller is what actually translates them into routing rules.

Key concepts
  • Ingress vs Service: a Service exposes Pods on a port; an Ingress routes HTTP/HTTPS by hostname + path to one of several Services. Single LoadBalancer in front of many Services.
  • IngressClass: picks which controller serves the Ingress. spec.ingressClassName: nginx. Without one, the IngressClass marked as the cluster default (if any) handles it.
  • Rules shape: spec.rules[].host: shop.example.com + http.paths[] with { path: /api, pathType: Prefix, backend: { service: { name: api-svc, port: { number: 80 } } } }.
  • pathType: Exact (literal match), Prefix (matches whole path segments split on /, so /api matches /api and /api/v1 but not /apiv1), ImplementationSpecific (controller's choice). Prefer Prefix by default.
  • TLS termination: spec.tls: [{ hosts: [shop.example.com], secretName: shop-tls }]. The Secret must be of type kubernetes.io/tls with keys tls.crt and tls.key.
  • Imperative create: kubectl create ingress shop --rule="shop.example.com/api*=api-svc:80" --rule="shop.example.com/=web-svc:80" creates multiple rules in one shot. A trailing * on the path yields pathType: Prefix; without it the rule is pathType: Exact.
Concrete example

Task: route shop.example.com/api/* to api-svc:80 and shop.example.com/* to web-svc:80. Scaffold: kubectl create ingress shop --rule="shop.example.com/api*=api-svc:80" --rule="shop.example.com/*=web-svc:80" --dry-run=client -o yaml > ingress.yaml. Apply. Test: with DNS or a host header, curl -H 'Host: shop.example.com' http://<ingress-ip>/api/healthz hits api-svc; curl -H 'Host: shop.example.com' http://<ingress-ip>/ hits web-svc.
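
The scaffolded ingress.yaml should come out close to the sketch below. The ingressClassName value is an assumption (the imperative command only sets it when you pass --class), so adjust or drop it to match the cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shop
spec:
  ingressClassName: nginx        # assumed controller class
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix     # from the trailing * in the --rule
            backend:
              service:
                name: api-svc
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc
                port:
                  number: 80
```

When several Prefix paths match a request, the longest matching path wins, so /api/healthz goes to api-svc even though / also matches.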

Key takeaway: Ingress is an L7 router. Path-based routing uses pathType: Prefix; TLS termination needs a Secret of type kubernetes.io/tls. Without an Ingress controller installed, the Ingress object does nothing.
07

State Persistence · 2 lessons

The lightest exam slice but a guaranteed appearance — emptyDir vs hostPath vs PVC, the PV/PVC binding dance, StorageClass for dynamic provisioning, and the StatefulSet basics that every database deployment leans on. Storage is where "the Pod started but writes go to nowhere" silently happens.

empty-dir host-path persistent-volume pvc storage-class access-modes statefulset
~3h
Lesson 7.1 Volumes — emptyDir, hostPath, PV & PVC

Volumes solve "the container's filesystem dies with the container." The CKAD exam covers three: emptyDir (scratch space tied to the Pod's life), hostPath (node-local, niche), and PersistentVolumeClaim (durable, dynamically provisioned). Knowing which to reach for is half the question.

Key concepts
  • emptyDir: created when a Pod is assigned to a node, deleted when the Pod is removed. Shared by all containers in the Pod. Use for scratch, cache, sidecar communication. medium: Memory backs it with tmpfs.
  • hostPath: mounts a node-local path into the Pod. Types: Directory, File, DirectoryOrCreate, FileOrCreate. Useful for accessing node-level files (Docker socket); avoid for application data because the Pod is tied to a single node.
  • PersistentVolume (PV): a cluster-scoped storage resource — either pre-provisioned (static) or created on demand (dynamic). Has a size, an access mode, and a StorageClass.
  • PersistentVolumeClaim (PVC): a namespace-scoped request for storage. The control plane binds the PVC to a matching PV — or dynamically creates one via the requested StorageClass.
  • Access modes: ReadWriteOnce (RWO) = one node read-write; ReadOnlyMany (ROX) = many nodes read-only; ReadWriteMany (RWX) = many nodes read-write (NFS, CephFS); ReadWriteOncePod (RWOP) = exactly one Pod (stable since K8s 1.29).
  • StorageClass & reclaimPolicy: StorageClass binds a provisioner (AWS EBS, GCE PD, CSI driver) + parameters. reclaimPolicy: Retain keeps the PV after the PVC is deleted; Delete removes both PV and backing storage.
Concrete example

Task: deploy a Pod with persistent storage that survives Pod restarts. Create PVC: apiVersion: v1, kind: PersistentVolumeClaim, metadata: { name: data }, spec: { accessModes: [ReadWriteOnce], resources: { requests: { storage: 1Gi } }, storageClassName: standard }. Mount it: in the Pod spec, volumes: [{ name: data, persistentVolumeClaim: { claimName: data } }], then volumeMounts: [{ name: data, mountPath: /var/data }]. Verify: write a file inside the Pod, delete the Pod, recreate it (same PVC reference), confirm the file is still there.
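
Expanded from the inline form above, the two objects might be written as follows. The Pod name writer, the busybox image, and the sleep command are hypothetical; the standard StorageClass is taken from the task and must exist in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard     # from the task; must exist in the cluster
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: writer                   # hypothetical Pod name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data             # must match the volume name below
          mountPath: /var/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data          # must match the PVC's metadata.name
```

A quick check is kubectl exec writer -- sh -c 'echo ok > /var/data/probe', then delete and recreate the Pod and read the file back.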

Key takeaway: emptyDir dies with the Pod, hostPath ties the Pod to a node, PVC is the only durable option. PVC stuck in Pending = no PV satisfies its size + access mode + StorageClass — check with kubectl describe pvc.
Lesson 7.2 StatefulSets — stable identity for stateful apps

Deployments give you N anonymous, interchangeable replicas. Databases and distributed systems need the opposite — stable names, ordered startup, individual persistent storage. That's StatefulSet. CKAD doesn't dig deep into operator-pattern internals, but you should be able to write the YAML and explain the three guarantees.

Key concepts
  • Stable Pod identity: Pods are named <sts-name>-0, <sts-name>-1, ... — ordinal indices that survive reschedules. mysql-0 stays mysql-0 even after the node it ran on dies.
  • Ordered startup & shutdown: Pods are created sequentially from 0 to N-1, each waiting for the previous to be Ready. Scale-down reverses — highest ordinal first. Critical for primary-replica setups.
  • volumeClaimTemplates: defines a PVC template — each Pod gets a unique PVC named <template-name>-<sts-name>-<ordinal>. The Pod re-attaches to the same PVC after reschedule, preserving data. PVCs are not auto-deleted when the StatefulSet is.
  • Headless Service: required for DNS. spec.clusterIP: None on a Service that selects the StatefulSet's Pods. Each Pod gets a DNS record: <pod-name>.<svc-name>.<ns>.svc.cluster.local.
  • Update strategies: RollingUpdate (default) — updates Pods in reverse-ordinal order; the partition field enables canary-style updates by only touching Pods with ordinal ≥ partition. OnDelete — manual control, you delete Pods to trigger their replacement.
  • Three guarantees: stable network identity, stable persistent storage per ordinal, ordered + graceful deployment / scaling. Memorise that list — it is the canonical "why StatefulSet not Deployment" answer.
Concrete example

Task: deploy a 3-replica StatefulSet for nginx with each Pod getting its own 1Gi PVC. Headless Service first: kind: Service, metadata: { name: nginx-svc }, spec: { clusterIP: None, selector: { app: nginx }, ports: [{ port: 80 }] }. StatefulSet: spec.serviceName: nginx-svc, replicas: 3, selector.matchLabels: { app: nginx }, template with the same labels, and volumeClaimTemplates: [{ metadata: { name: data }, spec: { accessModes: [ReadWriteOnce], resources: { requests: { storage: 1Gi } } } }]. Verify: kubectl get pvc shows data-nginx-0, data-nginx-1, data-nginx-2; nslookup nginx-0.nginx-svc from another Pod resolves.
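
Put together, the headless Service and StatefulSet could look like this sketch. The StatefulSet name nginx is inferred from the expected PVC names (data-nginx-0 and so on); the image tag and mount path are assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  clusterIP: None                # headless: per-Pod DNS records, no virtual IP
  selector:
    app: nginx
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  serviceName: nginx-svc         # must name the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25      # assumed tag
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html   # assumed mount path
  volumeClaimTemplates:
    - metadata:
        name: data               # yields PVCs data-nginx-0, data-nginx-1, data-nginx-2
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

There is no kubectl create statefulset, so a common shortcut is to scaffold a Deployment with --dry-run=client -o yaml, change kind to StatefulSet, remove the strategy field, and add serviceName and volumeClaimTemplates.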

Key takeaway: StatefulSet = stable identity + per-Pod storage + ordered ops. Always pair with a headless Service. PVCs persist after the StatefulSet is deleted — clean them up explicitly if you want them gone.
08

Helm & Application Deployment Patterns · 2 lessons

The last slice of Application Deployment (20%) — Helm chart basics, plus the advanced patterns the exam asks about conceptually: blue-green, canary, Kustomize overlays. Helm + Kustomize are both available in the exam environment via helm and kubectl apply -k — use whichever the question signals.

helm chart values.yaml kustomize blue-green canary apply-vs-create
~2h
Lesson 8.1 Helm basics — charts, releases, values, rollback

Helm is the de-facto package manager for Kubernetes — and a 2021+ addition to the CKAD curriculum. You should be able to install, upgrade, list, and roll back a release; override values; and inspect a chart's defaults.

Key concepts
  • Chart: a package of templated YAML manifests + a Chart.yaml metadata file + a values.yaml of defaults. Release: one named install of a chart with a specific set of values.
  • Repositories: helm repo add bitnami https://charts.bitnami.com/bitnami; helm repo update; helm search repo nginx. Public catalogue: Artifact Hub.
  • Install: helm install my-nginx bitnami/nginx -n web --create-namespace. Override defaults: --set service.type=NodePort or -f my-values.yaml (file is more readable for multiple overrides).
  • Upgrade: helm upgrade my-nginx bitnami/nginx --set image.tag=1.25. Add --install to install-if-missing in one command (idempotent CI step).
  • Inspect & rollback: helm list shows releases; helm history my-nginx lists revisions; helm rollback my-nginx 2 reverts to revision 2. Helm stores revisions as Secrets in the release's namespace.
  • Preview before applying: helm template ./chart --values my-values.yaml renders the manifests locally without installing. Add --dry-run --debug on install/upgrade to preview what would be applied.
Concrete example

Task: install the bitnami/nginx chart as release web in namespace shop, override the Service type to NodePort, then upgrade the image tag. Add repo: helm repo add bitnami https://charts.bitnami.com/bitnami && helm repo update. Install: helm install web bitnami/nginx -n shop --create-namespace --set service.type=NodePort. Upgrade: helm upgrade web bitnami/nginx -n shop --set image.tag=1.25 --reuse-values. Verify: helm history web -n shop shows two revisions; kubectl get svc -n shop shows NodePort type. Rollback if needed: helm rollback web 1 -n shop.
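
The same overrides can live in a values file instead of --set flags. This is a small sketch: the file name my-values.yaml is arbitrary, and the keys service.type and image.tag are the ones used in the commands above, so confirm them against helm show values bitnami/nginx before relying on them.

```yaml
# my-values.yaml: overrides layered on top of the chart's defaults
service:
  type: NodePort
image:
  tag: "1.25"      # quoted so YAML does not parse it as the number 1.25
```

Applied with helm upgrade --install web bitnami/nginx -n shop --create-namespace -f my-values.yaml, the file carries every override on each run, which sidesteps the need for --reuse-values.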

Key takeaway: Helm install/upgrade/rollback is the workflow. --reuse-values preserves your overrides across upgrades. helm template + --dry-run --debug are how you preview without risk.
Lesson 8.2 Advanced deployment patterns — blue-green, canary, Kustomize

Blue-green and canary are how production teams roll out new versions without taking downtime. The CKAD doesn't require you to implement them from scratch, but it does test the conceptual model and the YAML mechanics. Kustomize is the built-in alternative to Helm — overlays without templating.

Key concepts
  • Blue-green: run two complete environments (blue = current, green = new). Test green in isolation, then switch traffic by updating the Service's selector to point at green. Rollback = flip the selector back. Requires double the resources during transition.
  • Canary: run two Deployments with the same Service selector label but different replica counts (e.g. stable: 9, canary: 1 = 10% canary traffic). Increase the canary's replicas as confidence grows. No double-resource cost.
  • kubectl apply vs create: create is imperative — fails if the resource exists. apply -f is declarative — creates or updates via three-way merge (last-applied annotation, live state, new config). Use apply for production / GitOps; use create for one-shot exam tasks.
  • Kustomize basics: built into kubectl. kustomization.yaml with resources (base manifests), namePrefix, commonLabels, patches, configMapGenerator. Apply: kubectl apply -k ./. Preview: kubectl kustomize ./.
  • Overlay pattern: a base/ directory with shared manifests, plus overlays/dev/ and overlays/prod/ each with their own kustomization.yaml that references the base and applies environment-specific patches. Same manifests, different values, no templating.
  • Production hygiene: always set requests + limits, configure probes, define a podDisruptionBudget (PDB) to protect availability during voluntary node drains, set terminationGracePeriodSeconds for clean shutdowns, use preStop hooks to drain in-flight connections.
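
To make the overlay pattern concrete, here is a minimal sketch of a production overlay. The directory layout, the Deployment name web, and the replica count are all hypothetical; the base is assumed to contain a Deployment called web.

```yaml
# overlays/prod/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                 # shared manifests
namePrefix: prod-              # web becomes prod-web
patches:
  - target:
      kind: Deployment
      name: web                # matches the name in the base, before the prefix
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 5
```

kubectl kustomize overlays/prod prints the rendered result, and kubectl apply -k overlays/prod applies it.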
Concrete example

Task: do a canary deployment of nginx:1.25 alongside an existing nginx:1.24 Deployment, with 10% of traffic on the canary. Setup: existing Deployment web-stable with labels { app: web, track: stable } and 9 replicas. Canary: create a second Deployment web-canary with image nginx:1.25, labels { app: web, track: canary }, and 1 replica. Service: spec.selector: { app: web } — matches both Deployments, distributes traffic ~9:1. Validate: kubectl get endpoints web-svc shows 10 IPs (9 stable + 1 canary). Promote: scale web-canary up + web-stable down in steps, or do the full cutover.
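
The canary half of that setup might be written as below, with web-stable assumed to be identical apart from track: stable, nginx:1.24, and 9 replicas. The Service name web-svc comes from the validation step; the container name is an assumption.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary
spec:
  replicas: 1                    # 1 of 10 Pods, roughly 10% of traffic
  selector:
    matchLabels:
      app: web
      track: canary              # keeps this selector disjoint from web-stable's
  template:
    metadata:
      labels:
        app: web
        track: canary
    spec:
      containers:
        - name: nginx            # assumed container name
          image: nginx:1.25
---
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web                     # no track label, so both Deployments match
  ports:
    - port: 80
      targetPort: 80
```

Each Deployment's selector includes track so the two controllers never claim each other's Pods, while the Service selects only on app and therefore spans both. The split follows Pod counts, so it is approximate rather than an exact 10%.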

Key takeaway: blue-green = two complete environments + selector switch. Canary = shared selector + different replica counts. apply is declarative + idempotent; create is imperative + one-shot. kubectl apply -k activates Kustomize without installing anything.

The 2-hour battle plan

15-19 scenarios in 120 minutes, which is a hard ~6-8 minutes per question. Triage hard, set the namespace at the start of each question, scaffold with --dry-run=client -o yaml every time, verify before moving on.

  1. First 5 minutes: alias k=kubectl and export do='--dry-run=client -o yaml' (e.g. k run x --image=nginx $do). Then, at the start of every question, set the namespace it names: kubectl config set-context --current --namespace=<ns-from-the-question>.
  2. Triage: skim every question; flag the long / multi-step ones; do the 1-2 minute easy wins first. Don't leave any question blank — partial credit exists.
  3. Scaffold, don't write: every Pod, Deployment, Service, Job, ConfigMap, Secret, Ingress has a kubectl create or kubectl run imperative form + --dry-run=client -o yaml. Use it. Edit the YAML, then kubectl apply -f.
  4. Verify every answer: kubectl get, kubectl describe, kubectl exec -- env, kubectl get endpoints. The scoring script checks the cluster, not your YAML — if your manifest didn't actually create the object correctly, you score 0 even if it looks right.
  5. Use kubectl explain instead of the docs tab: it stays in your terminal, no context switch, no scrolling. kubectl explain pod.spec.containers.resources lists that field's children; add --recursive to dump the whole subtree.
  6. Don't fight a question: if you're 4 minutes in and stuck, mark it, skip it, move on. The killer.sh practice attempts (2 included with the exam fee) train you to feel that 4-minute mark.

Top 5 mistakes that fail CKAD candidates

  • Hand-writing YAML. Every minute on indentation is a minute you don't have. Scaffold imperatively, then edit.
  • Forgetting the namespace. The question specifies a namespace; you run kubectl in default. The scoring script doesn't find your resource. Always set-context --namespace first.
  • Selector / label mismatch. A Deployment whose spec.selector.matchLabels doesn't match the template's labels is rejected at apply time; a Service whose selector doesn't match any Pod labels is accepted but routes nowhere. kubectl get endpoints shows the empty list immediately.
  • Skipping verification. "Looks right" isn't right. kubectl get endpoints for Services; kubectl exec -- env for ConfigMaps; kubectl logs --previous for crashed Pods.
  • Not reading the Events tail. Many "this doesn't work" answers are in kubectl describe's last block. Read it before doing anything else.

After CKAD

CKAD slots into the CNCF Kubernetes triad. Natural next moves depend on which side of the wall you want to be on.

  • CKA — Certified Kubernetes Administrator: same exam style, operator-side scope. kubeadm install + upgrade, etcd backup + restore, RBAC, troubleshooting (30% of CKA, vs nothing on CKAD). The natural pivot for engineers moving from app-side to platform-side.
  • CKS — Certified Kubernetes Security Specialist: requires an active CKA. Security-focused: RBAC deep dive, Pod Security Standards, network policy enforcement, runtime security, supply chain (image signing, SBOM). Higher salary band, narrower audience.
  • Cloud-managed K8s certs: AWS EKS specialty deep-dives, GCP Professional Cloud Architect (K8s is a domain), Azure CKAD-equivalent baked into AZ-104. Useful if your shop is single-cloud.