The CKS is the most advanced Kubernetes certification from CNCF. It requires a valid CKA to attempt and tests your ability to secure Kubernetes clusters in depth. This course covers all six CKS exam domains: cluster setup, cluster hardening, system hardening (system-level Linux security controls), minimizing microservice vulnerabilities, securing the software supply chain, and monitoring, logging, and runtime security (threat detection with Falco and Kubernetes audit logging).
Advanced · 7 modules · ~40 hours · 60 practice questions · Requires CKA
The CKS is organized around layered security: cluster-level controls (NetworkPolicy, RBAC, API server flags) + node-level controls (AppArmor, seccomp, kernel modules) + workload-level controls (SecurityContext, Pod Security Admission) + runtime detection (Falco, audit logs). No single layer is sufficient — the exam tests whether you can apply all layers together.
CKA — cluster administration, must pass before attempting CKS
CKAD — application development and deployment on Kubernetes
CKS — security specialist, builds directly on CKA knowledge, hardest of the three
CKS is performance-based: you work in a live cluster via browser terminal, no multiple choice
You can use official Kubernetes docs (kubernetes.io) and some tool docs during the exam
Kubernetes Security Model Overview
Authentication — who are you? (certificates, tokens, OIDC)
Authorization — what can you do? (RBAC, Node authorization)
Admission Control — is this request valid? (PodSecurity, OPA Gatekeeper, webhooks)
Network Policies — what can this Pod communicate with?
SecurityContext — how is this container isolated from the host?
Runtime Security — what is happening at runtime? (Falco, audit logs)
The CKS exam is 2 hours for ~15–20 hands-on tasks. Speed matters. Practice all kubectl commands until they are muscle memory. The killer CKS tasks are Falco rules, audit policy, NetworkPolicy, and RBAC — expect at least one of each.
The CKS performance-based exam tests your ability to apply controls under time pressure, not just know them. Every concept in this course should be practiced hands-on in a cluster before exam day. Use killer.sh (included with your CKS voucher) for realistic mock exams.
Kubernetes Attack Surface & Threat Model
Common Attack Vectors in Kubernetes
Compromised container — exploit an app vulnerability, then escape to the host via privileged container, hostPath mount, or container runtime CVE
Stolen service account token — use the default SA token to call the Kubernetes API and enumerate/exfiltrate resources
SSRF to metadata endpoint — use server-side request forgery in an app to reach 169.254.169.254 and steal cloud credentials
Malicious container image — supply chain attack via compromised public image or CI pipeline
Overly permissive RBAC — SA with cluster-admin access lets an attacker control the whole cluster
The CKS exam doesn't present abstract questions — it gives you a scenario ("Pod X is running privileged, fix it with minimum change") and you apply the correct control. Always ask: what is the specific attack vector and what is the least-privilege control to address it?
02
Cluster Setup: Network Policies, TLS & API Hardening
3 lessons · ~6 hours
NetworkPolicy Deep Dive
NetworkPolicy Fundamentals
NetworkPolicy is namespaced — it only affects Pods in its own namespace
Default behavior: all ingress and egress is allowed if no NetworkPolicy selects the Pod
Once a NetworkPolicy selects a Pod, only explicitly allowed traffic is permitted
Requires a CNI plugin that enforces NetworkPolicy (Calico, Cilium, Weave) — kubectl apply of the policy has no effect with a non-enforcing CNI (e.g., Flannel)
Common NetworkPolicy Patterns
Default deny all ingress: spec: podSelector: {} policyTypes: [Ingress] — empty ingress rules = deny all
Restrict ingress to same namespace: use namespaceSelector: {matchLabels: {kubernetes.io/metadata.name: <ns>}}
Restrict egress to DNS + specific IPs: allow port 53 UDP/TCP + allow specific CIDRs
Block access to cloud metadata: NetworkPolicy has no deny action, so allow egress to ipBlock cidr: 0.0.0.0/0 with except: [169.254.169.254/32]
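Because NetworkPolicy rules can only allow traffic, the metadata block is usually expressed as "allow everything except this address". The following is a minimal sketch; the policy name and the `dev` namespace are hypothetical placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata-egress      # hypothetical name
  namespace: dev                  # hypothetical namespace
spec:
  podSelector: {}                 # empty selector = every Pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0       # allow all IPv4 destinations...
            except:
              - 169.254.169.254/32   # ...except the cloud metadata endpoint
```

Policies are additive, so any other policy that selects the same Pods and allows 169.254.169.254 would re-open the path.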
Example: Full Isolation Policy
policyTypes: [Ingress, Egress] with both sections empty → complete isolation
NetworkPolicies are additive: targeted allow policies can be layered on top of the isolation policy without conflicting with it
For DNS access, always add: egress to port 53 UDP/TCP or pods will lose name resolution
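The isolation-plus-DNS pattern can be sketched as two policies: one that denies everything, and one that adds back name resolution. The policy names and the `dev` namespace are assumptions for illustration.

```yaml
# Policy 1: select every Pod, declare both directions, allow nothing.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # hypothetical name
  namespace: dev                  # hypothetical namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Policy 2: additive exception so Pods can still resolve names.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress          # hypothetical name
  namespace: dev
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:                      # no "to" clause: port 53 to any destination
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

A stricter variant would add a `to` clause that limits port 53 to the cluster DNS Pods; the exact labels depend on the cluster, so they are left out here.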
On the CKS exam, NetworkPolicy tasks often involve fixing an existing policy that allows too much. Read the existing policy carefully before modifying — do not accidentally block DNS (port 53) or other required traffic. Test with kubectl exec <pod> -- curl <target> after applying.
After editing /etc/kubernetes/manifests/kube-apiserver.yaml, the kubelet automatically restarts the API server as a static Pod. Wait 30–60 seconds and verify with kubectl get pods -n kube-system. If the API server doesn't restart, check crictl ps -a and crictl logs <id> for errors in the manifest.
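kube-bench reports PASS/FAIL/WARN. CKS tasks often say "fix CIS check 1.2.X". Most apiserver fixes go in /etc/kubernetes/manifests/kube-apiserver.yaml. Most kubelet fixes go in /var/lib/kubelet/config.yaml (the kubelet configuration file), followed by systemctl restart kubelet. Note that /etc/kubernetes/kubelet.conf is the kubelet's kubeconfig (its API credentials), not its settings file.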
Ingress TLS & API Server Certificate Configuration
Restart API server: delete the static Pod or wait for kubelet to pick up the cert change
CKS exam TLS tasks often involve verifying a certificate's SAN with openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text -noout | grep -A1 "Subject Alternative". Know how to read openssl output.
03
Cluster Hardening: RBAC, Service Accounts & Upgrades
3 lessons · ~6 hours
RBAC Minimum Privilege
Role vs ClusterRole, RoleBinding vs ClusterRoleBinding
Role — namespaced; grants permissions on resources in one namespace
ClusterRole — cluster-scoped; can be bound namespace-specifically via RoleBinding or globally via ClusterRoleBinding
RoleBinding — grants a Role or ClusterRole within a single namespace
ClusterRoleBinding — grants a ClusterRole across all namespaces
Least Privilege Pattern
Always use the most namespace-restricted binding possible
Only grant the minimum verbs needed: avoid * verbs or * resources
Dangerous verbs: escalate, bind, impersonate, create on ClusterRoleBindings
The cluster-admin ClusterRole has * on * — should never be used for service accounts
Useful RBAC Commands
Check what a SA can do: kubectl auth can-i list pods --as=system:serviceaccount:<ns>:<sa>
See all permissions: kubectl auth can-i --list --as=system:serviceaccount:<ns>:<sa>
Create a Role quickly: kubectl create role pod-reader --verb=get,list --resource=pods -n dev
Bind it: kubectl create rolebinding pod-reader-binding --role=pod-reader --serviceaccount=dev:myapp -n dev
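The two kubectl create commands above produce objects roughly equivalent to the manifests below. This sketch reuses the names from those commands (pod-reader, pod-reader-binding, the myapp ServiceAccount in dev) and is useful when a task asks you to edit an existing Role rather than create one.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]               # "" = the core API group, where Pods live
    resources: ["pods"]
    verbs: ["get", "list"]        # read-only; no watch, create, or delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: dev
subjects:
  - kind: ServiceAccount
    name: myapp
    namespace: dev
roleRef:                          # roleRef is immutable; recreate the binding to change it
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```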
CKS RBAC tasks often involve auditing an existing role for excessive permissions and restricting it. Use kubectl auth can-i --list to see what permissions an identity currently has before and after your fix.
Service Account Security
Disable Automatic Token Mounting
Default: every Pod gets a projected SA token at /var/run/secrets/kubernetes.io/serviceaccount/token
For Pods that don't need API access, disable: automountServiceAccountToken: false in the Pod spec
Or disable on the ServiceAccount object to apply to all Pods using it
Pod-level setting overrides SA-level setting — use Pod-level for surgical targeting
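Both places where the setting can live are shown in the sketch below. The ServiceAccount name, namespace, and image are hypothetical; in practice you would normally set only one of the two.

```yaml
# Option A: disable on the ServiceAccount (affects every Pod that uses it).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp                     # hypothetical name
  namespace: dev                  # hypothetical namespace
automountServiceAccountToken: false
---
# Option B: disable on one Pod (takes precedence over the ServiceAccount value).
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  namespace: dev
spec:
  serviceAccountName: myapp
  automountServiceAccountToken: false
  containers:
    - name: app
      image: registry.example.com/myapp:v1.0   # hypothetical image
```

After applying, `kubectl exec` into the Pod and confirm that /var/run/secrets/kubernetes.io/serviceaccount no longer exists.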
TokenRequest API & Projected Tokens
Legacy SA tokens (stored in Secrets): never expire, are not audience-bound, and remain valid until the Secret or ServiceAccount is deleted
Projected tokens (TokenRequest API): bounded expiry (expirationSeconds), specific audience, tied to Pod lifetime
Default projected token expiry: 1 hour (kubelet refreshes before expiry)
NodeRestriction admission plugin: limits each kubelet to (1) modifying only its own Node object and (2) modifying only Pods bound to its node
Prevents a compromised node from modifying other nodes or other nodes' pods
Does NOT limit what a container can do — that's SecurityContext/AppArmor/seccomp
The most impactful SA hardening is disabling automount for Pods that don't need API access. In a real cluster, the vast majority of application Pods don't call the Kubernetes API — they just process business logic. Disabling the token eliminates the risk of SA token theft via container exploit.
Kubernetes Upgrade Security
Why Upgrades Are a Security Practice
Kubernetes ships CVE fixes in patch releases of the supported minor versions, so staying current is a security requirement
CVE examples patched by upgrades: kubelet privilege escalation, etcd information disclosure, API server SSRF
CKS exam expects you to know the correct upgrade order: control plane → worker nodes
Upgrade Process Review (kubeadm)
Check available versions: apt-cache madison kubeadm
One minor version at a time — never skip a version (1.28 → 1.29, not 1.28 → 1.30). Worker node versions must be ≤ control plane version. The exam may present a cluster that needs a specific patch-level upgrade for a security fix.
04
System Hardening: AppArmor, Seccomp & Linux Capabilities
3 lessons · ~6 hours
AppArmor for Containers
AppArmor Concepts
AppArmor (Application Armor) — Linux security module that restricts what a process can do using profiles
AppArmor profiles must be loaded on EVERY node where the Pod might schedule. If a profile is missing on a node, the Pod will fail to start with "failed to create containerd container: apply apparmor profile". Use a DaemonSet or node provisioning tool to distribute profiles in production.
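Once a profile is loaded on the node, a Pod references it per container. The sketch below uses the annotation form; the profile name k8s-deny-write and the container name app are hypothetical, and the profile is assumed to be already loaded on the node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo             # hypothetical name
  annotations:
    # Key suffix must match the container name; value is localhost/<profile name>.
    container.apparmor.security.beta.kubernetes.io/app: localhost/k8s-deny-write
spec:
  containers:
    - name: app
      image: nginx:1.25
      # On Kubernetes 1.30+, the same thing can be expressed as a field:
      # securityContext:
      #   appArmorProfile:
      #     type: Localhost
      #     localhostProfile: k8s-deny-write
```

Check which form the exam cluster's version expects before choosing; the annotation is the older mechanism and the field is its replacement.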
Seccomp Profiles
Seccomp Concepts
Seccomp (Secure Computing Mode) — Linux kernel feature that filters which syscalls a process can make
Restricts the attack surface: even if an attacker has code execution, blocked syscalls can't be used
RuntimeDefault — the container runtime's default seccomp profile (recommended for most workloads)
Localhost — a custom profile file on the node at the seccomp profile path
Unconfined: no seccomp filtering (what a Pod gets when no profile is set, unless the kubelet is started with --seccomp-default)
Configuring Seccomp in Kubernetes
Pod level: spec.securityContext.seccompProfile.type: RuntimeDefault
CIS 5.7.2: Ensure that the seccomp profile is set to docker/default or runtime/default
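Even in Kubernetes 1.27+, a Pod with no profile runs Unconfined unless the kubelet enables seccompDefault (--seccomp-default), in which case it gets RuntimeDefault. Always configure the profile explicitly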
Setting RuntimeDefault at the Pod level applies to all containers in the Pod
The CKS exam often presents a Pod with no seccomp profile set and asks you to add the RuntimeDefault. Know the exact YAML path: spec.securityContext.seccompProfile.type: RuntimeDefault — this is a common one-line fix task.
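As a reference for that one-line fix, here is a minimal Pod sketch with the field in place. The Pod name is a placeholder.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo              # hypothetical name
spec:
  securityContext:                # Pod level: inherited by every container
    seccompProfile:
      type: RuntimeDefault
      # For a custom profile, use instead:
      #   type: Localhost
      #   localhostProfile: <path relative to the kubelet seccomp directory>
  containers:
    - name: app
      image: nginx:1.25
```

A container-level securityContext.seccompProfile, if present, overrides the Pod-level value for that container.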
Linux Capabilities & Node OS Hardening
Linux Capabilities Minimum Set
Default container capabilities: a subset of the full Linux capability set (about 14 capabilities)
capabilities.drop: ["ALL"] — remove every capability from the container
capabilities.add: ["NET_BIND_SERVICE"] — add back only what is needed
Common capabilities to be aware of: NET_ADMIN (network config), SYS_PTRACE (process tracing), SYS_ADMIN (broad host access — almost as dangerous as privileged)
CIS Benchmark: do not add SYS_ADMIN, NET_ADMIN, or SYS_PTRACE unless absolutely required
The CKS exam often has a combined SecurityContext task: add seccomp RuntimeDefault + drop ALL capabilities + set allowPrivilegeEscalation=false + runAsNonRoot=true all at once. Know how to write these in a single clean securityContext block without looking up the syntax.
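The combined task can be sketched as follows. Note which fields live at Pod level and which are container-only; the Pod name, image, and UID are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened                  # hypothetical name
spec:
  securityContext:                # Pod-level settings
    runAsNonRoot: true
    runAsUser: 10001              # hypothetical non-root UID
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/myapp:v1.0   # hypothetical image
      securityContext:            # these two exist only at container level
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
          add: ["NET_BIND_SERVICE"]   # only if the app binds a port below 1024
```

If the image runs as root and no runAsUser is set, runAsNonRoot: true causes the container to be refused at start, which is the intended failure mode.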
enforce rejects the API request — Pod is never created. warn allows the Pod but returns a warning header that CLIs display (useful for operator awareness). audit logs the violation to the audit log without user-visible feedback. Use all three simultaneously during migration: audit: restricted + warn: restricted + enforce: baseline.
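Pod Security Admission modes are set with namespace labels. The migration combination described above would look roughly like this, assuming a hypothetical namespace named dev.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev                       # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline    # reject Pods that violate baseline
    pod-security.kubernetes.io/audit: restricted    # record restricted violations in the audit log
    pod-security.kubernetes.io/warn: restricted     # show restricted violations to the client
```

On an existing namespace the same result comes from kubectl label namespace with these three keys.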
Secrets Security: At Rest, In Transit & Access Patterns
Why Default Secrets Are Not Secure
Default: Secrets are stored unencrypted in etcd (base64 is an encoding, not encryption), so anyone with etcd access can read them
Environment variable exposure: values are readable from /proc/<pid>/environ and are inherited by every child process, which makes accidental leaks into logs and crash dumps more likely
Fix 1: volume-mount Secrets instead of env vars
Fix 2: enable encryption at rest with EncryptionConfiguration
Fix 3: use an external secrets manager (Vault Agent Injector, AWS Secrets Manager CSI)
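Fix 1 can be sketched as below: the Secret is mounted as read-only files instead of being injected through env. The Secret name db-creds, the mount path, and the image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-volume-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/myapp:v1.0   # hypothetical image
      volumeMounts:
        - name: db-creds
          mountPath: /etc/db-creds             # each Secret key becomes a file here
          readOnly: true
  volumes:
    - name: db-creds
      secret:
        secretName: db-creds      # assumed to exist in the same namespace
        defaultMode: 0400         # owner read-only
```

The application then reads, for example, /etc/db-creds/password rather than an environment variable.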
Enabling Encryption at Rest
Create /etc/kubernetes/enc/encryption.yaml with provider aescbc or secretbox
Add flag to kube-apiserver: --encryption-provider-config=/etc/kubernetes/enc/encryption.yaml
Re-encrypt existing Secrets: kubectl get secrets -A -o json | kubectl replace -f -
Verify: check etcd directly — the stored value should start with k8s:enc:aescbc:v1:
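A minimal sketch of the encryption.yaml referenced in the steps above, using the aescbc provider. The key name key1 is arbitrary, and the key material is deliberately left as a placeholder.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # The first provider is used for writes; all listed providers are tried for reads.
      - aescbc:
          keys:
            - name: key1          # arbitrary label
              secret: <base64-encoded 32-byte key>   # e.g. head -c 32 /dev/urandom | base64
      # identity = plaintext; kept last so Secrets written before encryption stay readable.
      - identity: {}
```

The directory holding this file also has to be mounted into the kube-apiserver static Pod (hostPath volume plus volumeMount), otherwise the flag points at a path the container cannot see.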
RuntimeClass for Strong Isolation
gVisor (runsc): replaces the Linux kernel syscall interface with a Go-implemented sandbox
Kata Containers: runs each Pod in a lightweight VM with its own kernel
Create: kubectl create -f runtimeclass-gvisor.yaml with handler: runsc
Use: spec.runtimeClassName: gvisor in Pod spec
Verify: kubectl exec <pod> -- dmesg — gVisor shows different kernel info than host
For the CKS exam, know both the encryption-at-rest setup (EncryptionConfiguration YAML + apiserver flag) and the volume-mount-vs-env pattern. These are direct application tasks — no googling, just write the YAML.
Container Isolation: gVisor, Kata & Pod Security
Container Runtime Isolation Layers
Default runtime (containerd + runc): uses Linux namespaces and cgroups — shares the host kernel
gVisor (runsc): intercepts syscalls and handles them in userspace — kernel not directly exposed
Kata Containers: each Pod gets a micro-VM with a separate kernel — strongest isolation
RuntimeClass bridges the K8s scheduler to the correct low-level runtime handler on the node
Practical RuntimeClass Task
kubectl apply -f - <<'EOF' ... EOF to create the RuntimeClass quickly
Test: create a Pod with runtimeClassName: gvisor, exec in and run uname -r — shows gVisor kernel string
Admission webhook (OPA or Kyverno mutation) can auto-inject runtimeClassName for a namespace
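The RuntimeClass task amounts to two small objects, sketched below. The handler runsc is assumed to be already configured in the node's container runtime; the Pod name and image are placeholders.

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor                    # the name Pods will reference
handler: runsc                    # must match the handler configured in containerd
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed                 # hypothetical name
spec:
  runtimeClassName: gvisor
  containers:
    - name: app
      image: nginx:1.25
```

RuntimeClass is cluster-scoped, so it takes no namespace, and handler is a top-level field rather than part of a spec.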
On the CKS exam, you won't be asked to install gVisor — just to create the RuntimeClass and reference it. The runtime handler is already configured on the exam nodes. Your job is the Kubernetes configuration layer only.
06
Supply Chain Security: Scanning, Signing & Admission Control
3 lessons · ~6 hours
Image Scanning with Trivy
Trivy Core Commands
Scan a local image: trivy image nginx:1.25
Filter by severity: trivy image --severity HIGH,CRITICAL nginx:1.25
Fail on findings: trivy image --severity CRITICAL --exit-code 1 nginx:1.25
Scan cluster: trivy k8s --severity HIGH,CRITICAL --report all cluster
Best practice: scan before push (fail the build → image never enters registry)
Second layer: admission webhook to scan during pod admission
Third layer: periodic cluster scan with trivy k8s or the Trivy Operator (the successor to the Starboard operator)
For CKS, know the Trivy command syntax cold — especially --exit-code 1 and --severity. You may be asked to scan a specific image and identify which CVE to report, or to run Trivy and interpret the output.
Image Signing with Cosign & Secure Images
Cosign Workflow
Generate a key pair: cosign generate-key-pair → produces cosign.key and cosign.pub
Sign an image: cosign sign --key cosign.key registry.example.com/myapp:v1.0
Signatures are stored as OCI artifacts in the registry alongside the image
Distroless and Minimal Images
Use FROM scratch for fully static Go/Rust binaries: the image has no OS userland at all, so the attack surface is reduced to your binary
Use gcr.io/distroless/static or distroless/base for apps needing minimal OS support
No shell → attacker can't run bash -i even with code execution
No package manager → can't install tools post-compromise
Multi-Stage Dockerfile Best Practice
Stage 1 (builder): FROM golang:1.22 AS builder — compile the binary
Stage 2 (final): FROM gcr.io/distroless/static — copy only the binary
Final image contains: binary + minimal OS libraries — nothing else
Sigstore's Cosign is becoming the industry standard for image signing. In the CKS exam context, know how to run cosign verify --key cosign.pub registry.example.com/myapp:v1.0 and understand which key each step needs: signing uses the private key (cosign.key); verification uses the public key (cosign.pub).
Admission Control for Supply Chain
ImagePolicyWebhook
A built-in Kubernetes admission controller that asks an external webhook to approve or reject images
Enable with: --enable-admission-plugins=ImagePolicyWebhook plus --admission-control-config-file=<path> on the API server
defaultAllow: false → fail-closed (safe default — deny when webhook is down)
defaultAllow: true → fail-open (unsafe — allows all images if webhook unreachable)
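The defaultAllow setting lives in the admission configuration file that the API server loads through --admission-control-config-file. A fail-closed sketch follows; the kubeconfig path is hypothetical and the TTL and backoff values are illustrative only.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: ImagePolicyWebhook
    configuration:
      imagePolicy:
        kubeConfigFile: /etc/kubernetes/admission/kubeconf   # hypothetical path; points at the webhook backend
        allowTTL: 50              # seconds to cache an "allow" answer
        denyTTL: 50               # seconds to cache a "deny" answer
        retryBackoff: 500         # milliseconds between retries
        defaultAllow: false       # fail-closed: reject Pods if the webhook is unreachable
```

As with other API server config files, the directory must be mounted into the kube-apiserver static Pod for the path to resolve.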
OPA Gatekeeper for Registry Policy
Create a ConstraintTemplate with Rego that checks image registry prefix
Create a Constraint object applying it to Pod kinds across all or specific namespaces
Test with: kubectl apply -f test-pod.yaml — should be rejected with your Gatekeeper message
Validating vs Mutating Webhooks
Validating: can reject a request (no modification)
Mutating: can modify the request (inject sidecars, add labels, set runtimeClassName)
Execution order: mutating webhooks run first, then validating webhooks
CKS supply chain tasks often ask you to create or modify an OPA ConstraintTemplate Rego policy. Know the Gatekeeper violation pattern: violation[{"msg": msg}] { ... } and the input path: input.review.object.spec.containers[_].image.
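A registry-prefix policy built on that violation pattern might look like the sketch below. The template and constraint names, the repos parameter, and the registry prefix are all assumptions for illustration, and the Rego is written in the classic syntax that Gatekeeper templates have traditionally used.

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos           # must be the lowercase form of the kind below
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:                # hypothetical parameter: list of allowed prefixes
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        # One violation per container whose image matches no allowed prefix.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not image_allowed(container.image)
          msg := sprintf("container <%v> uses image <%v> from a registry that is not allowed", [container.name, container.image])
        }

        # True if the image starts with at least one allowed prefix.
        image_allowed(image) {
          startswith(image, input.parameters.repos[_])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-repos             # hypothetical name
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "registry.example.com/"   # hypothetical registry; trailing slash avoids prefix spoofing
```

This sketch checks only spec.containers; covering initContainers would need a second violation rule over input.review.object.spec.initContainers.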
fd.name — file descriptor full path (e.g., /etc/shadow)
proc.name — process name (e.g., sh, bash)
user.name — user running the process
container.name — container name
k8s.pod.name — Kubernetes Pod name
Common CKS Falco Rules
Shell in container: spawned_process and container and proc.name in (sh, bash, zsh, dash)
Read sensitive file: open_read and container and fd.name in (/etc/shadow, /etc/passwd)
Read env variables: open_read and container and fd.name=/proc/1/environ
Write to /etc: open_write and container and fd.directory=/etc
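Wrapped into full rule definitions, the first two conditions above could be written as follows. This is a sketch of what might go into a local rules file; the rule names, descriptions, and output strings are illustrative, and the macros spawned_process, container, and open_read are assumed to come from Falco's default ruleset.

```yaml
- rule: Shell spawned in container          # hypothetical rule name
  desc: A shell process was started inside a container
  condition: spawned_process and container and proc.name in (sh, bash, zsh, dash)
  output: "Shell in container (user=%user.name container=%container.name pod=%k8s.pod.name proc=%proc.name cmdline=%proc.cmdline)"
  priority: WARNING
  tags: [container, shell]

- rule: Sensitive file read in container    # hypothetical rule name
  desc: A process inside a container opened /etc/shadow or /etc/passwd for reading
  condition: open_read and container and fd.name in (/etc/shadow, /etc/passwd)
  output: "Sensitive file read (file=%fd.name user=%user.name container=%container.name pod=%k8s.pod.name)"
  priority: WARNING
  tags: [container, filesystem]
```

Every rule needs all four of desc, condition, output, and priority; leaving one out is a common reason a rule fails to load.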
Falco uses its own condition language — not Prometheus PromQL, not OPA Rego. Know the specific Falco macros and field names. The most common exam mistake is using syscall.type or file.path (don't exist) instead of evt.type and fd.name.
For the CKS exam, you'll likely need to write or edit a Falco custom rule and verify it fires. Practice: edit /etc/falco/falco_rules.local.yaml, restart Falco (systemctl restart falco), trigger the condition, check journalctl -u falco -n 20. Aim to complete the entire workflow in under 5 minutes.
Kubernetes Audit Logging
Audit Policy Levels
None — do not log at all
Metadata — log request metadata only (user, timestamp, resource, verb) — no body
Request — log metadata + request body (captures write payloads)
RequestResponse — log metadata + request + response body (most verbose, captures secret values in reads)
Audit Policy Rule Ordering
Rules are evaluated top-to-bottom, first match wins
Put suppressions (level: None) BEFORE broad rules or they will never fire
Put the most specific rules (e.g., Secrets at RequestResponse) before broad fallback rules
Mount the audit policy file into the static Pod as a hostPath volume
CKS audit tasks typically give you a policy requirement (e.g., "log all secrets read with RequestResponse level, suppress configmap reads") and ask you to write the audit policy YAML, configure it on the apiserver, and verify the log is written. Watch the rule order — this is the most common mistake.
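The example requirement in the paragraph above (suppress ConfigMap reads, log Secrets at RequestResponse) could be met with a policy like this sketch. The rule order is the point: suppression first, specific rule second, broad fallback last.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived               # skip the earliest stage to cut duplicate entries
rules:
  # 1. Suppression first, otherwise a later broad rule would never be reached for these.
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["configmaps"]
  # 2. Most specific logging rule next.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["secrets"]
  # 3. Broad fallback last: metadata only for everything else.
  - level: Metadata
```

The API server picks the policy up through --audit-policy-file and writes to the file named by --audit-log-path; both the policy file and the log directory need hostPath mounts in the static Pod manifest.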
Immutable Infrastructure & Incident Response
Immutable Containers in Practice
readOnlyRootFilesystem: true — the root filesystem is read-only
Apps needing writes: use emptyDir volumes for /tmp, log directories, cache
Security advantage: any filesystem change = unambiguous attacker activity
Verify: kubectl exec <pod> -- touch /test.txt should fail with "Read-only file system"
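A read-only root filesystem with a writable scratch directory can be sketched as below. The Pod name and image are hypothetical, and the set of paths an application needs to write to varies by image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: immutable                 # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/myapp:v1.0   # hypothetical image
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp         # the only writable path in this sketch
  volumes:
    - name: tmp
      emptyDir: {}                # scratch space, deleted together with the Pod
```

If the container crashes on start after this change, its logs usually name the path it failed to write; add another emptyDir mount for that path rather than removing the read-only setting.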
Container Forensics on a Running Cluster
Never run commands inside a compromised container — you're executing attacker-controlled code
Use crictl on the node to inspect container state without entering it
The CKS is the only CNCF certification that directly tests your response to a simulated attack scenario. You may be given a cluster with a "compromised" pod and asked to identify indicators using audit logs and Falco, then apply the appropriate containment controls. Practice your forensic instincts, not just the configuration tasks.
Know how to query audit logs quickly: cat /var/log/kubernetes/audit.log | jq 'select(.verb=="delete" and .objectRef.resource=="secrets") | {user: .user.username, name: .objectRef.name, time: .requestReceivedTimestamp}'. Parsing JSON audit logs with jq is a core CKS exam skill.
Key Concept: Falco vs Audit Logs
Falco watches kernel syscalls in real time — it fires the moment a container opens a sensitive file or spawns a shell. Kubernetes Audit Logs capture Kubernetes API operations — who called the API, what resource was accessed, what was the response. Use Falco for runtime container behavior; use audit logs for Kubernetes control plane activity. The CKS tests both independently — they complement each other.
6-Week CKS Study Plan
Week 1
Prerequisites & Setup: Confirm your CKA is valid. Set up a Kubernetes cluster (kubeadm on VMs or kind + Calico). Review NetworkPolicy and RBAC fundamentals. Install Falco, kube-bench, and Trivy on your lab cluster.
Week 2
Cluster Setup & Hardening: Run kube-bench against your cluster and fix all FAIL items. Practice writing NetworkPolicy YAML for namespace isolation and metadata endpoint blocking. Configure the API server with all security flags. Practice RBAC tasks with kubectl auth can-i.
Week 3
System Hardening: Load AppArmor profiles, reference them in Pods. Configure seccomp RuntimeDefault and custom profiles. Write Pod specs with full SecurityContext hardening (drop ALL caps, readOnlyRootFilesystem, runAsNonRoot, allowPrivilegeEscalation=false). Create RuntimeClasses for gVisor scenarios.
Week 4
Microservice Vulnerabilities & Secrets: Apply Pod Security Admission to namespaces. Write OPA Gatekeeper ConstraintTemplates. Configure Secrets encryption at rest. Practice the Vault Agent Injector pattern. Apply all three layers (PSA + Gatekeeper + SecurityContext) together.
Week 5
Supply Chain Security: Scan images with Trivy in CI simulation. Sign images with Cosign and verify. Build multi-stage distroless Dockerfiles. Configure ImagePolicyWebhook (fail-closed). Write OPA policies for registry enforcement. Run trivy k8s cluster and fix findings.
Week 6
Runtime Security & Mock Exams: Write custom Falco rules and verify them. Configure Kubernetes audit logging with a tiered policy. Practice parsing audit logs with jq. Use killer.sh for 2 full mock exam sessions. Focus on speed — the real exam is 2 hours for 15–20 hands-on tasks.
Top 4 CKS Exam Mistakes
Forgetting DNS when writing NetworkPolicy: If you add an Egress deny-all policy without allowing port 53 UDP/TCP, DNS stops working and the Pod appears broken. Always add a DNS exception.
AppArmor profile not loaded on the right node: The profile must be present on every node where the Pod can schedule. If the profile isn't loaded, the Pod fails to start with a cryptic containerd error.
Audit policy rule order: Rules are first-match. A broad level: Metadata rule before your targeted level: None suppressions will catch everything. Always put specific suppressions first.
Using Falco field names that don't exist: file.path, syscall.type, and filename are not valid Falco fields. Use fd.name, evt.type, and proc.name. Always test rules by restarting Falco and checking journalctl.
CKS vs CKA — What's Different?
CKA — Administration
Cluster installation (kubeadm)
etcd backup & restore
Node maintenance and upgrades
Workload management (Deployments, rolling updates)