Interview Prep · Senior · Published June 2026

Top 10 advanced CKA interview questions for senior platform & staff SRE loops in 2026

Published June 2, 2026 · ~8 min read · No CNCF, Linux Foundation, or training-vendor revenue

Target level: Senior
K8s production experience: 5+ yrs
Senior platform (US): $180–240k
Staff SRE (US base): $220–320k

TL;DR — the 30-second version

The base CKA interview proves you can run a cluster. The senior interview proves you can run twenty. These ten questions come up in 2026 staff SRE, senior platform, and Kubernetes architect loops, the level above the standard CKA interview prep. They test design judgment, scale economics, and the operational debugging that only shows up at the 500-node tier.

If you’re prepping for the base CKA exam, start with our standard CKA interview questions and CKA ROI breakdown first.

The 10 questions

1. Cluster Autoscaler vs Karpenter — when do you pick which?

Cluster Autoscaler scales pre-defined node groups (ASGs on AWS, MIGs on GCP, VMSS on Azure) up and down based on Pending pods. Predictable, well-understood, but constrained to instance shapes you decided in advance and slower to react (90 s+ to a new node).

Karpenter provisions nodes directly via the cloud fleet API, picking the cheapest instance type that fits the pending pod shape. On EKS in 2026 it usually wins on cost (20–40% bin-packing improvement) and node startup time (30–60 s). Trade-off: more moving parts, and the abstraction can hide bad pod requests. A workload requesting 64 vCPU will silently get a node larger than a 64-vCPU m7i.16xlarge, because system reservations leave allocatable CPU below the nominal vCPU count.

Pick CA on GKE/AKS or when compliance pins you to specific instance families. Pick Karpenter on EKS when cost matters and your platform team can own one more controller.
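
The "cheapest instance that fits the pending pod shape" idea can be sketched in a few lines of Python. This is a toy model, not Karpenter's actual algorithm: the catalog, the instance names, the allocatable figures, and the prices are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    allocatable_vcpu: float  # what pods can use after system reservations
    allocatable_gib: float
    hourly_usd: float        # placeholder price, not a real quote


# Hypothetical catalog: every value below is an assumption for the example.
CATALOG = [
    InstanceType("small-4", 3.9, 14.0, 0.20),
    InstanceType("medium-16", 15.8, 60.0, 0.80),
    InstanceType("large-64", 63.7, 250.0, 3.20),
    InstanceType("xlarge-96", 95.6, 375.0, 4.80),
]


def cheapest_fit(vcpu: float, gib: float, catalog=CATALOG) -> Optional[InstanceType]:
    """Return the cheapest instance whose allocatable resources cover the request."""
    candidates = [
        i for i in catalog
        if i.allocatable_vcpu >= vcpu and i.allocatable_gib >= gib
    ]
    return min(candidates, key=lambda i: i.hourly_usd, default=None)
```

In this model a 2 vCPU pod lands on the smallest shape, while a single pod requesting 64 vCPU skips the nominal 64-vCPU shape and lands on the 96-vCPU one. That is the failure mode worth reviewing pod requests for.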

2. How does API priority and fairness keep the API server alive at scale?

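APF has been on by default since v1.20 (beta) and graduated to GA in v1.29. It refines the legacy max-inflight throttle rather than discarding it: --max-requests-inflight and --max-mutating-requests-inflight still set the total concurrency that APF divides up. Requests are classified by a FlowSchema (matched on user, group, or service account plus verb, resource, and namespace) into a PriorityLevelConfiguration with a share of that total.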

When a misbehaving controller starts spamming list pods --all-namespaces, its requests are queued or rejected with HTTP 429 inside its own flow, while kubelet heartbeats and the scheduler keep flowing. The signals: apiserver_flowcontrol_rejected_requests_total, the X-Kubernetes-PF-FlowSchema-UID response header, and rising apiserver_request_duration_seconds on system flows. At >500 nodes, expect to tune APF so the leader-election and node-high priority levels, and the workload-high level that serves kube-system service accounts, get more headroom.
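
A "share of total concurrency" is easier to reason about with numbers. The sketch below applies the proportional split described in the APF documentation (each level gets the server limit times its shares over the sum of shares, rounded up). The 600 total assumes the default 400 + 200 max-inflight flags, and the share values are illustrative assumptions modelled on the built-in levels; check your own cluster with kubectl get prioritylevelconfigurations.

```python
def nominal_concurrency(server_limit: int, shares: dict) -> dict:
    """Split the server-wide concurrency limit across priority levels.

    Each level gets ceil(server_limit * its_shares / total_shares).
    """
    total = sum(shares.values())
    # -(-a // b) is integer ceiling division.
    return {name: -(-server_limit * s // total) for name, s in shares.items()}


# Assumed share values for the example, not read from a live cluster.
EXAMPLE_SHARES = {
    "system": 30,
    "leader-election": 10,
    "node-high": 40,
    "workload-high": 40,
    "workload-low": 100,
    "global-default": 20,
    "catch-all": 5,
}

limits = nominal_concurrency(600, EXAMPLE_SHARES)
```

Under these assumptions leader-election gets 25 concurrent seats out of 600. Raising its shares, or lowering workload-low, is the tuning lever the answer refers to.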

3. Are namespaces enough for multi-tenant isolation?

No, and confidently saying so is the answer interviewers want. Namespaces are a naming scope, not a security boundary. Hard multi-tenancy needs:

Compute isolation: separate node pools per tenant via taints + tolerations + node selectors. For untrusted code, layer a sandboxed RuntimeClass (gVisor, Kata Containers).
Network isolation: default-deny NetworkPolicies plus a CNI that enforces them (Cilium, Calico). Add a service mesh for mTLS between namespaces.
Resource isolation: ResourceQuotas + LimitRanges per namespace, plus PriorityClasses so a noisy tenant can’t starve a control-plane workload.
Identity isolation: separate ServiceAccounts, no cluster-admin grants, OPA Gatekeeper or Kyverno to enforce policy.
Control-plane isolation: vcluster or Capsule when each tenant needs their own apparent control plane.

For genuinely hostile workloads — SaaS running customer code — the only honest answer is one cluster per tenant.
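
One way to check the network-isolation item is to list namespaces that have no default-deny ingress policy. The Python sketch below assumes namespaces and NetworkPolicies have already been loaded as plain dicts (for example from kubectl get networkpolicy -A -o json). The function name and sample data are hypothetical, and it only recognises the simplest default-deny form: an empty podSelector with no ingress rules.

```python
def namespaces_without_default_deny(namespaces, policies):
    """Return namespaces that lack a default-deny ingress NetworkPolicy."""
    covered = set()
    for policy in policies:
        spec = policy.get("spec", {})
        selects_all_pods = spec.get("podSelector", {}) == {}
        # When policyTypes is omitted, Kubernetes treats the policy as Ingress.
        policy_types = spec.get("policyTypes") or ["Ingress"]
        denies_ingress = "Ingress" in policy_types and not spec.get("ingress")
        if selects_all_pods and denies_ingress:
            covered.add(policy["metadata"]["namespace"])
    return sorted(set(namespaces) - covered)


# Hypothetical sample data for the example.
sample_policies = [
    {"metadata": {"namespace": "tenant-a", "name": "default-deny"},
     "spec": {"podSelector": {}, "policyTypes": ["Ingress"]}},
    {"metadata": {"namespace": "tenant-b", "name": "allow-web"},
     "spec": {"podSelector": {"matchLabels": {"app": "web"}},
              "policyTypes": ["Ingress"],
              "ingress": [{"from": [{"podSelector": {}}]}]}},
]

unprotected = namespaces_without_default_deny(["tenant-a", "tenant-b"], sample_policies)
```

A check like this only proves the policy objects exist. Whether they are enforced still depends on the CNI.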

4. Walk me through cutting Kubernetes cost 30% without dropping reliability.

Order matters. Skip measurement and the rest is theatre.

Measure: OpenCost or Kubecost to attribute spend by namespace, workload, and label. Most platform teams discover 20–40% of spend goes to one dev cluster nobody owns.
Right-size requests: VPA in recommend-only mode for two weeks, then walk teams through the diff. Most pods over-request CPU 3–5x. This alone clears 20%.
Bin-pack: Karpenter or Cluster Autoscaler with consolidation, topology-spread constraints to avoid wasted nodes.
Spot/Preemptible: stateless workloads with PDBs, preStop hooks, and a small on-demand floor. 60–80% discount on the spot portion.
Commit: Reserved Instances or Savings Plans on the steady-state baseline only — never on the variable layer.

30% on a first pass is normal. 50% is achievable on a cluster nobody’s ever tuned.
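
The right-sizing step is plain arithmetic, sketched here in Python. The 20% headroom over p95 usage is an assumption for the example, not a recommendation from VPA or any vendor, and the function names are hypothetical.

```python
def over_request_factor(requested_cores: float, usage_p95_cores: float) -> float:
    """How many times larger the CPU request is than observed p95 usage."""
    return requested_cores / usage_p95_cores


def recommended_request(usage_p95_cores: float, headroom: float = 0.2) -> float:
    """p95 usage plus a safety margin (20% here is an assumed default)."""
    return usage_p95_cores * (1 + headroom)


def requested_cpu_savings(requested_cores: float, usage_p95_cores: float,
                          headroom: float = 0.2) -> float:
    """Fraction of requested CPU freed by moving to the recommended request."""
    return 1 - recommended_request(usage_p95_cores, headroom) / requested_cores
```

A pod requesting 2 cores with p95 usage of 0.5 cores is over-requested 4x. With 20% headroom the request drops to 0.6 cores and frees 70% of what it reserved. Cluster-level savings are smaller, because not every pod is that far off and nodes still have to be consolidated.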

5. A pod is stuck in `ContainerCreating` for 10 minutes. Senior debug.

kubectl describe pod Events first. Then in order of probability at scale:

Image pull: private registry pull secret missing, or Docker Hub anonymous rate limit (100 pulls/6 h per IP). A slow or throttled pull sits in ContainerCreating before the status flips to ErrImagePull or ImagePullBackOff. Check with kubectl get events --field-selector reason=Failed.
CNI / IPAM exhaustion: on AWS VPC CNI, the ENI IP pool runs out. Karpenter spinning up a t3.small, which offers only 4 IPs per ENI, is a classic. Read journalctl -u kubelet on the node.
Volume attach: EBS volume in us-east-1a, pod scheduled to a node in us-east-1b. kubectl get pv plus the node's topology.kubernetes.io/zone label tells you.
Admission webhook timing out: Kyverno or OPA with a strict policy and a slow remote check. apiserver_admission_webhook_admission_duration_seconds.
Node still provisioning: Karpenter is mid-launch. kubectl get nodeclaim shows it.
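
For the IPAM case, the pod ceiling on the AWS VPC CNI follows from the instance's ENI limits. The Python sketch below uses the standard secondary-IP formula (without prefix delegation). The ENI counts in the usage lines are the published limits for those instance types as far as I know, so verify them against the EC2 documentation for your fleet.

```python
def max_pods(enis: int, ips_per_eni: int) -> int:
    """Max pods per node on the AWS VPC CNI in secondary-IP mode.

    Each ENI loses one address to its own primary IP; the +2 covers the
    host-network pods (aws-node and kube-proxy) that need no pod IP.
    """
    return enis * (ips_per_eni - 1) + 2


t3_small = max_pods(enis=3, ips_per_eni=4)
t3_medium = max_pods(enis=3, ips_per_eni=6)
```

A t3.small tops out at 11 pods. With DaemonSets taking several of those slots, a handful of small pods is enough to exhaust the node's IPs while CPU and memory still look free.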

6. Designing a zero-downtime upgrade for a 500-node multi-AZ production cluster.

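Plan the surge node group first: a parallel, empty pool on the target minor version. It can only join once the control plane is upgraded, because kubelets must not be newer than the API server. Validate the control plane upgrade in a staging cluster against your CRDs and admission webhooks: v1.32, for example, stopped serving flowcontrol.apiserver.k8s.io/v1beta3, which v1.29-era manifests may still use. Then on prod: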

Upgrade control plane (managed services do this for you in 5–15 min). Watch apiserver_request_total for 4xx spikes from client-go versions that don’t match.
Roll workers: replace the old node group rather than in-place upgrade. kubectl drain --ignore-daemonsets --delete-emptydir-data --grace-period=300. Honor PDBs.
Watchdog: SLO-burn-rate alerts on every critical workload during the roll. Bail if any breaches.
Soak: 48 h on the new minor before greenlighting the next environment.
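
The ordering (control plane first, workers second) comes from the version skew policy: a kubelet may lag the API server by up to three minor versions on current releases but may never be ahead of it. A minimal Python pre-flight check, with hypothetical function names, might look like this:

```python
def minor(version: str) -> int:
    """Extract the minor number from a version string such as 'v1.32.1'."""
    return int(version.lstrip("v").split(".")[1])


def kubelet_skew_ok(apiserver: str, kubelet: str, max_older: int = 3) -> bool:
    """True if the kubelet is not newer than the API server and at most
    max_older minor versions behind it."""
    gap = minor(apiserver) - minor(kubelet)
    return 0 <= gap <= max_older
```

Running this against every node before and after the control plane upgrade catches both failure modes: a surge pool that is ahead of the API server, and forgotten nodes that have fallen too far behind.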

7. CRDs and Operators — build vs. adopt?

Default to adopt. OperatorHub.io and the wider CNCF ecosystem have battle-tested operators and controllers for almost every domain (cert-manager, Strimzi, Crossplane, Argo, Flux, External Secrets). Building your own is justified only when:

The domain is genuinely yours and no off-the-shelf operator captures it.
You have engineers who’ll own controller-runtime / Kubebuilder for the operator’s lifetime — reconcile loops are not write-once code.
The alternative is a YAML pipeline so brittle that owning a reconciler is cheaper than maintaining it.

If you build: scope the CRD narrowly, version it from day one with a conversion webhook strategy, ship a Status sub-resource with proper conditions, and write the controller to be idempotent and tolerant of out-of-order events. Most homegrown operators die because nobody planned the v1 → v2 migration.
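
"Idempotent and tolerant of out-of-order events" usually means the controller is level-triggered: it compares desired state with observed state on every pass instead of replaying events. The Python sketch below is a toy illustration of that shape, not controller-runtime code. The dict-based state and the function names are assumptions for the example.

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Compute the actions needed to move observed state to desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(actions: list, desired: dict, observed: dict) -> dict:
    """Return the new observed state after carrying out the actions."""
    result = dict(observed)
    for verb, name in actions:
        if verb == "delete":
            result.pop(name, None)
        else:
            result[name] = desired[name]
    return result


desired = {"db": {"replicas": 3}, "cache": {"replicas": 1}}
observed = {"db": {"replicas": 2}, "old-job": {"replicas": 1}}
first_pass = reconcile(desired, observed)
converged = apply(first_pass, desired, observed)
second_pass = reconcile(desired, converged)
```

The second pass returns no actions, which is the property to test for: running the loop again, or in response to a stale event, changes nothing once the state has converged.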

8. Where does the kube-scheduler get expensive at 500+ nodes, and what do you do?

Scheduler latency is dominated by the Filter and Score phases. Symptoms: scheduler_pending_pods rising, scheduler_scheduling_attempt_duration_seconds p99 over 1 s. Mitigations:

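percentageOfNodesToScore: the default is adaptive (about 50% of nodes in a 100-node cluster, sliding toward 10% at 5,000 nodes), so check what you are running before tuning. Lowering it cuts Filter cost on large clusters because the scheduler stops searching once it has found enough feasible nodes to score.
Topology spread: prefer whenUnsatisfiable: ScheduleAnyway where the workload tolerates it; hard constraints (DoNotSchedule) force exhaustive search.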
Trim PreFilter plugins: custom scheduler plugins that hit external APIs are the silent killer. Cache aggressively or move to async.
Profile with the scheduler perf dashboard: --config with a single-profile KubeSchedulerConfiguration usually beats the default.
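
To see what percentageOfNodesToScore does at a given cluster size, the Python sketch below paraphrases the adaptive rule from the scheduler tuning documentation: a linear default of roughly 50% at 100 nodes down to 10% at 5,000, a 5% floor, and a minimum of 100 feasible nodes. The constants are my reading of that behaviour, so treat them as assumptions and confirm against the scheduler version you run.

```python
MIN_FEASIBLE_NODES = 100  # assumed hard floor on feasible nodes to find
BASE_PERCENTAGE = 50      # assumed starting point of the adaptive default
MIN_PERCENTAGE = 5        # assumed lower bound of the adaptive default


def nodes_to_find(num_nodes: int, percentage: int = 0) -> int:
    """Feasible nodes the scheduler looks for before it stops filtering.

    percentage=0 means "unset", which selects the adaptive default.
    """
    if num_nodes < MIN_FEASIBLE_NODES or percentage >= 100:
        return num_nodes
    if percentage == 0:
        percentage = max(BASE_PERCENTAGE - num_nodes // 125, MIN_PERCENTAGE)
    return max(num_nodes * percentage // 100, MIN_FEASIBLE_NODES)
```

Under these assumptions a 500-node cluster with the setting left unset already stops at 230 feasible nodes, and an explicit 30 brings that down to 150. The gain from tuning is real but smaller than a "100 to 50" framing suggests.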

9. Zero-trust pod-to-pod — NetworkPolicies, mesh, or both?

Both, with separate jobs. NetworkPolicies enforce L3/L4 deny-by-default at the CNI layer — cheap, fast, in-kernel with Cilium eBPF. A service mesh (Istio, Linkerd, or Cilium Service Mesh) handles mTLS identity, authZ policy on L7 (HTTP verb, path, header), and zero-trust workload identity via SPIFFE.

The mistake: doing only the mesh. Mesh policy fails open if the sidecar crashes or is bypassed (hostNetwork pods, init containers). NetworkPolicy at the CNI is your floor. Then layer mesh policy for L7 and identity.

10. What’s an SLO you’d set for the cluster control plane itself, and how do you enforce it?

The conversation interviewers actually want:

API server availability: 99.9% on read, 99.5% on write, measured by apiserver_request_total by code class. Budget over a 30-day month: about 43 min of failed reads and about 3.6 h of failed writes.
Scheduling latency: p99 pod scheduling under 5 s for any pod that fits.
Node provisioning: p95 node Ready under 90 s with Karpenter, 180 s with Cluster Autoscaler.
Enforcement: burn-rate alerts (Google SRE workbook style) at 1 h and 6 h windows. Tie to a freeze: if you burn 25% of monthly budget in 1 h, automatic deploy freeze.
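
The budget and burn-rate numbers above are simple to derive, and deriving them in the interview is more convincing than quoting them. A minimal Python sketch, assuming a 30-day window and hypothetical function names:

```python
def error_budget_minutes(slo: float, days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given SLO."""
    return (1 - slo) * days * 24 * 60


def burn_rate(budget_fraction: float, window_hours: float, days: int = 30) -> float:
    """Burn rate that consumes budget_fraction of the budget in window_hours.

    A burn rate of 1 spends exactly the whole budget over the full window.
    """
    return budget_fraction * days * 24 / window_hours
```

A 99.9% SLO gives about 43.2 minutes per 30 days and 99.5% gives 216 minutes. Burning 25% of the monthly budget in one hour corresponds to a burn rate of 180, far above the 14.4 (2% in one hour) that the SRE workbook uses for paging. So the freeze threshold fires well after the alert does.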

Quoting numbers is the easy half. The senior signal is owning the budget and the freeze.

What these questions test

The base CKA interview asks “can you run a cluster?” The senior loop asks “can you decide what the cluster should look like, and own it when it breaks at 3 a.m. with 500 nodes?” Every answer above pivots on a trade-off (CA vs Karpenter, mesh vs NetworkPolicy, build vs buy) and on instrumentation (which metric, what threshold, what action). Memorize the metric names — apiserver_flowcontrol_rejected_requests_total, scheduler_pending_pods, apiserver_admission_webhook_admission_duration_seconds. Senior interviewers screen on whether you reach for them unprompted.

Practice CKA questions right now — no signup

CertQuests has engineer-written CKA practice questions with full explanations on every answer. Free, no account required.

Frequently asked questions

Cluster Autoscaler vs Karpenter — when do you pick which in 2026?

Cluster Autoscaler scales pre-defined node groups; predictable and the only good option on GKE/AKS. Karpenter provisions nodes directly via the EC2 fleet API and usually wins on cost and startup latency on EKS. Pick CA when compliance pins instance families; pick Karpenter on EKS when cost matters and your team can own another controller.

How does API priority and fairness keep the API server stable at scale?

APF classifies requests via FlowSchemas into PriorityLevelConfigurations with a share of concurrency. A misbehaving controller is queued in its own flow without starving kubelet heartbeats or the scheduler. Watch apiserver_flowcontrol_rejected_requests_total and the X-Kubernetes-PF-FlowSchema-UID response header.

Are namespaces enough for multi-tenant isolation?

No. Namespaces are a naming scope, not a security boundary. Hard multi-tenancy needs separate node pools, sandboxed RuntimeClass, default-deny NetworkPolicies on an enforcing CNI, ResourceQuotas, no cluster-admin grants, and for hostile workloads — one cluster per tenant.

How do you cut Kubernetes cost 30% without losing reliability?

Measure first with OpenCost or Kubecost, then right-size requests (most pods over-request CPU 3–5x), bin-pack with Karpenter, run stateless workloads on spot with PDBs and preStop hooks, and commit Reserved or Savings Plans only on the baseline. A 30% reduction is normal on a first pass; 50% is achievable on a cluster that has never been tuned.

How do you debug a pod stuck in ContainerCreating for 10 minutes?

Read kubectl describe pod Events first. Common causes at scale: image pull (private registry secret, Docker Hub rate limit), CNI IPAM exhaustion (AWS VPC CNI ENI pool), EBS attach across AZs, slow admission webhook, or a node still being provisioned. Check journalctl -u kubelet for what kubectl hides.

How we wrote this

No CNCF, Linux Foundation, or training-vendor revenue. Questions were sourced from senior platform and staff SRE interview reports on Reddit, Hacker News, the CNCF Slack #sig-scalability and #sig-autoscaling channels, and LinkedIn interview threads from 2025–2026, cross-referenced against the official Kubernetes architecture docs and the BLS Occupational Outlook Handbook for compensation context.

Last reviewed: June 2, 2026.