The scenario
You need to containerize a Node.js Express API. The app has a package.json, a handful of source files in src/, and a node_modules directory that weighs about 300 MB once installed. Both engineers get the same task: ship a working Docker image.
The junior Dockerfile
Junior — works, but ships to prod like this:

```dockerfile
FROM node:latest
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "src/index.js"]
```
This file runs. The CI pipeline goes green. The app starts. But eight things are silently wrong with it.
The senior Dockerfile
Senior — production-ready from day one:

```dockerfile
# ── Stage 1: install dependencies ──────────────────────────────────────────────
FROM node:22-alpine AS builder
WORKDIR /app

# Copy manifests first — changes to src/ won't bust this layer
COPY package*.json ./
RUN npm ci --only=production

# ── Stage 2: lean runtime image ────────────────────────────────────────────────
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY src/ ./src/

# Distroless ships with a built-in nonroot user (uid 65532)
USER nonroot
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD ["/nodejs/bin/node", "-e", "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode === 200 ? 0 : 1))"]
CMD ["/app/src/index.js"]
```
The companion .dockerignore (which the senior always adds alongside the Dockerfile):

```
node_modules
.git
.gitignore
*.md
*.log
.env
.env.*
coverage/
.nyc_output/
test/
```
Decision by decision: what actually changed
| Decision | Junior | Senior |
|---|---|---|
| Base image | `node:latest` | `node:22-alpine` + distroless runtime |
| Image size | ~1.2 GB | ~85 MB |
| Layer caching | Busted on every source change | `package*.json` copied first; deps layer only rebuilds when manifests change |
| Install command | `npm install` | `npm ci --only=production` |
| Multi-stage build | No — build tools ship to prod | Yes — only node_modules + source copied to runtime stage |
| User | root (uid 0) | nonroot (uid 65532) |
| .dockerignore | None — node_modules and .env copied into build context | Explicit ignore list; secrets and dev artefacts excluded |
| HEALTHCHECK | None | HTTP probe on /health |
| Version pinning | `:latest` changes silently | Exact major version + OS variant pinned |
| Reproducibility | Different packages installed each run | `npm ci` reads package-lock.json exactly |
Why each difference matters
1. node:latest vs a pinned version
node:latest is a moving tag. Today it might resolve to Node 22; next month it quietly becomes Node 23 the moment the Docker Hub maintainers update it. Your CI builds the same Dockerfile and produces a different runtime. A pinned tag like node:22-alpine makes the base image an explicit dependency, just like a package version. You control when you upgrade.
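Pinning can be taken one step further with a digest, which freezes the exact image bytes rather than whatever the tag currently points at. A sketch (the digest shown is a placeholder; substitute the real value reported by `docker pull node:22-alpine` or `docker images --digests`):

```dockerfile
# Tag pin: you choose when to move major versions, but the tag itself
# can still be re-pointed at new patch builds upstream.
FROM node:22-alpine

# Digest pin: byte-for-byte immutable. The sha256 below is a placeholder;
# take the real one from `docker images --digests` after a pull.
# FROM node:22-alpine@sha256:0000000000000000000000000000000000000000000000000000000000000000
```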
2. Layer caching — copy manifests first
Docker rebuilds every layer from the first changed line downward. COPY . . copies everything, so changing a single comment in src/utils.js invalidates the layer and triggers a full npm install all over again — potentially minutes of wasted CI time. The senior’s pattern is deliberate: copy only package.json and package-lock.json first, install, then copy source. The 300 MB install layer now only rebuilds when dependencies actually change.
3. npm install vs npm ci --only=production
npm install reads package.json and may update package-lock.json while resolving version ranges, so different runs can produce different trees. npm ci reads package-lock.json exactly, fails fast if it doesn’t exist or is out of sync, and wipes node_modules before installing — guaranteeing a clean, reproducible result every time. The --only=production flag drops all devDependencies (jest, eslint, typescript, etc.) from the final install, shaving hundreds of megabytes; on recent npm versions the preferred spelling is --omit=dev (--only=production survives as a deprecated alias).
4. Multi-stage builds
Without multi-stage, every tool used to build the app also ships to production: npm, build scripts, test harnesses, and the full Debian package tree that comes with node:latest. Multi-stage solves this cleanly. Stage 1 (node:22-alpine) installs dependencies in a throw-away environment. Stage 2 copies only the artefacts that actually need to run — node_modules and source — into a minimal base. The build toolchain never touches production.
5. Distroless runtime image
Google’s gcr.io/distroless/nodejs22-debian12 image contains exactly Node.js and its runtime libraries, nothing else. No bash, no sh, no apt, no package manager, no coreutils. The attack surface for a compromised container drops to near zero because there is no shell to execute and no tool to download further payloads. This is one of the supply-chain hardening techniques tested on the CKS exam.
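One practical consequence: you cannot `docker exec` into the container to poke around. For local troubleshooting, the distroless project publishes `:debug` tags that add a busybox shell, a convenience to use locally and never in the image you ship:

```dockerfile
# Local debugging only: the :debug variant adds a busybox shell.
# Shipping it would reintroduce exactly what distroless removes.
FROM gcr.io/distroless/nodejs22-debian12:debug
```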
6. Running as non-root
By default, Docker containers run as uid 0 — root inside the container. If an attacker achieves remote code execution and the container escapes its namespace (possible without additional hardening), they land as root on the host. The distroless image ships a nonroot user with uid 65532. One USER nonroot instruction eliminates an entire class of privilege-escalation paths and satisfies the major container security benchmarks (CIS, NSA/CISA hardening guides, Kubernetes PSA restricted mode).
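Distroless is not the only way to drop root. If you stay on a regular node image, the official images already include a non-root `node` user (uid 1000), so a single USER instruction works there too. A minimal sketch:

```dockerfile
FROM node:22-alpine
WORKDIR /app
# The official node images ship a non-root "node" user (uid 1000).
COPY --chown=node:node package*.json ./
RUN npm ci --only=production
COPY --chown=node:node src/ ./src/
USER node
CMD ["node", "src/index.js"]
```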
7. .dockerignore
Without a .dockerignore, COPY . . sends the entire build context to the Docker daemon: node_modules (300 MB of packages that will immediately be re-installed anyway), .git (your full commit history), .env files with secrets, and coverage reports. The build context transfer alone can add 30–60 seconds. The ignore file costs nothing to add and pays dividends on every build.
8. HEALTHCHECK
A HEALTHCHECK instruction tells Docker whether the container is actually serving traffic, not just whether the process is running. (Kubernetes ignores Docker’s HEALTHCHECK; the same idea is expressed there with liveness and readiness probes.) A Node.js app can hang in the event loop without crashing. Without a health check, the orchestrator has no way to tell the difference and will keep routing traffic to a dead process.
The numbers at a glance
- Image size: 1.2 GB → 85 MB (−93%)
- Rebuild time after a source change: ~90 s → ~8 s (layer cache hit on deps)
- Packages shipped to production: ~1,200 (full Debian + devDeps) → ~180 (production deps only, distroless)
- Shell available in container: yes (attack surface) → no
- Running as root: yes → no
Why this matters for your certification
If you’re studying for the Docker DCA, the CKA, or the CKS, container image hygiene comes up in multiple domains:
- Docker DCA: Expect questions on multi-stage builds, layer caching order, image size optimization, and `npm ci` vs `npm install` reproducibility.
- CKA: Docker’s HEALTHCHECK maps conceptually to Kubernetes liveness and readiness probes — understand what happens when neither is defined.
- CKS — Supply Chain Security (20%): Distroless images, non-root containers, image scanning with Trivy, and `cosign` verification are all first-class exam topics. The distroless pattern eliminates the shell that most post-exploit toolkits depend on, which is exactly the threat model CKS tests you on.
A common exam question pattern: “A security scan flags that a container runs as root and ships with a full shell. Which two Dockerfile changes address both findings?” — answer: switch to a distroless base and add USER nonroot.
The junior Dockerfile is not wrong — it’s a valid starting point. The senior Dockerfile is what happens when you’ve debugged a 3 AM production incident caused by a stale `:latest` image, waited six minutes for a rebuild after fixing a typo, or found a `.env` file baked into an image on Docker Hub.