Docker multi-stage builds: shrinking a Node.js image from 1 GB to under 200 MB

Phuong Nguyen, 4 min read

If you have ever run docker images after building a Node.js app and winced at a 900 MB image, multi-stage builds are the fix. They have been available since Docker 17.05, but a surprising number of projects still use a single FROM node:xx stage that drags in the entire Node.js toolchain, dev dependencies, and build cache.

This post walks through a real before-and-after: a typical Express API that goes from ~1 GB down to ~170 MB with no changes to the application code.

Why images get so large

A naive Dockerfile for a TypeScript Node.js app looks like this:

FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
CMD ["node", "dist/index.js"]

The problem: node:20 is a Debian-based image that takes up roughly 1 GB on disk. After npm ci, both dependencies and devDependencies are installed. Adding RUN npm prune --omit=dev as a later instruction does not help: the layer created by npm ci is already committed, and the final image size is the sum of all layers, including files that a later layer deletes.
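
To make the layer arithmetic concrete, here is a small TypeScript sketch of a simplified model in which the image size is the sum of what each layer adds. The Layer type, the imageSizeMB helper, and all sizes are illustrative assumptions rather than measurements, and the model ignores compression and COPY layers.

```typescript
// Simplified model: every instruction adds a layer, and deleting files in a
// later layer hides them without reclaiming space from the layers below.
interface Layer {
  instruction: string;
  addedMB: number; // size this layer adds, in MB (hypothetical values)
}

function imageSizeMB(layers: Layer[]): number {
  return layers.reduce((total, layer) => total + layer.addedMB, 0);
}

// Hypothetical sizes for the naive Dockerfile (COPY layers omitted for brevity).
const naive: Layer[] = [
  { instruction: "FROM node:20", addedMB: 1000 },
  { instruction: "RUN npm ci", addedMB: 60 },
  { instruction: "RUN npm run build", addedMB: 1 },
];

// A separate prune step only records deletions, so it adds roughly nothing
// and removes nothing from earlier layers.
const naiveWithPrune: Layer[] = [
  ...naive,
  { instruction: "RUN npm prune --omit=dev", addedMB: 0 },
];

console.log(imageSizeMB(naive));          // 1061
console.log(imageSizeMB(naiveWithPrune)); // 1061, no smaller
```

In this model the only way to drop the dev dependencies from the total is to start from a fresh set of layers, which is what a second build stage provides.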

How multi-stage builds work

A multi-stage Dockerfile uses multiple FROM statements. Each FROM starts a new build stage. You can copy files between stages with COPY --from=<stage>. Only the last stage (or the one you target with --target) ends up in the final image.

The pattern for Node.js is:

  1. Build stage — use a full Node image, install all dependencies, compile TypeScript (or bundle with webpack/esbuild).
  2. Runtime stage — use a slim or distroless image, copy only the compiled output and production node_modules.

A concrete example

Here is the project structure we are working with:

my-api/
├── src/
│   └── index.ts
├── package.json
├── package-lock.json
└── tsconfig.json

src/index.ts is a simple Express server. devDependencies include typescript and @types/express; dependencies include only express.

Before: single-stage Dockerfile

FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npx tsc
CMD ["node", "dist/index.js"]

Build and check size:

docker build -t my-api:single .
docker images my-api:single
# REPOSITORY   TAG       SIZE
# my-api       single    1.08GB

After: multi-stage Dockerfile

# ── Stage 1: build ──────────────────────────────────────────────
FROM node:20 AS builder
WORKDIR /app

COPY package*.json ./
RUN npm ci

COPY tsconfig.json ./
COPY src ./src
RUN npx tsc --outDir dist

# Prune dev deps so we only carry production modules forward
RUN npm prune --omit=dev

# ── Stage 2: runtime ─────────────────────────────────────────────
FROM node:20-slim AS runtime
WORKDIR /app

# Copy only what the running app needs
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./

ENV NODE_ENV=production
CMD ["node", "dist/index.js"]

Build and compare:

docker build -t my-api:multi .
docker images my-api
# REPOSITORY   TAG      SIZE
# my-api       single   1.08GB
# my-api       multi    168MB

The runtime stage uses node:20-slim (~80 MB compressed) instead of the full Debian image, and it only receives the compiled dist/ directory and the already-pruned node_modules. The TypeScript compiler, source files, and dev packages never make it into the final image.
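
As a quick worked calculation of the saving, the sketch below plugs in the two sizes reported by docker images. It assumes decimal units (1.08GB taken as 1080 MB); the reductionPercent helper is a hypothetical name used only for this example.

```typescript
// Sizes reported by `docker images` for the two tags, in MB.
const singleStageMB = 1080; // my-api:single (1.08GB, assuming 1 GB = 1000 MB)
const multiStageMB = 168;   // my-api:multi

function reductionPercent(beforeMB: number, afterMB: number): number {
  return ((beforeMB - afterMB) / beforeMB) * 100;
}

console.log(reductionPercent(singleStageMB, multiStageMB).toFixed(1) + "% smaller");
// 84.4% smaller
console.log((singleStageMB / multiStageMB).toFixed(1) + "x smaller on disk");
// 6.4x smaller on disk
```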

Going further with distroless

If you want to push the image even smaller and harden the attack surface, Google's distroless images remove the shell, package manager, and most OS utilities — leaving only the runtime:

FROM gcr.io/distroless/nodejs20-debian12 AS runtime
WORKDIR /app

COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./

CMD ["dist/index.js"]

Note: distroless images have no shell, so docker exec -it <container> bash will not work. Use docker logs and structured logging in your app instead. Keep a node:20-slim-based image for local debugging if needed.
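
With no shell in the container, logs become the main window into the app. Here is a minimal sketch of structured logging, assuming one JSON object per line on stdout is enough for your log collector. The formatLog and log helpers and their field names are hypothetical, not part of any library.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Build one JSON log line. The timestamp is a parameter so the function is
// deterministic and easy to test.
function formatLog(
  level: Level,
  message: string,
  fields: Record<string, unknown> = {},
  time: Date = new Date()
): string {
  return JSON.stringify({ time: time.toISOString(), level, message, ...fields });
}

// Write to stdout so that `docker logs <container>` picks the line up.
function log(level: Level, message: string, fields?: Record<string, unknown>): void {
  console.log(formatLog(level, message, fields));
}

log("info", "server started", { port: 3000 });
```

Keeping every entry on a single line means each line of docker logs output can be parsed on its own.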

Typical size with distroless: ~110–130 MB.

Caching layers efficiently

Docker builds layer by layer and reuses cached layers when the inputs have not changed. The order of COPY and RUN instructions matters a lot.

Good order (dependencies cached separately from source):

COPY package*.json ./
RUN npm ci          # ← re-runs only when package.json or package-lock.json changes
COPY src ./src
RUN npx tsc         # ← re-runs when source changes

Bad order (cache busted on every source change):

COPY . .            # copies everything including src/
RUN npm ci          # re-runs on any file change

Keep dependency installation before source copy in every stage.
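
The effect of instruction order can be sketched with a toy model of the layer cache. It assumes a step is rebuilt when one of its input files changed or when any earlier step was rebuilt, which simplifies Docker's real cache-key logic. The Step type, the rebuiltSteps helper, and the file lists are illustrative.

```typescript
interface Step {
  instruction: string;
  inputs: string[]; // build-context files or directories this step copies
}

// Returns the instructions that must re-run after the given files changed.
function rebuiltSteps(steps: Step[], changed: string[]): string[] {
  const rebuilt: string[] = [];
  let cacheBusted = false;
  for (const step of steps) {
    const touched = step.inputs.some((input) =>
      changed.some((file) => file === input || file.indexOf(input + "/") === 0)
    );
    // Once one step misses the cache, every later step re-runs as well.
    cacheBusted = cacheBusted || touched;
    if (cacheBusted) rebuilt.push(step.instruction);
  }
  return rebuilt;
}

const goodOrder: Step[] = [
  { instruction: "COPY package*.json ./", inputs: ["package.json", "package-lock.json"] },
  { instruction: "RUN npm ci", inputs: [] },
  { instruction: "COPY src ./src", inputs: ["src"] },
  { instruction: "RUN npx tsc", inputs: [] },
];

const badOrder: Step[] = [
  { instruction: "COPY . .", inputs: ["package.json", "package-lock.json", "src"] },
  { instruction: "RUN npm ci", inputs: [] },
];

// Editing a single source file:
console.log(rebuiltSteps(goodOrder, ["src/index.ts"]).join(", "));
// COPY src ./src, RUN npx tsc
console.log(rebuiltSteps(badOrder, ["src/index.ts"]).join(", "));
// COPY . ., RUN npm ci
```

With the good order, npm ci stays cached after a source edit. With the bad order, it runs again every time.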

Building only a specific stage

During development you might want to inspect the builder stage:

docker build --target builder -t my-api:debug .
docker run --rm -it my-api:debug sh

This is useful for checking what tsc compiled or what survived npm prune.

Using BuildKit for parallel stages

BuildKit is the default builder in Docker Engine 23.0 and later, and it builds independent stages in parallel. On older versions, enable it explicitly:

DOCKER_BUILDKIT=1 docker build -t my-api:multi .

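BuildKit builds only the stages that the target stage depends on, and it runs stages that do not depend on each other concurrently. A test stage (for example, one that runs npm test) is therefore skipped unless the final stage copies something from it or you build it explicitly with --target test.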
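
A simplified way to picture stage selection is as a dependency graph: a build needs the target stage plus everything it transitively reads from. The sketch below models only that selection and is not BuildKit's scheduler. The test stage and the Stages shape are hypothetical additions to the article's Dockerfile.

```typescript
// Each stage lists the stages it reads from (FROM <stage> or COPY --from=<stage>).
type Stages = Record<string, string[]>;

// Collect the target and every stage it transitively depends on.
function stagesToBuild(stages: Stages, target: string): string[] {
  const needed: string[] = [];
  const visit = (name: string): void => {
    if (needed.indexOf(name) !== -1) return;
    needed.push(name);
    for (const dep of stages[name] || []) visit(dep);
  };
  visit(target);
  return needed.sort();
}

const stages: Stages = {
  builder: [],
  test: ["builder"],    // hypothetical stage that runs npm test on top of builder
  runtime: ["builder"], // COPY --from=builder
};

console.log(stagesToBuild(stages, "runtime").join(", "));
// builder, runtime
console.log(stagesToBuild(stages, "test").join(", "));
// builder, test
```

In this model, building runtime never touches test, because nothing on the path to runtime reads from it.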

Summary

  • Use a two-stage Dockerfile: a builder stage for compiling and pruning, and a runtime stage based on node:20-slim or distroless.
  • Always COPY package*.json before COPY src so dependency layers are cached independently of source changes.
  • Use COPY --from=builder to bring only compiled output and production modules into the final image.
  • For the smallest image and minimal attack surface, switch the runtime stage to a distroless Node.js image.

The technique applies equally to other runtimes — Go, Python, Java — wherever there is a meaningful gap between the build environment and the production runtime.
