Docker for Node.js: The Production-Ready Setup No One Talks About
Multi-stage builds, non-root users, health checks, secrets management, and image-size optimization. The Docker patterns I use for every production Node.js deployment.
Most Node.js Dockerfiles in production are bad. Not "slightly suboptimal" bad. I mean running as root, shipping 600MB images with devDependencies baked in, no health checks, and secrets hardcoded in environment variables that anyone with docker inspect can read.
I know because I wrote those Dockerfiles. For years. They worked, so I never questioned them. Then one day a security audit flagged our container running as PID 1 root with write access to the entire filesystem, and I realized that "works" and "production-ready" are very different bars.
This is the Docker setup I now use for every Node.js project. It's not theoretical. It runs the services behind this site and several others I maintain. Every pattern here exists because I either got burned by the alternative or watched someone else get burned.
Why Your Current Dockerfile is Probably Wrong#
Let me guess what your Dockerfile looks like:
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]
This is the "hello world" of Dockerfiles. It works. It also has at least five problems that will hurt you in production.
Running as Root#
By default, the node Docker image runs as root. That means your application process has root privileges inside the container. If someone exploits a vulnerability in your app — a path traversal bug, an SSRF, a dependency with a backdoor — they have root access to the container filesystem, can modify binaries, install packages, and potentially escalate further depending on your container runtime configuration.
"But containers are isolated!" Partially. Container escapes are real. CVE-2024-21626, CVE-2019-5736 — these are real-world container breakouts. Running as non-root is a defense-in-depth measure. It costs nothing and it closes an entire class of attacks.
Installing devDependencies in Production#
npm install without flags installs everything. Your test runners, linters, build tools, type checkers — all sitting in your production image. This bloats your image by hundreds of megabytes and increases your attack surface. Every additional package is another potential vulnerability that Trivy or Snyk will flag.
COPY Everything#
COPY . . copies your entire project directory into the image. That includes .git (which can be enormous), .env files (which contain secrets), node_modules (which you're about to reinstall anyway), test files, documentation, CI configs — everything.
No Health Checks#
Without a HEALTHCHECK instruction, Docker has no idea whether your application is actually serving traffic. The process could be running but deadlocked, out of memory, or stuck in an infinite loop. Docker will report the container as "running" because the process hasn't exited. Your load balancer keeps sending traffic to a zombie container.
No Layer Caching Strategy#
Copying everything before installing dependencies means that changing a single line of source code invalidates the npm install cache. Every build reinstalls all dependencies from scratch. On a project with heavy dependencies, that's 2-3 minutes of wasted time per build.
Let's fix all of this.
Multi-Stage Builds: The Single Biggest Win#
Multi-stage builds are the most impactful change you can make to a Node.js Dockerfile. The concept is simple: use one stage to build your application, then copy only the artifacts you need into a clean, minimal final image.
Here's the difference in practice:
# Single stage: ~600MB
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "server.js"]
# Multi-stage: ~150MB
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
CMD ["node", "dist/server.js"]
The builder stage has everything: full Node.js, npm, build tools, source code, devDependencies. The runner stage has only what's needed at runtime. The builder stage is discarded entirely — it doesn't end up in the final image.
Real Size Comparisons#
I measured these on an actual Express.js API project with about 40 dependencies:
| Approach | Image Size |
|---|---|
| node:20 + npm install | 1.1 GB |
| node:20-slim + npm install | 420 MB |
| node:20-alpine + npm ci | 280 MB |
| Multi-stage + alpine + production deps only | 150 MB |
| Multi-stage + alpine + pruned deps | 95 MB |
That's a 10x reduction from the naive approach. Smaller images mean faster pulls, faster deployments, and less attack surface.
Why Alpine?#
Alpine Linux uses musl libc instead of glibc, and it doesn't include a package manager cache, documentation, or most utilities you'd find in a standard Linux distribution. The base node:20-alpine image is about 50MB compared to 350MB for node:20-slim and over 1GB for the full node:20.
The tradeoff is that some npm packages with native bindings (like bcrypt, sharp, canvas) need to be compiled against musl. In most cases this just works — npm will download the correct prebuilt binary. If you hit issues, you can install build dependencies in the builder stage:
FROM node:20-alpine AS builder
RUN apk add --no-cache python3 make g++
# ... rest of build
These build tools only exist in the builder stage. They're not in your final image.
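If you're unsure whether a given environment is glibc or musl (useful when debugging a native-module failure), Node can report it. A small sketch; `detectLibc` is a helper name of my own, it relies on `process.report` (Node 14+, Linux), where `glibcVersionRuntime` is only present on glibc systems:

```javascript
// Rough check for which libc the current Node binary is running against.
// Linux only: on glibc systems the diagnostic report includes
// glibcVersionRuntime; on musl (Alpine) it does not.
function detectLibc() {
  const report = process.report.getReport();
  return report.header.glibcVersionRuntime ? "glibc" : "musl";
}

console.log(`Running on ${detectLibc()}`);
```

Running this inside your Alpine image should print `musl`, which tells you whether npm needs to fetch (or compile) musl-specific binaries.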
The Complete Production Dockerfile#
Here's the Dockerfile I use as a starting point for every Node.js project. Every line is intentional.
# ============================================
# Stage 1: Install dependencies
# ============================================
FROM node:20-alpine AS deps
# Security: create a working directory before anything else
WORKDIR /app
# Install dependencies based on lockfile
# Copy ONLY package files first — this is critical for layer caching
COPY package.json package-lock.json ./
# ci is better than install: it's faster, stricter, and reproducible
# --omit=dev excludes devDependencies from this stage
RUN npm ci --omit=dev
# ============================================
# Stage 2: Build the application
# ============================================
FROM node:20-alpine AS builder
WORKDIR /app
# Copy package files and install ALL dependencies (including dev)
COPY package.json package-lock.json ./
RUN npm ci
# NOW copy source code — changes here don't invalidate the npm ci cache
COPY . .
# Build the application (TypeScript compile, Next.js build, etc.)
RUN npm run build
# ============================================
# Stage 3: Production runner
# ============================================
FROM node:20-alpine AS runner
# Add labels for image metadata
LABEL maintainer="your-email@example.com"
LABEL org.opencontainers.image.source="https://github.com/yourorg/yourrepo"
# Security: install dumb-init for proper PID 1 signal handling
RUN apk add --no-cache dumb-init
# Security: set NODE_ENV before anything else
ENV NODE_ENV=production
# Security: use non-root user
# The node image already includes a 'node' user (uid 1000)
USER node
# Create app directory owned by node user
WORKDIR /app
# Copy production dependencies from deps stage
COPY --from=deps --chown=node:node /app/node_modules ./node_modules
# Copy package.json from the deps stage (node reads it at runtime for metadata and module type)
COPY --from=deps --chown=node:node /app/package.json ./
# Copy built application from builder stage
COPY --from=builder --chown=node:node /app/dist ./dist
# Expose the port (documentation only — doesn't publish it)
EXPOSE 3000
# Health check: curl isn't available in alpine, use node
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"
# Use dumb-init as PID 1 to handle signals properly
ENTRYPOINT ["dumb-init", "--"]
# Start the application
CMD ["node", "dist/server.js"]
Let me explain the parts that aren't obvious.
Why Three Stages Instead of Two?#
The deps stage installs only production dependencies. The builder stage installs everything (including devDependencies) and builds the app. The runner stage copies production deps from deps and built code from builder.
Why not install production deps in the builder stage? Because the builder stage has devDependencies mixed in. You'd have to run npm prune --production after the build, which is slower and less reliable than having a clean production dependency install.
Why dumb-init?#
When you run node server.js in a container, Node.js becomes PID 1. PID 1 has special behavior in Linux: it doesn't receive default signal handlers. If you send SIGTERM to the container (which is what docker stop does), Node.js as PID 1 won't handle it by default. Docker waits 10 seconds, then sends SIGKILL, which immediately terminates the process without any cleanup — no graceful shutdown, no closing database connections, no finishing in-flight requests.
dumb-init acts as PID 1 and properly forwards signals to your application. Your Node.js process receives SIGTERM as expected and can shut down gracefully:
// server.js
const server = app.listen(3000);
process.on('SIGTERM', () => {
console.log('SIGTERM received, shutting down gracefully');
server.close(() => {
console.log('HTTP server closed');
// Close database connections, flush logs, etc.
process.exit(0);
});
});
An alternative is the --init flag in docker run, but baking it into the image means it works regardless of how the container is started.
The .dockerignore File#
This is just as important as the Dockerfile itself. Without it, COPY . . sends everything to the Docker daemon:
# .dockerignore
node_modules
npm-debug.log*
.git
.gitignore
.env
.env.*
!.env.example
Dockerfile
docker-compose*.yml
.dockerignore
README.md
LICENSE
.github
.vscode
.idea
coverage
.nyc_output
*.test.ts
*.test.js
*.spec.ts
*.spec.js
__tests__
test
tests
docs
.husky
.eslintrc*
.prettierrc*
tsconfig.json
jest.config.*
vitest.config.*
Every file in .dockerignore is a file that won't be sent to the build context, won't end up in your image, and won't invalidate your layer cache when changed.
Layer Caching: Stop Waiting 3 Minutes Per Build#
Docker builds images in layers. Each instruction creates a layer. If a layer hasn't changed, Docker uses the cached version. But here's the critical detail: if a layer changes, all subsequent layers are invalidated.
This is why the order of instructions matters enormously.
The Wrong Order#
COPY . .
RUN npm ci
Every time you change any file — a single line in a single source file — Docker sees that the COPY . . layer changed. It invalidates that layer and everything after it, including npm ci. You reinstall all dependencies on every code change.
The Right Order#
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
Now npm ci only runs when package.json or package-lock.json changes. If you only changed source code, Docker reuses the cached npm ci layer. On a project with 500+ dependencies, this saves 60-120 seconds per build.
Cache Mount for npm#
Docker BuildKit supports cache mounts that persist the npm cache between builds:
RUN --mount=type=cache,target=/root/.npm \
npm ci --omit=dev
This keeps the npm download cache across builds. If a dependency was already downloaded in a previous build, npm uses the cached version instead of downloading it again. This is especially useful in CI where you're building frequently.
To use BuildKit, set the environment variable:
DOCKER_BUILDKIT=1 docker build -t myapp .
Or add to your Docker daemon configuration:
{
"features": {
"buildkit": true
}
}
Using ARG for Cache Busting#
Sometimes you need to force a layer to rebuild. For example, if you're pulling a latest tag from a registry and want to ensure you get the newest version:
ARG CACHE_BUST=1
RUN npm ci
Build with a unique value to bust the cache:
docker build --build-arg CACHE_BUST=$(date +%s) -t myapp .
Use this sparingly. The whole point of caching is speed — only bust the cache when you have a reason.
Secrets Management: Stop Putting Secrets in Your Dockerfile#
This is one of the most common and dangerous mistakes. I see it constantly:
# NEVER DO THIS
ENV DATABASE_URL=postgres://user:password@db:5432/myapp
ENV API_KEY=sk-live-abc123def456
Environment variables set with ENV in a Dockerfile are baked into the image. Anyone who pulls the image can see them with docker inspect or docker history. They're also visible in every layer after they're set. Even if you unset them later, they exist in the layer history.
The Three Levels of Secrets#
1. Build-time secrets (Docker BuildKit)
If you need secrets during the build (like a private npm registry token), use BuildKit's --secret flag:
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Mount the secret at build time — it's never stored in the image
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc \
npm ci
COPY . .
RUN npm run build
Build with:
docker build --secret id=npmrc,src=$HOME/.npmrc -t myapp .
The .npmrc file is available during the RUN command but is never committed to any image layer. It doesn't appear in docker history or docker inspect.
2. Runtime secrets via environment variables
For secrets your application needs at runtime, pass them when starting the container:
docker run -d \
-e DATABASE_URL="postgres://user:pass@db:5432/myapp" \
-e API_KEY="sk-live-abc123" \
myapp
Or with an env file:
docker run -d --env-file .env.production myapp
These are visible via docker inspect on the running container, but they're not baked into the image. Anyone who pulls the image doesn't get the secrets.
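Since runtime secrets arrive through the environment, it's worth failing fast at startup when one is missing, rather than crashing later on the first real request. A minimal sketch; the variable names and the validateEnv helper are examples of mine, not from any library:

```javascript
// Fail fast at container startup if required runtime config is missing.
// REQUIRED_VARS and validateEnv are example names, not a standard API.
const REQUIRED_VARS = ["DATABASE_URL", "API_KEY"];

function validateEnv(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Crashing here makes the container exit immediately,
    // which is far easier to diagnose than a mid-request failure.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  // Return only the declared variables as a plain config object
  return Object.fromEntries(REQUIRED_VARS.map((name) => [name, env[name]]));
}
```

Call it once at the top of server.js, before the app starts listening.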
3. Docker secrets (Swarm / Kubernetes)
For proper secret management in orchestrated environments:
# docker-compose.yml (Swarm mode)
version: "3.8"
services:
api:
image: myapp:latest
secrets:
- db_password
- api_key
secrets:
db_password:
external: true
api_key:
external: true
Docker mounts secrets as files at /run/secrets/<secret_name>. Your application reads them from the filesystem:
import { readFileSync } from "fs";
function getSecret(name) {
try {
return readFileSync(`/run/secrets/${name}`, "utf8").trim();
} catch {
// Fall back to environment variable for local development
return process.env[name.toUpperCase()];
}
}
const dbPassword = getSecret("db_password");
This is the most secure approach because secrets never appear in environment variables, process listings, or container inspection output.
.env Files and Docker#
Never include .env files in your Docker image. Your .dockerignore should exclude them (which is why we listed .env and .env.* earlier). For local development with docker-compose, mount them at runtime:
services:
api:
env_file:
- .env.local
Health Checks: Let Docker Know Your App is Actually Working#
A health check tells Docker whether your application is functioning correctly. Without one, Docker only knows if the process is running — not if it's actually able to handle requests.
The HEALTHCHECK Instruction#
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"
Let me break down the parameters:
- --interval=30s: Check every 30 seconds
- --timeout=10s: If the check takes longer than 10 seconds, consider it failed
- --start-period=40s: Give the app 40 seconds to start before counting failures
- --retries=3: Mark unhealthy after 3 consecutive failures
Why Not Use curl?#
Alpine doesn't include curl by default. You could install it (apk add --no-cache curl), but that adds another binary to your minimal image. Using Node.js directly means zero additional dependencies.
For even lighter health checks, you can use a dedicated script:
// healthcheck.js
const http = require("http");
const options = {
hostname: "localhost",
port: 3000,
path: "/health",
timeout: 5000,
};
const req = http.request(options, (res) => {
process.exit(res.statusCode === 200 ? 0 : 1);
});
req.on("error", () => process.exit(1));
req.on("timeout", () => {
req.destroy();
process.exit(1);
});
req.end();
COPY --chown=node:node healthcheck.js ./
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD ["node", "healthcheck.js"]
The Health Endpoint#
Your application needs a /health endpoint for the check to hit. Don't just return 200 — actually verify your app is healthy:
app.get("/health", async (req, res) => {
const checks = {
uptime: process.uptime(),
timestamp: Date.now(),
status: "ok",
};
try {
// Check database connection
await db.query("SELECT 1");
checks.database = "connected";
} catch (err) {
checks.database = "disconnected";
checks.status = "degraded";
}
try {
// Check Redis connection
await redis.ping();
checks.redis = "connected";
} catch (err) {
checks.redis = "disconnected";
checks.status = "degraded";
}
const statusCode = checks.status === "ok" ? 200 : 503;
res.status(statusCode).json(checks);
});
A "degraded" status with a 503 tells the orchestrator to stop routing traffic to this instance while it recovers, but doesn't necessarily trigger a restart.
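One subtlety: if the database hangs instead of erroring, `await db.query("SELECT 1")` can stall the /health handler past the HEALTHCHECK timeout. A small guard helps; `withTimeout` is a helper name of my choosing, not an Express or Docker API:

```javascript
// Race a dependency check against a deadline so a hung connection
// fails the health check cleanly instead of stalling it.
function withTimeout(promise, ms, label) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} check timed out after ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}
```

In the handler you would write `await withTimeout(db.query("SELECT 1"), 2000, "database")` and treat a rejection the same as a "disconnected" result.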
Why Health Checks Matter for Orchestrators#
Docker Swarm, Kubernetes, and even plain docker-compose with restart: always use health checks to make decisions:
- Load balancers stop sending traffic to unhealthy containers
- Rolling updates wait for the new container to be healthy before stopping the old one
- Orchestrators can restart containers that become unhealthy
- Deployment pipelines can verify a deployment succeeded
Without health checks, a rolling deployment might kill the old container before the new one is ready, causing downtime.
docker-compose for Development#
Your development environment should be as close to production as possible, but with the convenience of hot reload, debuggers, and instant feedback. Here's the docker-compose setup I use for development:
# docker-compose.dev.yml
services:
app:
build:
context: .
dockerfile: Dockerfile.dev
args:
NODE_VERSION: "20"
ports:
- "3000:3000"
- "9229:9229" # Node.js debugger
volumes:
# Mount source code for hot reload
- .:/app
# Anonymous volume to preserve node_modules from the image
# This prevents the host's node_modules from overriding the container's
- /app/node_modules
environment:
- NODE_ENV=development
- DATABASE_URL=postgres://postgres:devpassword@db:5432/myapp_dev
- REDIS_URL=redis://redis:6379
env_file:
- .env.local
depends_on:
db:
condition: service_healthy
redis:
condition: service_started
command: npm run dev
db:
image: postgres:16-alpine
ports:
- "5432:5432"
environment:
POSTGRES_DB: myapp_dev
POSTGRES_USER: postgres
POSTGRES_PASSWORD: devpassword
volumes:
# Named volume for persistent data across container restarts
- pgdata:/var/lib/postgresql/data
# Initialization scripts
- ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redisdata:/data
command: redis-server --appendonly yes
# Optional: database admin UI
adminer:
image: adminer:latest
ports:
- "8080:8080"
depends_on:
- db
volumes:
pgdata:
redisdata:
Key Development Patterns#
Volume mounts for hot reload: The .:/app volume mount maps your local source code into the container. When you save a file, the change is immediately visible inside the container. Combined with a dev server that watches for changes (like nodemon or tsx --watch), you get instant feedback.
The node_modules trick: The anonymous volume - /app/node_modules ensures the container uses its own node_modules (installed during the image build) rather than your host's node_modules. This is critical because native modules compiled on macOS won't work inside a Linux container.
Service dependencies: depends_on with condition: service_healthy ensures the database is actually ready before your app tries to connect. Without the health check condition, depends_on only waits for the container to start — not for the service inside it to be ready.
Named volumes: pgdata and redisdata persist across container restarts. Without named volumes, you'd lose your database every time you run docker-compose down.
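Even with condition: service_healthy, the database can restart while the app is already running, so I also retry the initial connection inside the app. A sketch, assuming connect is whatever async function opens your client (pg, ioredis, and so on):

```javascript
// Retry an async connect function with a fixed delay between attempts.
// Works with any client whose connect call throws until the service is up.
async function connectWithRetry(connect, { retries = 5, delayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await connect();
    } catch (err) {
      if (attempt === retries) throw err; // out of attempts, propagate
      console.warn(`Connect attempt ${attempt} failed, retrying in ${delayMs}ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Usage would look like `const db = await connectWithRetry(() => pool.connect())`, keeping startup resilient to a slow or restarting database.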
The Development Dockerfile#
Your development Dockerfile is simpler than production:
# Dockerfile.dev
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
WORKDIR /app
# Install all dependencies (including devDependencies)
COPY package*.json ./
RUN npm ci
# Source code is mounted via volume, not copied
# But we still need it for the initial build
COPY . .
EXPOSE 3000 9229
CMD ["npm", "run", "dev"]
No multi-stage build, no production optimization. The goal is fast iteration, not small images.
Production Docker Compose#
Production docker-compose is a different beast. Here's what I use:
# docker-compose.prod.yml
services:
app:
image: ghcr.io/yourorg/myapp:${TAG:-latest}
restart: unless-stopped
ports:
- "3000:3000"
environment:
- NODE_ENV=production
env_file:
- .env.production
deploy:
resources:
limits:
cpus: "1.0"
memory: 512M
reservations:
cpus: "0.25"
memory: 128M
replicas: 2
healthcheck:
test: ["CMD", "node", "-e", "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode === 200 ? 0 : 1))"]
interval: 30s
timeout: 10s
start_period: 40s
retries: 3
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
networks:
- internal
- web
depends_on:
db:
condition: service_healthy
redis:
condition: service_started
db:
image: postgres:16-alpine
restart: unless-stopped
volumes:
- pgdata:/var/lib/postgresql/data
environment:
POSTGRES_DB: myapp
POSTGRES_USER: ${DB_USER}
POSTGRES_PASSWORD: ${DB_PASSWORD}
deploy:
resources:
limits:
cpus: "1.0"
memory: 1G
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
interval: 10s
timeout: 5s
retries: 5
networks:
- internal
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
redis:
image: redis:7-alpine
restart: unless-stopped
command: >
redis-server
--appendonly yes
--maxmemory 256mb
--maxmemory-policy allkeys-lru
volumes:
- redisdata:/data
deploy:
resources:
limits:
cpus: "0.5"
memory: 512M
networks:
- internal
logging:
driver: "json-file"
options:
max-size: "5m"
max-file: "3"
volumes:
pgdata:
driver: local
redisdata:
driver: local
networks:
internal:
driver: bridge
web:
external: true
What's Different From Development#
Restart policy: unless-stopped restarts the container automatically if it crashes, unless you explicitly stopped it. This handles the "3 AM crash" scenario. The alternative always would also restart containers you intentionally stopped, which is usually not what you want.
Resource limits: Without limits, a memory leak in your Node.js app will consume all available RAM on the host, potentially killing other containers or the host itself. Set limits based on your application's actual usage plus some headroom:
# Monitor actual usage to set appropriate limits
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
Logging configuration: Without max-size and max-file, Docker logs grow unbounded. I've seen production servers run out of disk space because of Docker logs. json-file with rotation is the simplest solution. For centralized logging, swap to the fluentd or gelf driver:
logging:
driver: "fluentd"
options:
fluentd-address: "localhost:24224"
tag: "myapp.{{.Name}}"
Network isolation: The internal network is only accessible to services in this compose stack. The database and Redis are not exposed to the host or other containers. Only the app service is connected to the web network, which your reverse proxy (Nginx, Traefik) uses to route traffic.
No port mapping for databases: Notice that db and redis don't have ports in the production config. They're only accessible via the internal Docker network. In development, we expose them so we can use local tools (pgAdmin, Redis Insight). In production, there's no reason for them to be accessible from outside the Docker network.
Next.js Specific: The Standalone Output#
Next.js has a built-in Docker optimization that many people don't know about: standalone output mode. It traces your application's imports and copies only the files needed to run — no node_modules required (dependencies are bundled).
Enable it in next.config.ts:
// next.config.ts
import type { NextConfig } from "next";
const nextConfig: NextConfig = {
output: "standalone",
};
export default nextConfig;
This changes the build output dramatically. Instead of needing the entire node_modules directory, Next.js produces a self-contained server.js in .next/standalone/ that includes only the dependencies it actually uses.
The Next.js Production Dockerfile#
This is the Dockerfile I use for Next.js projects, based on the official Vercel example but with security hardening:
# ============================================
# Stage 1: Install dependencies
# ============================================
FROM node:20-alpine AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
# ============================================
# Stage 2: Build the application
# ============================================
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Disable Next.js telemetry during build
ENV NEXT_TELEMETRY_DISABLED=1
RUN npm run build
# ============================================
# Stage 3: Production runner
# ============================================
FROM node:20-alpine AS runner
WORKDIR /app
RUN apk add --no-cache dumb-init
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
# Non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
# Copy public assets
COPY --from=builder /app/public ./public
# Set up the standalone output directory
# Automatically leverages output traces to reduce image size
# https://nextjs.org/docs/advanced-features/output-file-tracing
RUN mkdir .next
RUN chown nextjs:nodejs .next
# Copy the standalone server and static files
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/api/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "server.js"]
Size Comparison for Next.js#
| Approach | Image Size |
|---|---|
| node:20 + full node_modules + .next | 1.4 GB |
| node:20-alpine + full node_modules + .next | 600 MB |
| node:20-alpine + standalone output | 120 MB |
The standalone output is transformative. A 1.4 GB image becomes 120 MB. Deploys that took 90 seconds to pull now take 10 seconds.
Static File Handling#
Next.js standalone mode doesn't include the public folder or the static assets from .next/static. You need to copy them explicitly (which we do in the Dockerfile above). In production, you typically want a CDN in front of these:
// next.config.ts
const nextConfig: NextConfig = {
output: "standalone",
assetPrefix: process.env.CDN_URL || undefined,
};
If you're not using a CDN, Next.js serves static files directly. The standalone server handles this fine — you just need to make sure the files are in the right place (which our Dockerfile ensures).
Sharp for Image Optimization#
Next.js uses sharp for image optimization. In the Alpine-based production image, you need to make sure the correct binary is available:
# In the runner stage, before switching to non-root user
RUN apk add --no-cache --virtual .sharp-deps vips-devOr better, install it as a production dependency and let npm handle the platform-specific binary:
npm install sharp
The node:20-alpine image works with sharp's prebuilt linux-x64-musl binary. No special configuration needed in most cases.
Image Scanning and Security#
Building a small image with a non-root user is a good start, but it's not enough for serious production workloads. Here's how to go further.
Trivy: Scan Your Images#
Trivy is a comprehensive vulnerability scanner for container images. Run it in your CI pipeline:
# Install trivy
brew install aquasecurity/trivy/trivy # macOS
# or
apt-get install trivy # Debian/Ubuntu
# Scan your image
trivy image myapp:latest
Sample output:
myapp:latest (alpine 3.19.1)
=============================
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
Node.js (node_modules/package-lock.json)
=========================================
Total: 2 (UNKNOWN: 0, LOW: 0, MEDIUM: 1, HIGH: 1, CRITICAL: 0)
┌──────────────┬────────────────┬──────────┬────────┬───────────────┐
│ Library │ Vulnerability │ Severity │ Status │ Fixed Version │
├──────────────┼────────────────┼──────────┼────────┼───────────────┤
│ semver │ CVE-2022-25883 │ HIGH │ fixed │ 7.5.4 │
│ word-wrap │ CVE-2023-26115 │ MEDIUM │ fixed │ 1.2.4 │
└──────────────┴────────────────┴──────────┴────────┴───────────────┘
Integrate it in CI to fail builds on critical vulnerabilities:
# .github/workflows/docker.yml
- name: Build image
run: docker build -t myapp:${{ github.sha }} .
- name: Scan image
uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
exit-code: 1
severity: CRITICAL,HIGH
ignore-unfixed: true
Read-Only Filesystem#
You can run containers with a read-only root filesystem. This prevents an attacker from modifying binaries, installing tools, or writing malicious scripts:
docker run --read-only \
--tmpfs /tmp \
--tmpfs /app/.next/cache \
myapp:latest
The --tmpfs mounts provide writable temporary directories where your application legitimately needs to write (temp files, caches). Everything else is read-only.
In docker-compose:
services:
app:
image: myapp:latest
read_only: true
tmpfs:
- /tmp
- /app/.next/cache
Drop All Capabilities#
Linux capabilities are fine-grained permissions that replace the all-or-nothing root model. By default, Docker containers get a subset of capabilities. You can drop all of them:
docker run --cap-drop=ALL myapp:latest
If your application needs to bind to a port below 1024, you'd need NET_BIND_SERVICE. But since we're using port 3000 with a non-root user, we don't need any capabilities:
services:
app:
image: myapp:latest
cap_drop:
- ALL
security_opt:
- no-new-privileges:true
no-new-privileges prevents the process from gaining additional privileges through setuid/setgid binaries. This is a defense-in-depth measure that costs nothing.
Pin Your Base Image Digest#
Instead of using node:20-alpine (which is a moving target), pin to a specific digest:
FROM node:20-alpine@sha256:abcdef123456...
Get the digest with:
docker inspect --format='{{index .RepoDigests 0}}' node:20-alpine
This makes your base image fully deterministic between builds. The tradeoff is that you don't automatically get security patches to the base image. Use Dependabot or Renovate to automate digest updates:
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: docker
directory: "/"
schedule:
interval: weekly
CI/CD Integration: Putting It All Together#
Here's a complete GitHub Actions workflow that builds, scans, and pushes a Docker image:
# .github/workflows/docker.yml
name: Build and Push Docker Image

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,format=long
            type=ref,event=branch
            type=semver,pattern={{version}}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:sha-${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          exit-code: 1
          severity: CRITICAL,HIGH
          ignore-unfixed: true

      - name: Upload Trivy results
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-results.sarif

BuildKit Cache in CI#
The cache-from: type=gha and cache-to: type=gha,mode=max lines use GitHub Actions cache as a Docker layer cache. This means your CI builds benefit from layer caching across runs. First build takes 5 minutes; subsequent builds with only code changes take 30 seconds.
Common Pitfalls and How to Avoid Them#
The node_modules Inside the Image vs Host Conflict#
If you volume-mount your project directory into a container (-v .:/app), the host's node_modules overrides the container's. Native modules compiled on macOS won't work in Linux. Always use the anonymous volume trick:
volumes:
  - .:/app
  - /app/node_modules # preserves the container's node_modules

Note that anonymous volumes persist across restarts. After changing dependencies, recreate them with docker compose up --renew-anon-volumes.

SIGTERM Handling in TypeScript Projects#
If you're running TypeScript with tsx or ts-node in development, signal handling works normally. But in production, if you're using the compiled JavaScript with node, make sure your compiled output preserves the signal handlers. Some build tools optimize away "unused" code.
Memory Limits and Node.js#
Node.js doesn't automatically respect Docker memory limits. With a 512MB container limit, Node.js still sizes its default heap from the machine's total memory, not the cgroup limit, so the process can blow past the cap and get OOM-killed before V8 ever starts garbage-collecting aggressively. Set the max old space size explicitly:
CMD ["node", "--max-old-space-size=384", "dist/server.js"]

Leave about 25% headroom between the Node.js heap limit and the container memory limit for non-heap memory (buffers, native code, etc.).
Or set it via NODE_OPTIONS so you don't have to modify the CMD:

ENV NODE_OPTIONS="--max-old-space-size=384"

Timezone Issues#
Alpine uses UTC by default. If your application depends on a specific timezone:
RUN apk add --no-cache tzdata
ENV TZ=America/New_York

But better: write timezone-agnostic code. Store everything in UTC. Convert to local time only at the presentation layer.
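For the presentation-layer conversion, the built-in Intl API works regardless of the container's TZ setting; a small sketch:

```typescript
// Store timestamps in UTC and convert only when rendering.
const createdAt = new Date("2024-01-15T17:30:00Z"); // persisted as UTC

const formatted = new Intl.DateTimeFormat("en-US", {
  timeZone: "America/New_York",
  dateStyle: "medium",
  timeStyle: "short",
}).format(createdAt);

console.log(formatted); // 17:30 UTC renders as 12:30 PM Eastern
```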
Build Arguments vs Environment Variables#
ARG is available only during build. It doesn't persist in the final image (unless you copy it to ENV). ENV persists in the image and is available at runtime. Neither is safe for secrets: ARG values used in build steps can be recovered with docker history.
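If you genuinely need a secret during the build itself (a private npm token, say), a BuildKit secret mount keeps it out of every image layer. A sketch; npm_token is a hypothetical secret id:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# The secret is mounted only for this RUN step and never stored in a layer.
# Build with: docker build --secret id=npm_token,env=NPM_TOKEN .
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci
```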
# Build-time configuration
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
# Runtime configuration
ENV PORT=3000
# WRONG: This makes the secret visible in the image
ARG API_KEY
ENV API_KEY=${API_KEY}
# RIGHT: Pass secrets at runtime
# docker run -e API_KEY=secret myapp

Monitoring in Production#
Your Docker setup isn't complete without observability. Here's a minimal but effective monitoring stack:
# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - internal

  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    networks:
      - internal

networks:
  internal:

volumes:
  prometheus_data:
  grafana_data:

Expose metrics from your Node.js app using prom-client:
import express from "express";
import { collectDefaultMetrics, Registry, Histogram } from "prom-client";

const app = express();

const register = new Registry();
collectDefaultMetrics({ register });

const httpRequestDuration = new Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds",
  labelNames: ["method", "route", "status_code"],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5],
  registers: [register],
});

// Middleware
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();
  res.on("finish", () => {
    end({ method: req.method, route: req.route?.path || req.path, status_code: res.statusCode });
  });
  next();
});

// Metrics endpoint
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

The Checklist#
Before you ship a containerized Node.js app to production, verify:
- Non-root user — Container runs as a non-root user
- Multi-stage build — devDependencies and build tools are not in the final image
- Alpine base — Using a minimal base image
- .dockerignore — .git, .env, node_modules, and tests excluded
- Layer caching — package.json copied before source code
- Health check — HEALTHCHECK instruction in the Dockerfile
- Signal handling — dumb-init or --init for proper SIGTERM handling
- No secrets in image — No ENV with sensitive values in the Dockerfile
- Resource limits — Memory and CPU limits set in compose/orchestrator
- Log rotation — Logging driver configured with a max size
- Image scanning — Trivy or equivalent in the CI pipeline
- Pinned versions — Base image and dependency versions pinned
- Memory limits — --max-old-space-size set for the Node.js heap
Most of these are one-time setup. Do it once, template it, and every new project starts with a production-ready container from day one.
Docker isn't complicated. But the gap between a "working" Dockerfile and a production-ready one is wider than most people think. The patterns in this guide close that gap. Use them, adapt them, and stop deploying root containers with 1GB images and no health checks. Your future self — the one getting paged at 3 AM — will thank you.