Docker for Node.js: The Production-Ready Setup Nobody Tells You About

Most production Node.js Dockerfiles are bad. Not "slightly suboptimal" bad. I mean running as root, 600MB images with devDependencies baked in, no health checks, and secrets hardcoded in environment variables that anyone can read with docker inspect.

I know because I wrote those Dockerfiles. For years. They worked, so I never questioned them. Then one day a security audit flagged that our container's PID 1 was running as root with write access to the entire filesystem, and I realized that "works" and "production-ready" are very different bars.

This is the Docker setup I now use for every Node.js project. It is not theoretical: it runs the services behind this site and several others. Every pattern is here because I either got burned by the alternative or watched someone else get burned.

Your Current Dockerfile Is Probably Wrong#

Let me guess what your Dockerfile looks like:

dockerfile

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

This is the "hello world" of Dockerfiles. It works. But it has at least five problems that will hurt you in production.

Running as Root#

By default, the node Docker image runs as root. That means your application process has root privileges inside the container. If someone exploits a vulnerability in your app (a path traversal bug, an SSRF, a backdoored dependency), they get root access to the container filesystem, can modify binaries and install packages, and, depending on the container runtime configuration, may be able to escalate further.

"But containers are isolated!" Partially. Container escapes are real. CVE-2024-21626 and CVE-2019-5736 are real-world container breakouts. Running as non-root is a defense-in-depth measure. It costs nothing and shuts down an entire class of attacks.

Installing devDependencies in Production#

npm install with no flags installs everything. Test runners, linters, build tools, type checkers: all of it sits in the production image. That bloats the image by hundreds of megabytes and widens the attack surface. Every additional package is one more potential vulnerability for Trivy or Snyk to flag.

COPYing Everything#

COPY . . copies the entire project directory into the image. Without a .dockerignore, that includes .git (which can be enormous), .env files (which contain secrets), node_modules (which gets reinstalled anyway), test files, documentation, CI configs, everything.

No Health Checks#

Without a HEALTHCHECK instruction, Docker has no idea whether the application is actually serving traffic. The process may be running but deadlocked, starved of memory, or stuck in an infinite loop. Docker still reports the container as "running" because the process has not exited, and the load balancer keeps sending traffic to a zombie container.

No Layer Caching Strategy#

Copying everything before installing dependencies means a one-line source change invalidates the npm install cache. Every build reinstalls all dependencies from scratch. On a project with heavy dependencies, that is 2-3 minutes wasted per build.

Let's fix all of this.

Multi-Stage Builds: The Biggest Win#

Multi-stage builds are the most impactful change you can make to a Node.js Dockerfile. The concept is simple: build the application in one stage, then copy only the artifacts you need into a clean, minimal final image.

Here is the difference in practice:

dockerfile

# Single stage: ~600MB
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "server.js"]
 
# Multi-stage: ~150MB
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
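RUN npm run build
# Drop devDependencies so only production packages are copied to the runner
RUN npm prune --omit=dev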
 
FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
CMD ["node", "dist/server.js"]

The builder stage has everything: full Node.js, npm, build tools, source code, devDependencies. The runner stage has only what is needed at runtime. The builder stage is discarded entirely and never ends up in the final image.

Real Size Comparisons#

I measured these on an actual Express.js API project with about 40 dependencies:

Approach	Image Size
`node:20` + `npm install`	1.1 GB
`node:20-slim` + `npm install`	420 MB
`node:20-alpine` + `npm ci`	280 MB
Multi-stage + alpine + production deps only	150 MB
Multi-stage + alpine + pruned deps	95 MB

That is roughly a 10x reduction from the naive approach. Smaller images mean faster pulls, faster deployments, and a smaller attack surface.

Why Alpine?#

Alpine Linux uses musl libc instead of glibc and ships without a package manager cache, documentation, or most of the utilities found in a standard Linux distribution. The base node:20-alpine image is about 50MB, compared with about 350MB for node:20-slim and over 1GB for the full node:20. Exact figures vary by tag and by whether you measure compressed or uncompressed size.

The tradeoff is that some npm packages with native bindings (bcrypt, sharp, canvas) have to be compiled against musl. In most cases this just works, because npm downloads the correct prebuilt binary. If you do hit issues, install the build dependencies in the builder stage:

dockerfile

FROM node:20-alpine AS builder
RUN apk add --no-cache python3 make g++
# ... rest of the build

These build tools exist only in the builder stage. They are not in the final image.

Complete Production Dockerfile#

This is the Dockerfile I use as the starting point for every Node.js project. Every line is intentional.

dockerfile

# ============================================
# Stage 1: Install dependencies
# ============================================
FROM node:20-alpine AS deps

# Create the working directory before doing anything else
WORKDIR /app

# Install dependencies based on the lockfile
# Copy only the package files first: critical for layer caching
COPY package.json package-lock.json ./

# npm ci beats npm install: faster, stricter, and reproducible
# --omit=dev excludes devDependencies from this stage
RUN npm ci --omit=dev

# ============================================
# Stage 2: Build the application
# ============================================
FROM node:20-alpine AS builder

WORKDIR /app

# Copy the package files and install all dependencies (dev included)
COPY package.json package-lock.json ./
RUN npm ci

# Now copy the source code; changes here do not invalidate the npm ci cache
COPY . .

# Build the application (TypeScript compile, Next.js build, etc.)
RUN npm run build

# ============================================
# Stage 3: Production runner
# ============================================
FROM node:20-alpine AS runner

# Add labels for image metadata
LABEL maintainer="your-email@example.com"
LABEL org.opencontainers.image.source="https://github.com/yourorg/yourrepo"

# Install dumb-init for proper PID 1 signal handling
RUN apk add --no-cache dumb-init

# Set NODE_ENV before anything else runs
ENV NODE_ENV=production

# Security: use a non-root user
# The node image already ships a 'node' user (uid 1000)
USER node

# Set the working directory; files are copied in below with --chown=node:node
WORKDIR /app

# Copy production dependencies from the deps stage
COPY --from=deps --chown=node:node /app/node_modules ./node_modules

# Copy package.json and the built application
COPY --from=deps --chown=node:node /app/package.json ./
COPY --from=builder --chown=node:node /app/dist ./dist

# Expose the port (documentation only; this does not publish it)
EXPOSE 3000

# Health check: curl is not available on alpine, so use node
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"

# Run dumb-init as PID 1 so signals are handled properly
ENTRYPOINT ["dumb-init", "--"]

# Start the application
CMD ["node", "dist/server.js"]

Let me explain the parts that are not obvious.

Why Three Stages Instead of Two?#

The deps stage installs only production dependencies. The builder stage installs everything (devDependencies included) and builds the app. The runner stage copies production deps from deps and built code from builder.

Why not install production deps in the builder stage? Because the builder stage has devDependencies mixed in. Running npm prune --omit=dev (formerly npm prune --production) after the build is slower and less reliable than a clean production install.

Why dumb-init?#

When you run node server.js in a container, Node.js becomes PID 1. PID 1 is special on Linux: the kernel does not apply the default signal actions to it, so a signal without an explicit handler is ignored. If you send SIGTERM to the container (which is what docker stop does) and the app registers no handler, Node.js running as PID 1 simply does not react. Docker waits 10 seconds, then sends SIGKILL, which terminates the process immediately with no cleanup: no graceful shutdown, no closed database connections, no finished in-flight requests.

dumb-init runs as PID 1 and forwards signals properly. The Node.js process receives SIGTERM as expected and can shut down gracefully:

javascript

// server.js
const server = app.listen(3000);
 
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  server.close(() => {
    console.log('HTTP server closed');
    // Close database connections, flush logs, etc.
    process.exit(0);
  });
});

The alternative is the --init flag on docker run, but baking it into the image means it works no matter how the container is started.
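
One gap in the handler above: server.close() waits for open connections, and long-lived keep-alive connections can hold it open past Docker's 10-second grace period. Here is a minimal sketch of a shutdown helper with a hard deadline. The names createShutdown, timeoutMs, and the injected exit function are my own illustrative choices, not part of Node.js or Docker, and the 8-second default is just an assumption that leaves some margin before SIGKILL.

```javascript
// shutdown.js
// Sketch only: createShutdown, timeoutMs and exit are hypothetical names.
function createShutdown(server, { timeoutMs = 8000, exit = process.exit } = {}) {
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`${signal} received, shutting down gracefully`);

    // Force an exit if connections do not drain before Docker sends SIGKILL.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // do not keep the event loop alive just for this timer

    // Stop accepting new connections; the callback fires once existing ones end.
    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });
  };
}

module.exports = { createShutdown };
```

Wiring it up would look like `const shutdown = createShutdown(server); process.on("SIGTERM", () => shutdown("SIGTERM"));`. Injecting exit keeps the helper testable without killing the test process.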

.dockerignore File#

This file is as important as the Dockerfile itself. Without it, docker build sends everything to the Docker daemon as build context, and COPY . . copies all of it into the image:

# .dockerignore
node_modules
npm-debug.log*
.git
.gitignore
.env
.env.*
!.env.example
Dockerfile
docker-compose*.yml
.dockerignore
README.md
LICENSE
.github
.vscode
.idea
coverage
.nyc_output
**/*.test.ts
**/*.test.js
**/*.spec.ts
**/*.spec.js
**/__tests__
test
tests
docs
.husky
.eslintrc*
.prettierrc*
# tsconfig.json is deliberately not listed: the builder stage needs it for npm run build
jest.config.*
vitest.config.*

Every file listed in .dockerignore is a file that is not sent in the build context, does not end up in the image, and does not invalidate the layer cache when it changes.

Layer Caching: Stop Waiting 3 Minutes per Build#

Docker builds images in layers. Each instruction creates a layer. If a layer has not changed, Docker reuses the cached version. The critical detail: once any layer changes, every layer after it is invalidated.

That is why the order of instructions matters so much.

Wrong Order#

dockerfile

COPY . .
RUN npm ci

Every time you change any file (a single line in a single source file), Docker sees that the COPY . . layer changed. That layer and everything after it is invalidated, including npm ci. All dependencies are reinstalled on every code change.

Right Order#

dockerfile

COPY package.json package-lock.json ./
RUN npm ci
COPY . .

Now npm ci runs only when package.json or package-lock.json changes. If only source code changed, Docker reuses the cached npm ci layer. On a project with 500+ dependencies, that saves 60-120 seconds per build.
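
The invalidation rule is easier to see with a toy model. The sketch below is not how Docker computes cache keys (the real builder hashes instruction text and file contents); it only illustrates that the first changed layer forces every later layer to rebuild. layersToRebuild and the inputHash values are hypothetical.

```javascript
// Toy model: each layer is { instruction, inputHash }.
// Returns the instructions that must re-run after a change.
function layersToRebuild(previous, next) {
  const firstChanged = next.findIndex((layer, i) => {
    const old = previous[i];
    return !old || old.instruction !== layer.instruction || old.inputHash !== layer.inputHash;
  });
  // Everything from the first changed layer onward loses its cache.
  return firstChanged === -1 ? [] : next.slice(firstChanged).map((l) => l.instruction);
}

// Wrong order: source is copied before dependencies are installed.
const wrongBefore = [
  { instruction: "COPY . .", inputHash: "src-v1" },
  { instruction: "RUN npm ci", inputHash: "" },
];
const wrongAfter = [
  { instruction: "COPY . .", inputHash: "src-v2" }, // one source line changed
  { instruction: "RUN npm ci", inputHash: "" },
];

// Right order: package files first, source last.
const rightBefore = [
  { instruction: "COPY package.json package-lock.json ./", inputHash: "lock-v1" },
  { instruction: "RUN npm ci", inputHash: "" },
  { instruction: "COPY . .", inputHash: "src-v1" },
];
const rightAfter = [
  { instruction: "COPY package.json package-lock.json ./", inputHash: "lock-v1" },
  { instruction: "RUN npm ci", inputHash: "" },
  { instruction: "COPY . .", inputHash: "src-v2" }, // same one-line change
];

console.log(layersToRebuild(wrongBefore, wrongAfter)); // [ 'COPY . .', 'RUN npm ci' ]
console.log(layersToRebuild(rightBefore, rightAfter)); // [ 'COPY . .' ]
```

With the wrong order the one-line change re-runs npm ci; with the right order only the final COPY is rebuilt.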

Cache Mounts for npm#

Docker BuildKit supports cache mounts that persist the npm cache between builds:

dockerfile

RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

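This keeps the npm download cache between builds. If a dependency was already downloaded in an earlier build, npm uses the cached copy instead of downloading it again. The target path assumes the build runs as root, whose npm cache lives in /root/.npm. It is especially useful in CI, where builds are frequent, as long as the builder's cache survives between runs.

BuildKit is the default builder in Docker Engine 23.0 and later. On older versions, enable it with an environment variable: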

bash

DOCKER_BUILDKIT=1 docker build -t myapp .

Or enable it in the Docker daemon configuration (/etc/docker/daemon.json on Linux):

json

{
  "features": {
    "buildkit": true
  }
}

Cache Busting with ARG#

Sometimes you need to force a layer to rebuild. For example, if you are pulling a latest tag from a registry and want to be sure you get the newest version:

dockerfile

ARG CACHE_BUST=1
RUN npm ci

To bust the cache, build with a unique value:

bash

docker build --build-arg CACHE_BUST=$(date +%s) -t myapp .

Use this sparingly. The whole point of caching is speed, so bust the cache only when you have a reason.

Secrets Management: Stop Putting Secrets in Your Dockerfile#

This is one of the most common and most dangerous mistakes. I see it constantly:

dockerfile

# Never do this
ENV DATABASE_URL=postgres://user:password@db:5432/myapp
ENV API_KEY=sk-live-abc123def456

Environment variables set with ENV in a Dockerfile are baked into the image. Anyone who pulls the image can see them with docker inspect or docker history. They are visible in every layer after they are set. Even if you unset them later, they still exist in the layer history.

Three Levels of Secrets#

1. Build-time secrets (Docker BuildKit)

If you need secrets during the build (a private npm registry token, for example), use BuildKit's --secret flag:

dockerfile

# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
 
# Mount the secret at build time; it is never stored in the image
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc \
    npm ci
 
COPY . .
RUN npm run build

Build with:

bash

docker build --secret id=npmrc,src=$HOME/.npmrc -t myapp .

The .npmrc file is available during the RUN command but is never committed to an image layer. It does not appear in docker history or docker inspect.

2. Runtime secrets via environment variables

Pass the secrets your application needs at runtime when you start the container:

bash

docker run -d \
  -e DATABASE_URL="postgres://user:pass@db:5432/myapp" \
  -e API_KEY="sk-live-abc123" \
  myapp

Or with an env file:

bash

docker run -d --env-file .env.production myapp

These are visible via docker inspect on the running container, but they are not baked into the image. Someone who pulls the image does not get the secrets.

3. Docker secrets (Swarm / Kubernetes)

For proper secret management in orchestrated environments:

yaml

# docker-compose.yml (Swarm mode)
version: "3.8"
services:
  api:
    image: myapp:latest
    secrets:
      - db_password
      - api_key
 
secrets:
  db_password:
    external: true
  api_key:
    external: true

Docker mounts secrets as files at /run/secrets/<secret_name>. The application reads them from the filesystem:

javascript

import { readFileSync } from "fs";
 
function getSecret(name) {
  try {
    return readFileSync(`/run/secrets/${name}`, "utf8").trim();
  } catch {
    // Fall back to an environment variable for local development
    return process.env[name.toUpperCase()];
  }
}
 
const dbPassword = getSecret("db_password");

This is the most secure of the three approaches because the secrets never appear in environment variables, process listings, or container inspection output.
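
A related convention, used by some official images (the postgres image accepts POSTGRES_PASSWORD_FILE, for example), is to pass the path of the secret file in a NAME_FILE variable, so only the path ever shows up in the environment. Here is a sketch of that idea for your own app. resolveSecret is a hypothetical helper, and the DB_PASSWORD and DB_PASSWORD_FILE names in the usage note are assumptions for illustration.

```javascript
const fs = require("node:fs");

// Hypothetical helper: resolve NAME from NAME_FILE (a file path) if set,
// otherwise fall back to NAME (a plain value, e.g. for local development).
function resolveSecret(name, env = process.env) {
  const filePath = env[`${name}_FILE`];
  if (filePath) {
    // Secret files usually end with a newline; trim it off.
    return fs.readFileSync(filePath, "utf8").trim();
  }
  return env[name];
}

module.exports = { resolveSecret };
```

In a Swarm service you would set DB_PASSWORD_FILE=/run/secrets/db_password and call `resolveSecret("DB_PASSWORD")`. Locally, a plain DB_PASSWORD variable keeps working.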

.env Files and Docker#

Never include .env files in a Docker image. Your .dockerignore should exclude them (which is why we listed .env and .env.* earlier). For local development with docker-compose, load them at runtime:

yaml

services:
  api:
    env_file:
      - .env.local
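
However the variables arrive (-e, --env-file, or env_file), a missing one tends to surface later as a confusing error deep inside the app. A small startup check makes the container fail fast instead. This is a sketch: loadConfig is a hypothetical helper, and DATABASE_URL, API_KEY, and PORT are simply the names used in the examples in this article.

```javascript
// config.js
// Sketch only: validate required environment variables once, at startup.
function loadConfig(env = process.env, required = ["DATABASE_URL", "API_KEY"]) {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report variable names only, never their values.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return Object.freeze({
    databaseUrl: env.DATABASE_URL,
    apiKey: env.API_KEY,
    port: Number(env.PORT ?? 3000),
  });
}

module.exports = { loadConfig };
```

Calling `loadConfig()` at the top of server.js means a misconfigured container exits immediately, which a restart policy and the health check will both make visible.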

Health Checks: Tell Docker Your App Actually Works#

A health check tells Docker whether the application is functioning correctly. Without one, Docker knows only that the process is running, not whether it can actually handle requests.

HEALTHCHECK Instruction#

dockerfile

HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"

Parameter breakdown:

--interval=30s: run the check every 30 seconds
--timeout=10s: treat the check as failed if it takes longer than 10 seconds
--start-period=40s: give the app 40 seconds to start before failures count
--retries=3: mark the container unhealthy after 3 consecutive failures
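
These numbers also determine how long a broken app can keep receiving traffic. Docker starts each check interval seconds after the previous one finishes, so in the worst case every failing check runs until its timeout. The arithmetic below is a rough upper bound under that assumption (it ignores the start period); worstCaseDetectionSeconds is just an illustrative name.

```javascript
// Upper bound on the time between the app breaking and Docker marking the
// container unhealthy, assuming each failing check hits the timeout.
function worstCaseDetectionSeconds({ interval, timeout, retries }) {
  return retries * (interval + timeout);
}

console.log(worstCaseDetectionSeconds({ interval: 30, timeout: 10, retries: 3 })); // 120
```

With the settings above, that is up to two minutes. If that is too slow for your service, lower the interval before you lower the retries.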

Why Not curl?#

Alpine does not ship curl by default. You can install it (apk add --no-cache curl), but that adds one more binary to a minimal image. Using Node.js directly needs zero additional dependencies.

You can also move the check into a dedicated script, which replaces the inline one-liner and adds its own request timeout:

javascript

// healthcheck.js
const http = require("http");
 
const options = {
  hostname: "localhost",
  port: 3000,
  path: "/health",
  timeout: 5000,
};
 
const req = http.request(options, (res) => {
  process.exit(res.statusCode === 200 ? 0 : 1);
});
 
req.on("error", () => process.exit(1));
req.on("timeout", () => {
  req.destroy();
  process.exit(1);
});
 
req.end();

dockerfile

COPY --chown=node:node healthcheck.js ./
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD ["node", "healthcheck.js"]

Health Endpoint#

The application needs a /health endpoint for the health check to hit. Do not just return 200; actually verify that the app is healthy:

javascript

app.get("/health", async (req, res) => {
  const checks = {
    uptime: process.uptime(),
    timestamp: Date.now(),
    status: "ok",
  };
 
  try {
    // Check the database connection
    await db.query("SELECT 1");
    checks.database = "connected";
  } catch (err) {
    checks.database = "disconnected";
    checks.status = "degraded";
  }
 
  try {
    // Check the Redis connection
    await redis.ping();
    checks.redis = "connected";
  } catch (err) {
    checks.redis = "disconnected";
    checks.status = "degraded";
  }
 
  const statusCode = checks.status === "ok" ? 200 : 503;
  res.status(statusCode).json(checks);
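});

Returning 503 with a "degraded" status tells the orchestrator to stop routing traffic to this instance until it recovers. Whether that also triggers a restart depends on the orchestrator: a Kubernetes readiness probe only takes the pod out of rotation, while Docker Swarm replaces a task whose HEALTHCHECK reports unhealthy.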

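One thing to watch in an endpoint like this: if the database hangs instead of failing, the /health request hangs too, and the check only fails once Docker's own timeout expires. Bounding each dependency check keeps the endpoint responsive. The sketch below uses a hypothetical withTimeout helper. The 2-second budget in the usage note is an assumption, and db stands in for whatever client your app already uses.

```javascript
// Hypothetical helper: reject if check() does not settle within ms milliseconds.
function withTimeout(check, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} check timed out after ${ms}ms`)), ms);
  });
  // Promise.resolve().then(check) also turns a synchronous throw into a rejection.
  return Promise.race([Promise.resolve().then(check), timeout])
    .finally(() => clearTimeout(timer));
}

module.exports = { withTimeout };
```

Inside the handler, `await withTimeout(() => db.query("SELECT 1"), 2000, "database")` replaces the bare query, and the existing try/catch marks the dependency as disconnected when the timeout fires.
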
Why Health Checks Matter for Orchestrators#

Docker Swarm, Kubernetes, and even plain docker-compose make decisions based on health checks:

Load balancers stop sending traffic to unhealthy containers
Rolling updates wait for the new container to become healthy before stopping the old one
Orchestrators can restart containers that become unhealthy (Swarm and Kubernetes do; plain Docker restart policies such as restart: always react only when the process exits)
Deployment pipelines can verify that a deployment succeeded

Without health checks, a rolling deployment can kill the old container before the new one is ready, which causes downtime.

docker-compose for Development#

Your development environment should be as close to production as possible, but with the convenience of hot reload, debuggers, and instant feedback. Here is a docker-compose setup for development:

yaml

# docker-compose.dev.yml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile.dev
      args:
        NODE_VERSION: "20"
    ports:
      - "3000:3000"
      - "9229:9229"   # Node.js debugger
    volumes:
      # Mount the source code for hot reload
      - .:/app
      # Anonymous volume that preserves the image's node_modules
      # and stops the host's node_modules from overriding the container's
      - /app/node_modules
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgres://postgres:devpassword@db:5432/myapp_dev
      - REDIS_URL=redis://redis:6379
    env_file:
      - .env.local
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    command: npm run dev
 
  db:
    image: postgres:16-alpine
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: myapp_dev
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: devpassword
    volumes:
      # Named volume so data persists across container restarts
      - pgdata:/var/lib/postgresql/data
      # Initialization scripts
      - ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
 
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redisdata:/data
    command: redis-server --appendonly yes
 
  # Optional: database admin UI
  adminer:
    image: adminer:latest
    ports:
      - "8080:8080"
    depends_on:
      - db
 
volumes:
  pgdata:
  redisdata:

Key Development Patterns#

Volume mounts for hot reload: The .:/app volume mount maps your local source code into the container. Save a file and the change is visible inside the container immediately. Combine it with a dev server that watches for changes (nodemon or tsx --watch) and you get instant feedback.

The node_modules trick: The anonymous volume - /app/node_modules ensures the container uses its own node_modules (installed during the image build) rather than the host's. This is critical because native modules compiled on macOS do not work in a Linux container.

Service dependencies: depends_on with condition: service_healthy ensures the database is actually ready before the app tries to connect. Without the health check condition, depends_on only waits for the container to start, not for the service inside it to be ready.

Named volumes: pgdata and redisdata persist across container restarts. Without named volumes, you would lose the database on every docker-compose down.
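
On service dependencies: service_healthy only covers the first start. If the database restarts later, or you run the image somewhere that has no depends_on, the app still has to cope with a database that is not ready yet. Here is a sketch of retrying the initial connection with exponential backoff. connectWithRetry and its options are hypothetical names, and connect stands for whatever connect call your driver provides.

```javascript
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: retry an async connect() with exponential backoff.
// Makes up to retries + 1 attempts, then rethrows the last error.
async function connectWithRetry(
  connect,
  { retries = 5, baseDelayMs = 500, sleep = defaultSleep } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Wait 500ms, 1s, 2s, ... between attempts (with the default base delay).
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

module.exports = { connectWithRetry };
```

The sleep option exists only so the delay can be replaced in tests. In the app you would call something like `await connectWithRetry(() => db.connect())` before starting the HTTP server.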

Development Dockerfile#

The development Dockerfile is simpler than the production one:

dockerfile

# Dockerfile.dev
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
 
WORKDIR /app
 
# Install all dependencies (devDependencies included)
COPY package*.json ./
RUN npm ci
 
# Source code is mounted from a volume at runtime, not copied,
# but the initial build still needs it
COPY . .
 
EXPOSE 3000 9229
 
CMD ["npm", "run", "dev"]

No multi-stage build, no production optimization. The goal is fast iteration, not small images.

Production Docker Compose#

Production docker-compose is a different beast. Here is my setup:

yaml

# docker-compose.prod.yml
services:
  app:
    image: ghcr.io/yourorg/myapp:${TAG:-latest}
    restart: unless-stopped
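    # No host port mapping: the reverse proxy reaches the app over the web
    # network, and a fixed host port would collide between the two replicas
    expose:
      - "3000"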
    environment:
      - NODE_ENV=production
    env_file:
      - .env.production
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 128M
      replicas: 2
    healthcheck:
      test: ["CMD", "node", "-e", "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode === 200 ? 0 : 1))"]
      interval: 30s
      timeout: 10s
      start_period: 40s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - internal
      - web
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
 
  db:
    image: postgres:16-alpine
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - internal
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
 
  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: >
      redis-server
      --appendonly yes
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    volumes:
      - redisdata:/data
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
    networks:
      - internal
    logging:
      driver: "json-file"
      options:
        max-size: "5m"
        max-file: "3"
 
volumes:
  pgdata:
    driver: local
  redisdata:
    driver: local
 
networks:
  internal:
    driver: bridge
  web:
    external: true

What Differs from Development#

Restart policy: unless-stopped restarts the container automatically if it crashes, unless it was stopped explicitly. That handles the "3 AM crash" scenario. The alternative, always, also brings back containers you stopped on purpose once the Docker daemon restarts, which is usually not what you want.

Resource limits: Without limits, a memory leak in your Node.js app will consume all the RAM on the host and can take down other containers or the host itself. Set limits based on actual usage plus headroom:

bash

# Monitor actual usage to choose appropriate limits
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

Logging configuration: Without max-size and max-file, Docker logs grow without bound. I have seen production servers run out of disk space because of Docker logs. The json-file driver with rotation is the simplest solution; with max-size 10m and max-file 3, each container keeps at most 30MB of logs. For centralized logging, swap in the fluentd or gelf driver:

yaml

logging:
  driver: "fluentd"
  options:
    fluentd-address: "localhost:24224"
    tag: "myapp.{{.Name}}"

Network isolation: The internal network is reachable only by the services in this compose stack. The database and Redis publish no ports and are not attached to any shared network, so other stacks cannot reach them. Only the app service is connected to the web network, which the reverse proxy (Nginx, Traefik) uses to route traffic.

No port mapping for databases: Notice that db and redis have no ports in the production config. They are reachable only over the internal Docker network. In development we expose them so local tools (pgAdmin, Redis Insight) can connect. In production there is no reason for them to be reachable from outside the Docker network.

Next.js Specific: Standalone Output#

Next.js has a built-in Docker optimization that many people do not know about: standalone output mode. It traces the application's imports and copies only the files needed to run it, including just the node_modules files that are actually used, so the full node_modules directory is not required.

Enable it in next.config.ts:

typescript

// next.config.ts
import type { NextConfig } from "next";
 
const nextConfig: NextConfig = {
  output: "standalone",
};
 
export default nextConfig;

This changes the build output dramatically. Instead of needing the whole node_modules directory, Next.js produces a self-contained .next/standalone/ folder with a minimal server.js and only the dependencies that are actually used.

Next.js Production Dockerfile#

This is the Dockerfile I use for Next.js projects, based on the official Vercel example but with security hardening:

dockerfile

# ============================================
# Stage 1: Install dependencies
# ============================================
FROM node:20-alpine AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
 
COPY package.json package-lock.json ./
RUN npm ci
 
# ============================================
# Stage 2: Build the application
# ============================================
FROM node:20-alpine AS builder
WORKDIR /app
 
COPY --from=deps /app/node_modules ./node_modules
COPY . .
 
# Disable Next.js telemetry during the build
ENV NEXT_TELEMETRY_DISABLED=1
 
RUN npm run build
 
# ============================================
# Stage 3: Production runner
# ============================================
FROM node:20-alpine AS runner
WORKDIR /app
 
RUN apk add --no-cache dumb-init
 
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
 
# Non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
 
# Copy public assets
COPY --from=builder /app/public ./public

# Set up the .next directory, owned by the runtime user
RUN mkdir .next
RUN chown nextjs:nodejs .next

# Copy the standalone server and static files
# (standalone output uses output traces to keep the image small)
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
 
USER nextjs
 
EXPOSE 3000
 
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"
 
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/api/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"
 
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "server.js"]

Next.js Size Comparison#

Approach	Image Size
`node:20` + full `node_modules` + `.next`	1.4 GB
`node:20-alpine` + full `node_modules` + `.next`	600 MB
`node:20-alpine` + standalone output	120 MB

Standalone output is transformative. A 1.4 GB image becomes 120 MB. Deploys that used to take 90 seconds to pull now take 10.

Static File Handling#

Next.js standalone mode does not include static assets from the public folder or .next/static. You have to copy them explicitly (which the Dockerfile above does). In production you typically want a CDN in front of them:

typescript

// next.config.ts
const nextConfig: NextConfig = {
  output: "standalone",
  assetPrefix: process.env.CDN_URL || undefined,
};

If you are not using a CDN, Next.js serves the static files directly. The standalone server handles this fine, as long as the files are in the right place (which our Dockerfile ensures).

Sharp for Image Optimization#

Next.js uses sharp for image optimization. In an Alpine-based production image, the correct binary has to be available:

dockerfile

# In the runner stage, before switching to the non-root user
RUN apk add --no-cache --virtual .sharp-deps vips-dev

Or better, install it as a production dependency and let npm pick the platform-specific binary:

bash

npm install sharp

The node:20-alpine image works with sharp's prebuilt linuxmusl-x64 binary. In most cases no special configuration is needed.

Image Scanning and Security#

Building a small image with a non-root user is a good start, but it is not enough for serious production workloads. Here is how to go further:

Trivy: Scan Your Images#

Trivy is a comprehensive vulnerability scanner for container images. Run it in your CI pipeline:

bash

# Install trivy
brew install aquasecurity/trivy/trivy  # macOS
# or
apt-get install trivy  # Debian/Ubuntu (after adding Aqua Security's apt repository)

# Scan an image
trivy image myapp:latest

Integrate it into CI to fail builds on critical vulnerabilities:

yaml

# .github/workflows/docker.yml
- name: Build image
  run: docker build -t myapp:${{ github.sha }} .
 
- name: Scan image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    exit-code: 1
    severity: CRITICAL,HIGH
    ignore-unfixed: true

Read-Only Filesystem#

Containers can run with a read-only root filesystem. That stops an attacker from modifying binaries, installing tools, or writing malicious scripts:

bash

docker run --read-only \
  --tmpfs /tmp \
  --tmpfs /app/.next/cache \
  myapp:latest

The --tmpfs mounts provide writable temporary directories where the application legitimately needs to write (temp files, caches). Everything else is read-only.

In docker-compose:

yaml

services:
  app:
    image: myapp:latest
    read_only: true
    tmpfs:
      - /tmp
      - /app/.next/cache
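
The usual failure mode with a read-only root filesystem is a forgotten writable path that only breaks at the first write, hours after deploy. A small startup check surfaces that immediately. This is a sketch: findUnwritable is a hypothetical helper, and the directory list in the usage note is simply the two tmpfs paths from the examples above.

```javascript
const fs = require("node:fs");

// Hypothetical startup check: return the directories this process cannot write to.
// A missing directory counts as unwritable too.
function findUnwritable(dirs) {
  return dirs.filter((dir) => {
    try {
      fs.accessSync(dir, fs.constants.W_OK);
      return false;
    } catch {
      return true;
    }
  });
}

module.exports = { findUnwritable };
```

At startup, `findUnwritable(["/tmp", "/app/.next/cache"])` should return an empty array. Anything else means a tmpfs mount is missing from the run command or the compose file.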

Drop All Capabilities#

Linux capabilities are fine-grained permissions that replace the all-or-nothing root model. By default, Docker containers get a subset of capabilities. You can drop them all:

bash

docker run --cap-drop=ALL myapp:latest

If the application needs to bind a port below 1024, it will need NET_BIND_SERVICE. But since we run as a non-root user on port 3000, no capabilities are needed:

yaml

services:
  app:
    image: myapp:latest
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true

no-new-privileges prevents the process from gaining additional privileges through setuid/setgid binaries. It is another defense-in-depth measure that costs nothing.

Pin the Base Image Digest#

Instead of using node:20-alpine (which is a moving target), pin a specific digest:

dockerfile

FROM node:20-alpine@sha256:abcdef123456...

Get the digest (after pulling the image):

bash

docker inspect --format='{{index .RepoDigests 0}}' node:20-alpine

This guarantees that every build starts from exactly the same base image. The tradeoff is that you no longer pick up base image security patches automatically. Use Dependabot or Renovate to automate digest updates:

yaml

# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: docker
    directory: "/"
    schedule:
      interval: weekly

CI/CD Integration: Putting It All Together#

Here is a complete GitHub Actions workflow that builds, scans, and pushes a Docker image:

yaml

# .github/workflows/docker.yml
name: Build and Push Docker Image
 
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
 
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
 
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write
 
    steps:
      - name: Checkout
        uses: actions/checkout@v4
 
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
 
      - name: Log in to Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
 
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,format=long
            type=ref,event=branch
            type=semver,pattern={{version}}
 
      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
 
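      - name: Scan image with Trivy
        # Skipped on pull requests: the image is not pushed there, so there is nothing to pull
        if: github.event_name != 'pull_request'
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:sha-${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          exit-code: 1
          severity: CRITICAL,HIGH
          ignore-unfixed: true
 
      - name: Upload Trivy results
        if: always() && github.event_name != 'pull_request'
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-results.sarif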

BuildKit Cache in CI#

The cache-from: type=gha and cache-to: type=gha,mode=max lines use the GitHub Actions cache as the Docker layer cache, so CI builds reuse layers across runs. mode=max also exports the layers of intermediate stages, which matters for multi-stage builds. The first build takes about 5 minutes; subsequent builds with only code changes take about 30 seconds.

Common Pitfalls and How to Avoid Them#

node_modules in the Image vs. on the Host#

If you volume-mount the project directory into the container (-v .:/app), the host's node_modules hides the container's. Native modules compiled on macOS don't work on Linux. Always use the anonymous volume trick:

yaml

volumes:
  - .:/app
  - /app/node_modules  # preserves the container's node_modules

SIGTERM Handling in TypeScript Projects#

If you run TypeScript with tsx or ts-node in development, signal handling works normally. In production, though, you run the compiled JavaScript with node, so make sure the compiled output preserves your signal handlers. Some build tools optimize away code they consider "unused".
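
For reference, this is the kind of handler that needs to survive the build. It is a minimal sketch rather than the handler used elsewhere in this guide: `installGracefulShutdown` and its `timeoutMs` and `exit` options are hypothetical names, and `exit` is injectable only so the function can be exercised in a test.

```javascript
// Registers SIGTERM/SIGINT handlers that stop accepting new connections,
// wait for in-flight requests to finish, then exit.
function installGracefulShutdown(server, { timeoutMs = 10000, exit = process.exit } = {}) {
  let shuttingDown = false;

  const shutdown = (signal) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`${signal} received, closing server`);

    // Force-exit if open connections do not drain in time.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });
  };

  process.on("SIGTERM", () => shutdown("SIGTERM"));
  process.on("SIGINT", () => shutdown("SIGINT"));
  return shutdown;
}

// Usage with a hypothetical HTTP server:
//   const server = http.createServer(handler).listen(3000);
//   installGracefulShutdown(server);
```

A quick sanity check after building is to search the compiled output for "SIGTERM". If the string is gone, the bundler dropped the handler.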

Memory Limits and Node.js#

Don't count on Node.js to size its heap to the container's memory limit. With a 512MB limit on the container, Node.js may still try to grow the heap to its default size (about 1.5 GB on 64-bit systems, depending on the version), and the kernel kills the container once it crosses the limit. Set the max old space size explicitly:

dockerfile

CMD ["node", "--max-old-space-size=384", "dist/server.js"]

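Leave about 25% headroom between the Node.js heap limit and the container memory limit for non-heap memory (buffers, native code, etc.). With a 512MB limit, that gives the 384MB used above.

Or set the same flag through NODE_OPTIONS, so it applies without changing the CMD: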

dockerfile

ENV NODE_OPTIONS="--max-old-space-size=384"
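
To see what the process actually got, it can compare the cgroup limit with V8's heap limit at startup. The sketch below makes a few assumptions: the helper names are hypothetical, the file paths are the standard cgroup v2 and v1 locations, and the 0.75 factor is just the 25% headroom rule from above.

```javascript
const fs = require("node:fs");
const v8 = require("node:v8");

const MB = 1024 * 1024;

// Reads the container memory limit in bytes, or null if none is set or readable.
function readContainerMemoryLimit() {
  const candidates = [
    "/sys/fs/cgroup/memory.max", // cgroup v2 (contains "max" when unlimited)
    "/sys/fs/cgroup/memory/memory.limit_in_bytes", // cgroup v1
  ];
  for (const file of candidates) {
    try {
      const bytes = Number(fs.readFileSync(file, "utf8").trim());
      // NaN ("max") and absurdly large values both mean "no real limit".
      if (Number.isFinite(bytes) && bytes > 0 && bytes <= Number.MAX_SAFE_INTEGER) {
        return bytes;
      }
    } catch {
      // File missing: different cgroup version, or not running on Linux.
    }
  }
  return null;
}

// Heap size in MB that leaves ~25% of the limit for buffers, native code, etc.
function recommendedOldSpaceMb(limitBytes, heapFraction = 0.75) {
  return Math.floor((limitBytes / MB) * heapFraction);
}

// Snapshot worth logging once at startup.
function describeHeap() {
  const limit = readContainerMemoryLimit();
  return {
    containerLimitMb: limit === null ? null : Math.round(limit / MB),
    v8HeapLimitMb: Math.round(v8.getHeapStatistics().heap_size_limit / MB),
    suggestedOldSpaceMb: limit === null ? null : recommendedOldSpaceMb(limit),
  };
}
```

If `v8HeapLimitMb` comes out larger than `containerLimitMb`, the flag is missing or not being picked up.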

Timezone Issues#

Alpine uses UTC by default. If the application depends on a specific timezone:

dockerfile

RUN apk add --no-cache tzdata
ENV TZ=America/New_York

But better: write timezone-agnostic code. Store everything in UTC. Convert to local time only at the presentation layer.
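
As a rough illustration of that split, the sketch below stores instants as UTC ISO strings and applies a timezone only when formatting. The function names are hypothetical and the output format is just one possible choice.

```javascript
// Persist instants as UTC ISO-8601 strings, regardless of the container's TZ.
function toStoredTimestamp(date) {
  return date.toISOString();
}

// Convert to the viewer's zone only when rendering.
function formatForDisplay(isoString, timeZone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).formatToParts(new Date(isoString));

  const get = (type) => parts.find((part) => part.type === type).value;
  return `${get("year")}-${get("month")}-${get("day")} ${get("hour")}:${get("minute")}`;
}
```

With this approach the timezone becomes a per-request or per-user argument instead of a property of the image, so the same container can serve users in different zones.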

Build Arguments vs. Environment Variables#

ARG is only available during the build. It does not persist into the final image's environment (unless you copy it into ENV), though its value can still show up in docker history.
ENV persists in the image and is available at runtime.

dockerfile

# Build-time configuration
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
 
# Runtime configuration
ENV PORT=3000
 
# Wrong: the secret becomes visible in the image
ARG API_KEY
ENV API_KEY=${API_KEY}
 
# Right: pass secrets at runtime
# docker run -e API_KEY=secret myapp
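
On the application side, runtime configuration is easier to live with when the process fails fast on a missing value. Here is a minimal sketch: `loadConfig` is a hypothetical helper, and PORT and API_KEY are simply the two variables from the example above.

```javascript
// Reads runtime configuration from an env-like object and fails fast
// when a required secret was not injected when the container started.
function loadConfig(env = process.env) {
  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  if (!env.API_KEY) {
    throw new Error("API_KEY is not set; pass it at runtime, e.g. docker run -e API_KEY=...");
  }
  return { port, apiKey: env.API_KEY };
}
```

Calling `loadConfig()` once at the top of the entry point means a container started without its secret crashes immediately, instead of failing on the first request that needs the key.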

Monitoring in Production#

A Docker setup isn't complete without observability. Here's a minimal but effective monitoring stack:

yaml

# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - internal
 
  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    networks:
      - internal
 
volumes:
  prometheus_data:
  grafana_data:

Expose metrics from the Node.js app using prom-client:

javascript

import { collectDefaultMetrics, Registry, Histogram } from "prom-client";
 
const register = new Registry();
collectDefaultMetrics({ register });
 
const httpRequestDuration = new Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds",
  labelNames: ["method", "route", "status_code"],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5],
  registers: [register],
});
 
// Middleware: time every request (`app` is assumed to be your existing Express app)
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();
  res.on("finish", () => {
    end({ method: req.method, route: req.route?.path || req.path, status_code: res.statusCode });
  });
  next();
});
 
// Metrics endpoint
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

Checklist#

Before shipping a containerized Node.js app to production, verify:

Most of these are one-time setup. Do it once, turn it into a template, and every new project starts with a production-ready container from day one.

Docker isn't complicated. But the gap between a "working" Dockerfile and a production-ready one is wider than most people think. The patterns in this guide close that gap. Use them, adapt them, and stop deploying 1GB images that run as root with no health checks. Your future self, the one getting paged at 3 AM, will thank you.
