Kubernetes explained for developers who just want to deploy their apps. No cluster admin deep dives. Pods, services, deployments, config maps, health checks, and the debugging workflow that actually works.
I spent my first three months with Kubernetes convinced it was over-engineered nonsense designed by people who had never actually shipped an application. I had a Node.js API, a Postgres database, and a Redis cache. On a single VPS with Docker Compose, the whole thing deployed in 30 seconds. Then someone decided we needed to be "cloud-native," and suddenly I was drowning in YAML files, fighting with ingress controllers, and googling "CrashLoopBackOff" at 2 AM.
That was four years ago. Today I run multiple production workloads on Kubernetes and I genuinely appreciate what it does. But I also have strong opinions about when it is appropriate and when it is a spectacular waste of engineering time. This post is everything I wish someone had told me on day one — written for application developers, not cluster administrators.
If you are here because your team just adopted Kubernetes and you need to understand it fast, start at the top. If you already know the basics and want the debugging workflow, skip to the kubectl section.
The official explanation involves "container orchestration" and "declarative desired state management." That is technically accurate and practically useless for understanding why you should care.
Here is the actual problem Kubernetes solves: you have containers, and you need them to run reliably across multiple machines without you manually managing each one.
Before Kubernetes, deploying a containerized application to multiple servers meant writing custom scripts. You SSH into a server, pull the new image, stop the old container, start the new one. If you have three servers, that is three SSH sessions. If one fails, you need to detect that and handle it yourself. If traffic spikes and you need more instances, you spin up a new server and repeat the process manually. If a container crashes at 3 AM, it stays down until someone notices and restarts it.
Kubernetes handles all of that. You tell it "I want five copies of this container running at all times" and it makes that happen. If one crashes, it restarts. If a server dies, it reschedules onto healthy servers. If traffic increases, it can scale up automatically. If you push a new version, it rolls out gradually, and if the new version is broken, it rolls back.
That is the value proposition. Not microservices. Not "cloud-native architecture." Not impressing anyone at a conference. Just: reliable, automated container management at scale.
The key question is whether you actually need that. I will come back to this at the end.
Kubernetes has a lot of concepts, but as an app developer you really interact with about six of them. I think of them as layers:

- Pods — the actual running containers
- Deployments — manage pods, handle rollouts and scaling
- Services — stable networking in front of pods
- Ingress — HTTP(S) routing from the outside world
- ConfigMaps — non-sensitive configuration
- Secrets — sensitive configuration

That is it. There are dozens of other resource types, but these are the ones you will touch on a daily basis. Everything else is infrastructure that your platform team or cloud provider manages.
A Pod is the smallest deployable unit in Kubernetes. It wraps one or more containers that share the same network namespace and storage volumes. In practice, most pods contain a single container. You will hear people use "pod" and "container" interchangeably, which is technically wrong but practically fine 95% of the time.
The important thing about pods is that they are ephemeral. Kubernetes can kill and recreate them at any moment. Your pod can be evicted because the node needs resources, because a new version is deploying, because the node is being drained for maintenance, or for a dozen other reasons. If your application cannot handle being stopped and restarted at any time, you will have a bad day.
This is the single most important Kubernetes concept for application developers: pods are cattle, not pets. Do not store state in them. Do not assume they have stable IP addresses. Do not assume they live on a particular server.
Here is what a pod spec looks like in isolation (you will almost never write this directly, but understanding it helps):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-api
  labels:
    app: my-api
spec:
  containers:
    - name: my-api
      image: registry.example.com/my-api:1.2.3
      ports:
        - containerPort: 3000
      env:
        - name: NODE_ENV
          value: "production"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: connection-string
```

A few things to notice. The `labels` field is how Kubernetes connects things together. Services find pods through label selectors. Deployments manage pods through label selectors. Labels are the glue. Get them wrong and nothing connects.
The image field uses a specific tag (1.2.3). More on why this matters later.
Environment variables can be literal values or references to ConfigMaps and Secrets. This is how you keep configuration out of your container image.
You almost never create pods directly. Instead, you create a Deployment, which creates a ReplicaSet, which creates the pods. I know — three levels of indirection for a running container. But this layering is what enables rolling updates, rollbacks, and scaling.
A Deployment is the resource you will interact with most. Here is a complete, production-ready example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: production
  labels:
    app: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: my-api
        version: "1.2.3"
    spec:
      containers:
        - name: my-api
          image: registry.example.com/my-api:1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 30
          env:
            - name: NODE_ENV
              value: "production"
          envFrom:
            - configMapRef:
                name: my-api-config
            - secretRef:
                name: my-api-secrets
      terminationGracePeriodSeconds: 30
```

This is a lot of YAML, so let me break it down section by section.
```yaml
replicas: 3
selector:
  matchLabels:
    app: my-api
```

"Run three copies of this pod." The selector tells the Deployment which pods belong to it. The `matchLabels` must match the labels in the pod template below. If they do not match, Kubernetes rejects the manifest. I have seen teams waste hours debugging because of a label mismatch.
```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1
    maxSurge: 1
```

When you push a new version, Kubernetes does not kill all three pods and start three new ones. That would cause downtime. Instead, it gradually replaces them. `maxUnavailable: 1` means at most one pod can be unavailable during the update. `maxSurge: 1` means it can temporarily create one extra pod (so four total) during the transition. This gives you zero-downtime deployments out of the box.
The defaults are fine for most workloads. If you have a very small replica count (two or less), you might want maxSurge: 1 and maxUnavailable: 0 to ensure you never lose capacity during a deploy.
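The interaction between replicas, `maxUnavailable`, and `maxSurge` is just arithmetic. Here is a minimal sketch of the capacity bounds a rolling update guarantees (plain math, not a real Kubernetes API — `rolloutBounds` is an illustrative name):

```javascript
// Sketch: the capacity bounds a RollingUpdate strategy guarantees during a
// rollout. Plain arithmetic from the Deployment spec, not a Kubernetes API.
function rolloutBounds(replicas, maxUnavailable, maxSurge) {
  return {
    minAvailable: replicas - maxUnavailable, // never fewer ready pods than this
    maxTotal: replicas + maxSurge,           // never more pods than this
  };
}

// replicas: 3, maxUnavailable: 1, maxSurge: 1 (the manifest above)
console.log(rolloutBounds(3, 1, 1)); // { minAvailable: 2, maxTotal: 4 }

// Small deployment that must never lose capacity during a deploy
console.log(rolloutBounds(2, 0, 1)); // { minAvailable: 2, maxTotal: 3 }
```

The second call shows why `maxUnavailable: 0` matters for small replica counts: available capacity never dips below the desired count.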
The resources block is where most people get burned. I will cover it in detail in its own section below.
The three probe types (liveness, readiness, startup) are critical. They each get their own section too.
A lot of the Kubernetes anxiety comes from the YAML. There is a lot of it, it is deeply nested, and a single indentation error breaks everything silently. Let me take the mystery out of it.
Every Kubernetes YAML file follows the same structure:
```yaml
apiVersion: <group/version>   # Which API this resource belongs to
kind: <ResourceType>          # What you're creating
metadata:                     # Name, namespace, labels, annotations
  name: my-thing
  namespace: default
  labels:
    app: my-thing
spec:                         # The actual configuration (varies by kind)
  ...
```

That is it. Four top-level fields. `apiVersion` and `kind` identify the resource type. `metadata` names it and attaches labels. `spec` is the actual configuration, and this is the only part that varies between resource types.
The apiVersion is confusing at first. Here is a cheat sheet for the resources you will actually use:
| Resource | apiVersion |
|---|---|
| Pod | v1 |
| Service | v1 |
| ConfigMap | v1 |
| Secret | v1 |
| Deployment | apps/v1 |
| Ingress | networking.k8s.io/v1 |
| HPA | autoscaling/v2 |
| CronJob | batch/v1 |
You do not need to memorize these. Every kubectl explain output shows the apiVersion, and your IDE with the Kubernetes extension will autocomplete them.
One tip that saved me a lot of pain: use --- to separate multiple resources in a single file. You can define a Deployment, Service, and ConfigMap in one file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  # ...
---
apiVersion: v1
kind: Service
metadata:
  name: my-api
spec:
  # ...
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-api-config
data:
  # ...
```

Apply it all with one command: `kubectl apply -f my-api.yaml`. This keeps related resources together and makes it easy to review in a PR.
Pods get IP addresses, but those addresses change every time a pod restarts. If your frontend needs to talk to your API, it cannot hardcode the pod IP. Services solve this.
A Service gives a stable DNS name and IP address to a set of pods. When you create a Service named my-api in the production namespace, any pod in the cluster can reach it at my-api.production.svc.cluster.local. Or just my-api if you are in the same namespace.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: production
spec:
  selector:
    app: my-api
  ports:
    - port: 80
      targetPort: 3000
      protocol: TCP
  type: ClusterIP
```

The selector matches pods with the label `app: my-api`. The Service load-balances traffic across all matching pods. `port: 80` is the port the Service listens on. `targetPort: 3000` is the port your container is listening on.
There are three Service types that matter:
ClusterIP (default): Only accessible from inside the cluster. This is what you use for internal service-to-service communication. Your API talks to your database through a ClusterIP Service.
NodePort: Exposes the Service on a static port on every node. Traffic to <node-ip>:<node-port> gets forwarded to the Service. Useful for development and debugging, rarely used in production.
LoadBalancer: Creates an external load balancer (on cloud providers). This is the simplest way to expose a service to the internet, but it creates a separate load balancer for each Service, which gets expensive. For HTTP traffic, use an Ingress instead.
The fourth type, ExternalName, is a DNS alias for an external service. I have used it exactly twice in four years.
For HTTP/HTTPS traffic from the internet, you want an Ingress. It is like a reverse proxy configuration (think nginx) that routes based on hostname and path:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api
                port:
                  number: 80
```

This routes all traffic for api.example.com to the my-api Service. The TLS section handles HTTPS certificates automatically via cert-manager (which you or your platform team installs separately). One Ingress resource can route multiple hosts and paths to different services, which is much cheaper than one LoadBalancer per service.
The ingressClassName specifies which Ingress Controller handles this resource. NGINX is the most common, but Traefik, HAProxy, and cloud-specific controllers (ALB on AWS, Cloud Load Balancer on GCP) are also popular. Your cluster needs an Ingress Controller installed for Ingress resources to do anything — this is a common gotcha for people setting up their first cluster.
Hardcoding configuration into your container image is a common mistake. It means you need a different image for each environment, or worse, you rebuild and redeploy to change a configuration value.
ConfigMaps store non-sensitive configuration as key-value pairs:
apiVersion: v1
kind: ConfigMap
metadata:
name: my-api-config
namespace: production
data:
LOG_LEVEL: "info"
CACHE_TTL: "3600"
FEATURE_NEW_DASHBOARD: "true"
MAX_UPLOAD_SIZE: "10485760"Secrets store sensitive data (passwords, API keys, certificates). They are base64-encoded, not encrypted — this is an important distinction:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-api-secrets
  namespace: production
type: Opaque
data:
  DATABASE_URL: cG9zdGdyZXM6Ly91c2VyOnBhc3NAZGIuZXhhbXBsZS5jb206NTQzMi9teWRi
  REDIS_PASSWORD: c3VwZXJzZWNyZXRwYXNzd29yZA==
  API_KEY: YWJjZGVmMTIzNDU2Nzg5MA==
```

The base64 thing is not encryption. Anyone with access to the cluster can decode those values. Base64 exists so you can store binary data (like TLS certificates), not for security. For actual secret management, look into solutions like HashiCorp Vault, AWS Secrets Manager with the External Secrets Operator, or Sealed Secrets.
I learned this the hard way when a junior developer committed a Secret manifest to our Git repo thinking the base64 encoding meant it was secure. It was not. Anyone who cloned the repo could run echo "cG9zdGdyZXM6..." | base64 -d and read the database password.
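If you want to convince yourself that base64 is trivially reversible, it is one line in any language. This reproduces the `REDIS_PASSWORD` value from the example Secret above:

```javascript
// base64 is a reversible encoding, not encryption — anyone can undo it.
const secret = 'supersecretpassword';

// This produces exactly the REDIS_PASSWORD value in the Secret manifest above.
const encoded = Buffer.from(secret, 'utf8').toString('base64');
console.log(encoded); // c3VwZXJzZWNyZXRwYXNzd29yZA==

// And anyone can reverse it just as easily.
const decoded = Buffer.from(encoded, 'base64').toString('utf8');
console.log(decoded); // supersecretpassword
```

No key, no passphrase, nothing. That is the entire "protection" a Secret manifest gives you at rest in Git.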
You can inject ConfigMaps and Secrets as environment variables or mount them as files:
```yaml
# As environment variables (all keys at once)
envFrom:
  - configMapRef:
      name: my-api-config
  - secretRef:
      name: my-api-secrets

# As individual environment variables (pick specific keys)
env:
  - name: DB_HOST
    valueFrom:
      configMapKeyRef:
        name: my-api-config
        key: DB_HOST
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: my-api-secrets
        key: DB_PASSWORD

# As mounted files
volumeMounts:
  - name: config-volume
    mountPath: /etc/config
    readOnly: true
volumes:
  - name: config-volume
    configMap:
      name: my-api-config
```

The file mount approach is useful when your application reads configuration from files (like nginx.conf or application.properties). The environment variable approach is more common for twelve-factor apps.
One thing that trips people up: updating a ConfigMap or Secret does not automatically restart the pods using it. If you change a ConfigMap, existing pods still see the old values. You need to trigger a rollout. The cleanest way is to include a hash of the config in the pod template annotation:
```yaml
spec:
  template:
    metadata:
      annotations:
        checksum/config: "sha256-of-configmap-contents"
```

When the config changes, the annotation changes, which triggers a rolling update. The standard Helm pattern is to compute this hash right in the chart template (piping the rendered ConfigMap through `sha256sum`). If you are not using Helm, tools like Reloader can watch ConfigMaps and restart pods automatically.
Health checks are not optional. Without them, Kubernetes has no way to know if your application is actually working. It only knows if the process is running. A deadlocked application that consumes 100% CPU but never responds to requests? Kubernetes thinks it is perfectly healthy.
There are three types of probes, and they serve different purposes.
The liveness probe answers one question: should Kubernetes restart this container? If the liveness probe fails, Kubernetes kills the container and starts a new one.
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 15
  timeoutSeconds: 3
  failureThreshold: 3
```

This checks GET /health every 15 seconds. If three consecutive checks fail (45 seconds total), the container gets restarted.
Critical mistake I see constantly: making the liveness probe check downstream dependencies. Your /health endpoint should NOT check the database, Redis, or any external service. If your database goes down, the liveness probe fails, Kubernetes restarts your pod, the new pod also cannot reach the database, the probe fails again, and you are in a restart loop. Meanwhile your application could have served cached responses or returned a meaningful error message, but instead it is stuck in CrashLoopBackOff.
A good liveness endpoint:
```javascript
app.get('/health', (req, res) => {
  // Can this process respond to HTTP? That's all we need to know.
  res.status(200).json({ status: 'ok' });
});
```

That is it. If the event loop is not blocked and the HTTP server can respond, the process is alive.
The readiness probe answers a different question: should Kubernetes send traffic to this pod? If the readiness probe fails, the pod is removed from the Service's endpoint list. It does not get restarted — it just stops receiving traffic.
```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
```

This is where you check dependencies:
```javascript
app.get('/ready', async (req, res) => {
  try {
    // Can we reach the database?
    await db.query('SELECT 1');
    // Can we reach the cache?
    await redis.ping();
    res.status(200).json({ status: 'ready' });
  } catch (err) {
    res.status(503).json({ status: 'not ready', error: err.message });
  }
});
```
The startup probe is for applications that take a long time to start. Java applications with large classpath scanning, Node.js apps that run database migrations on startup, Python apps that load ML models into memory. Without a startup probe, the liveness probe might kill the container before it finishes starting up.
```yaml
startupProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 0
  periodSeconds: 5
  failureThreshold: 30
```

This gives the application up to 150 seconds (30 failures * 5 seconds) to start. Once the startup probe succeeds, it never runs again, and the liveness and readiness probes take over.
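All the probe timing in this section reduces to `periodSeconds * failureThreshold`. A throwaway helper (the function name is mine, not a Kubernetes API) makes the budgets explicit:

```javascript
// Sketch: how long a probe configuration tolerates failure before Kubernetes
// acts. Mirrors the arithmetic in the text, not a real Kubernetes API.
function failureBudgetSeconds({ periodSeconds, failureThreshold }) {
  return periodSeconds * failureThreshold;
}

// Liveness probe above: restart after 3 failed checks, 15s apart
console.log(failureBudgetSeconds({ periodSeconds: 15, failureThreshold: 3 })); // 45

// Startup probe above: the app gets up to 150s to come up
console.log(failureBudgetSeconds({ periodSeconds: 5, failureThreshold: 30 })); // 150
```

When tuning probes, work backwards from the budget you want ("restart within a minute", "allow three minutes to boot") rather than fiddling with the raw fields.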
Before the startup probe existed (it was added in Kubernetes 1.18), people used huge initialDelaySeconds on their liveness probes. That works but is fragile — if your app occasionally takes longer to start, the liveness probe kills it.
Probes handle the startup side. For the shutdown side, you need to handle SIGTERM properly.
When Kubernetes wants to stop a pod (during a rolling update, scale-down, or node drain), it sends SIGTERM to the container, then waits terminationGracePeriodSeconds (default 30 seconds). If the process is still running after that, it sends SIGKILL.
Your application needs to handle SIGTERM:
```javascript
process.on('SIGTERM', async () => {
  console.log('SIGTERM received. Starting graceful shutdown...');

  // Stop accepting new connections
  server.close(async () => {
    // Finish processing in-flight requests
    // Close database connections
    await db.end();
    await redis.quit();
    console.log('Graceful shutdown complete.');
    process.exit(0);
  });

  // Force shutdown after 25 seconds (leave 5s buffer before SIGKILL)
  setTimeout(() => {
    console.error('Forced shutdown after timeout.');
    process.exit(1);
  }, 25000);
});
```

There is a subtle race condition here that bit me hard. When Kubernetes decides to stop a pod, two things happen simultaneously: it sends SIGTERM to the container AND it removes the pod from the Service endpoints. But the endpoint removal propagates asynchronously through the cluster. For a brief window, traffic can still arrive at a pod that is shutting down.
The solution is a preStop hook that adds a small delay:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["sleep", "5"]
```

This gives the endpoint update time to propagate before your application starts shutting down. Without this, you will see occasional 502 errors during deployments that are maddeningly difficult to reproduce.
This is the section I wish I had read before my first production incident. Resource management in Kubernetes is simultaneously simple to understand and incredibly easy to get wrong.
Every container should specify resource requests and limits for CPU and memory.
```yaml
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```

Requests are what the scheduler uses to decide where to place the pod. "This container needs at least 100 millicores of CPU and 128 MiB of memory." The scheduler only places the pod on a node that has enough unrequested resources available.
Limits are the maximum the container can use. If it exceeds the memory limit, it gets OOM-killed. If it exceeds the CPU limit, it gets throttled (not killed — it just runs slower).
The units are important:

- `1000m` = 1 full CPU core. `100m` = 10% of a core.
- `Mi` = mebibytes (1024-based), `M` = megabytes (1000-based). Use `Mi` to match what tools like `top` and `free` report.

If you deploy without resource limits, a single misbehaving pod can consume all resources on a node, starving every other pod on that node. I have seen a memory leak in one service take down an entire node's worth of workloads because nothing was constraining it.
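If the suffixes trip you up, it can help to see the conversion written out. A sketch that handles only the common suffixes used in this post, not the full Kubernetes quantity grammar; the function names are mine:

```javascript
// Sketch: convert the CPU and memory quantity suffixes above into base units.
// Covers only the common cases, not the full Kubernetes quantity grammar.
function cpuToCores(q) {
  // "100m" → 0.1 cores; "1" → 1 core
  return q.endsWith('m') ? parseInt(q, 10) / 1000 : parseFloat(q);
}

function memoryToBytes(q) {
  if (q.endsWith('Mi')) return parseInt(q, 10) * 1024 * 1024;        // mebibytes
  if (q.endsWith('Gi')) return parseInt(q, 10) * 1024 * 1024 * 1024; // gibibytes
  if (q.endsWith('M')) return parseInt(q, 10) * 1000 * 1000;         // megabytes
  return parseInt(q, 10);                                            // plain bytes
}

console.log(cpuToCores('100m'));     // 0.1 — 10% of a core
console.log(cpuToCores('1'));        // 1 full core
console.log(memoryToBytes('128Mi')); // 134217728
console.log(memoryToBytes('128M'));  // 128000000 — note the difference
```

That last pair is the `Mi` vs `M` gotcha in numbers: 128Mi is about 6 MB more than 128M.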
The second most common mistake is setting limits too low. Your Node.js app normally uses 200MB of memory, so you set the limit to 256MB. Then a spike in traffic causes it to process more concurrent requests, memory hits 260MB, and Kubernetes OOM-kills the pod. Under load. When you need it most.
My rule of thumb: set the memory limit to 2-3x what your application normally uses. Set the CPU limit to 3-5x the request. Then monitor actual usage with kubectl top pods or Prometheus metrics and adjust.
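That rule of thumb can be written down as code. A sketch; the multipliers are my suggestion from above, and `suggestResources` is an illustrative helper, not a real tool:

```javascript
// Sketch of the rule of thumb above: derive limits from observed usage.
// The multipliers are this post's suggestion, not a Kubernetes default.
function suggestResources({ observedCpuMillicores, observedMemoryMi }) {
  return {
    requests: {
      cpu: `${observedCpuMillicores}m`,
      memory: `${observedMemoryMi}Mi`,
    },
    limits: {
      cpu: `${observedCpuMillicores * 4}m`,               // 3-5x the request
      memory: `${Math.round(observedMemoryMi * 2.5)}Mi`,  // 2-3x normal usage
    },
  };
}

// Observed: ~100m CPU and ~256Mi memory under normal load
console.log(suggestResources({ observedCpuMillicores: 100, observedMemoryMi: 256 }));
// { requests: { cpu: '100m', memory: '256Mi' },
//   limits: { cpu: '400m', memory: '640Mi' } }
```

Feed it numbers from `kubectl top pods`, then revisit after a few weeks of real traffic.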
```yaml
# A reasonable starting point for a Node.js API
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "768Mi"
```

CPU limits cause throttling, and throttling causes latency spikes. This is well-documented but still catches people. Your application does not crash — it just gets slower. P99 latencies go up, timeouts increase, users complain about the app being "sluggish."
Some teams have started setting CPU requests but no CPU limits, letting pods burst as needed. This works well on clusters that are not packed, but can cause noisy-neighbor problems on busy clusters. There is no universally right answer here — it depends on your workload profile and cluster utilization.
Kubernetes assigns each pod a QoS class based on its resource configuration:

- Guaranteed: every container sets requests equal to limits for both CPU and memory.
- Burstable: at least one container sets a request, but the pod does not qualify as Guaranteed.
- BestEffort: no requests or limits at all. These pods are the first to be evicted under resource pressure.

For production workloads, you want Guaranteed or Burstable. BestEffort is only acceptable for batch jobs where you genuinely do not care if they get killed.
The Horizontal Pod Autoscaler (HPA) automatically adjusts the number of replicas based on observed metrics. The most common metric is CPU utilization:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 4
          periodSeconds: 60
      selectPolicy: Max
```

This HPA targets 70% average CPU utilization across all pods. When utilization exceeds 70%, it adds pods. When it drops below 70%, it removes pods. The behavior section controls how aggressively it scales — fast scale-up (immediately, doubling if needed), slow scale-down (wait 5 minutes, then remove at most 10% per minute).
The behavior section matters more than people think. Without it, the HPA uses defaults that can cause flapping — rapidly scaling up and down as the metric oscillates around the target. The stabilization window prevents premature scale-down, and the percentage-based policies prevent removing too many pods at once.
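Under the hood, the scaling decision itself is one formula from the Kubernetes HPA documentation: desired = ceil(currentReplicas * currentMetric / targetMetric), clamped to the min/max bounds. A sketch that ignores the stabilization-window behavior described above:

```javascript
// The core HPA scaling rule from the Kubernetes docs:
//   desired = ceil(current * (currentMetric / targetMetric))
// then clamped to [minReplicas, maxReplicas]. A sketch that ignores the
// behavior/stabilization settings shown in the manifest above.
function desiredReplicas(current, currentUtilization, targetUtilization, min, max) {
  const desired = Math.ceil(current * (currentUtilization / targetUtilization));
  return Math.min(max, Math.max(min, desired));
}

// 3 pods averaging 90% CPU against a 70% target → scale to 4
console.log(desiredReplicas(3, 90, 70, 3, 20)); // 4

// 10 pods averaging 20% against 70% → wants 3, already at minReplicas
console.log(desiredReplicas(10, 20, 70, 3, 20)); // 3
```

Seeing the formula explains the flapping problem: a metric oscillating around the target flips the desired count back and forth, which is exactly what the stabilization window damps.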
Important: HPA requires the Metrics Server to be installed in your cluster. Most managed Kubernetes services (EKS, GKE, AKS) install it by default. On bare-metal or self-managed clusters, you need to install it yourself.
For custom metrics (requests per second, queue depth, etc.), you need a metrics adapter like Prometheus Adapter. This is more complex to set up but lets you scale on the metrics that actually matter for your workload.
One gotcha: HPA and manual replica counts do not mix. If you have an HPA configured, do not set replicas in your Deployment manifest, or every kubectl apply will fight with the HPA. The Deployment tries to set 3 replicas, the HPA tries to set 7, the Deployment applies and resets to 3, the HPA scales back to 7. I spent an afternoon wondering why my pods kept bouncing between 3 and 7 before I figured this out.
When something goes wrong (and it will), you need a systematic debugging approach. Here is the workflow I follow every time.
```shell
# See all pods in a namespace
kubectl get pods -n production

# See more details (node, IP, restart count)
kubectl get pods -n production -o wide

# See all resources related to your app
kubectl get all -n production -l app=my-api
```

The output tells you the current state:
```
NAME                     READY   STATUS    RESTARTS   AGE
my-api-7d4b8c6f5-abc12   1/1     Running   0          2d
my-api-7d4b8c6f5-def34   0/1     Running   3          15m
my-api-7d4b8c6f5-ghi56   1/1     Running   0          2d
```
That middle pod is interesting. READY 0/1 means the readiness probe is failing. RESTARTS 3 means it has been restarted three times (probably liveness probe failures before that). AGE 15m means it was recently created — maybe a rolling update is in progress.
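Those triage heuristics can be captured in a few lines. A sketch; the input shape mirrors the columns of `kubectl get pods`, but the field names and thresholds here are made up for the example:

```javascript
// Sketch: the pod triage heuristics above, as code. The input mirrors the
// columns of `kubectl get pods`; field names are invented for this example.
function triage({ ready, desired, restarts, ageMinutes }) {
  const notes = [];
  if (ready < desired) notes.push('readiness probe failing — not receiving traffic');
  if (restarts > 0) notes.push('has restarted — check previous logs and liveness probe');
  if (ageMinutes < 30) notes.push('recently created — a rollout may be in progress');
  return notes.length ? notes : ['looks healthy'];
}

// The interesting middle pod from the output above: 0/1, 3 restarts, 15m old
console.log(triage({ ready: 0, desired: 1, restarts: 3, ageMinutes: 15 }));
// → all three warnings

// A boring healthy pod: 1/1, 0 restarts, 2 days old
console.log(triage({ ready: 1, desired: 1, restarts: 0, ageMinutes: 2880 }));
// → [ 'looks healthy' ]
```

The point is not to automate this (monitoring tools already do), but that the mental checklist is small enough to internalize.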
```shell
kubectl describe pod my-api-7d4b8c6f5-def34 -n production
```

`describe` gives you everything: events, conditions, container state, resource usage, mounted volumes, environment variables (names only, not values). The Events section at the bottom is where the gold is:
```
Events:
  Type     Reason     Age               From               Message
  ----     ------     ---               ----               -------
  Normal   Scheduled  15m               default-scheduler  Assigned to node-3
  Normal   Pulling    15m               kubelet            Pulling image "registry.example.com/my-api:1.2.3"
  Normal   Pulled     14m               kubelet            Container image pulled
  Normal   Created    14m               kubelet            Created container my-api
  Normal   Started    14m               kubelet            Started container my-api
  Warning  Unhealthy  3m (x9 over 14m)  kubelet            Readiness probe failed: Get "http://10.244.2.15:3000/ready": dial tcp 10.244.2.15:3000: connect: connection refused
  Warning  BackOff    2m (x3 over 5m)   kubelet            Back-off restarting failed container
```
Now you know: the container starts but the readiness probe cannot connect on port 3000. The application is not listening on the expected port.
```shell
# Current container logs
kubectl logs my-api-7d4b8c6f5-def34 -n production

# Previous container's logs (before the restart)
kubectl logs my-api-7d4b8c6f5-def34 -n production --previous

# Follow logs in real-time
kubectl logs my-api-7d4b8c6f5-def34 -n production -f

# Last 100 lines
kubectl logs my-api-7d4b8c6f5-def34 -n production --tail=100

# Logs from all pods matching a label
kubectl logs -l app=my-api -n production --all-containers
```

The `--previous` flag is essential. When a container crashes and restarts, the current logs are from the new container. The crash output is in the previous container's logs.
Sometimes logs are not enough and you need to poke around inside the container:
```shell
# Open a shell in the running container
kubectl exec -it my-api-7d4b8c6f5-abc12 -n production -- /bin/sh

# Run a single command
kubectl exec my-api-7d4b8c6f5-abc12 -n production -- env

# Check if the app is listening
kubectl exec my-api-7d4b8c6f5-abc12 -n production -- netstat -tlnp

# Test connectivity to another service
kubectl exec my-api-7d4b8c6f5-abc12 -n production -- wget -qO- http://my-database:5432
```

Use `/bin/sh` instead of `/bin/bash` — many minimal container images do not include bash. If even `sh` is not available (distroless images), you can use ephemeral debug containers:
```shell
kubectl debug -it my-api-7d4b8c6f5-abc12 -n production --image=busybox --target=my-api
```

Need to hit a Service or pod directly from your machine?
```shell
# Forward local port 8080 to the Service's port 80
kubectl port-forward svc/my-api 8080:80 -n production

# Forward to a specific pod
kubectl port-forward my-api-7d4b8c6f5-abc12 8080:3000 -n production
```

Now you can `curl http://localhost:8080` from your terminal and it hits the service inside the cluster. This is invaluable for debugging — you can test the service with your local tools without exposing it externally.
```shell
# CPU and memory usage per pod
kubectl top pods -n production

# CPU and memory usage per node
kubectl top nodes

# Detailed resource view for a specific pod
kubectl describe pod my-api-7d4b8c6f5-abc12 -n production | grep -A5 "Requests\|Limits"
```

Compare the actual usage with the configured limits. If actual memory usage is close to the limit, you might be heading for an OOM kill.
These are for when things are on fire:
```shell
# Restart all pods in a deployment (rolling restart)
kubectl rollout restart deployment/my-api -n production

# Roll back to the previous version
kubectl rollout undo deployment/my-api -n production

# Roll back to a specific revision
kubectl rollout history deployment/my-api -n production
kubectl rollout undo deployment/my-api -n production --to-revision=5

# Scale to zero (emergency stop)
kubectl scale deployment/my-api -n production --replicas=0

# Scale back up
kubectl scale deployment/my-api -n production --replicas=3
```

`rollout undo` has saved me multiple times. It is instant and does not require rebuilding or redeploying anything.
Helm is to Kubernetes what npm is to Node.js — a package manager and templating system. You will probably resist it at first because it adds another layer of abstraction. Then you will have the same Deployment YAML copied across 12 microservices with tiny variations, and you will understand why it exists.
A Helm chart is a directory with templates and a values file:
```
my-api-chart/
  Chart.yaml            # Chart metadata
  values.yaml           # Default configuration values
  templates/
    deployment.yaml     # Kubernetes manifest templates
    service.yaml
    ingress.yaml
    configmap.yaml
    hpa.yaml
    _helpers.tpl        # Template helper functions
```
The templates use Go templating to inject values:
```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-api.fullname" . }}
  labels:
    {{- include "my-api.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "my-api.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "my-api.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

And the values file provides the defaults:
```yaml
# values.yaml
replicaCount: 3
image:
  repository: registry.example.com/my-api
  tag: "1.2.3"
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```

You deploy with:
```shell
# Install a chart
helm install my-api ./my-api-chart -n production

# Upgrade with new values
helm upgrade my-api ./my-api-chart -n production --set image.tag=1.2.4

# Upgrade with a values file for a specific environment
helm upgrade my-api ./my-api-chart -n production -f values-production.yaml

# Roll back
helm rollback my-api 1 -n production

# See what would change without applying (requires the helm-diff plugin)
helm diff upgrade my-api ./my-api-chart -n production --set image.tag=1.2.4
```

The real power of Helm is environment-specific values files. You have one chart and multiple values files:
```
values-development.yaml   # 1 replica, low resources, debug logging
values-staging.yaml       # 2 replicas, medium resources
values-production.yaml    # 3+ replicas, high resources, HPA enabled
```
Same templates, different configuration per environment. The alternative is maintaining separate YAML files for each environment, which quickly becomes a maintenance burden with drift between environments.
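The environment values files layer over the chart defaults key by key. A sketch of that deep-merge behavior (a simplification, not Helm's actual implementation, which also handles null-deletions and multiple `-f` files):

```javascript
// Sketch of how an environment values file overrides chart defaults, key by
// key (deep merge). A simplification, not Helm's actual implementation.
function mergeValues(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] =
      value && typeof value === 'object' && !Array.isArray(value)
        ? mergeValues(base[key] ?? {}, value) // recurse into nested objects
        : value;                              // scalars and arrays replace outright
  }
  return out;
}

// Chart defaults (from values.yaml above) and a production override file
const defaults = {
  replicaCount: 3,
  image: { repository: 'registry.example.com/my-api', tag: '1.2.3' },
};
const production = { replicaCount: 5, image: { tag: '1.2.4' } };

console.log(mergeValues(defaults, production));
// { replicaCount: 5,
//   image: { repository: 'registry.example.com/my-api', tag: '1.2.4' } }
```

Note that `image.repository` survives the merge: the override file only has to mention the keys that differ, which is what keeps environment files small.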
I was slow to adopt Helm because the Go templating syntax is ugly. It is. But the alternative — manually maintained YAML across environments — is worse. If you do not like Helm's templating, look at Kustomize, which takes a different approach using patches and overlays instead of templates. It is built into kubectl (kubectl apply -k).
Namespaces are logical partitions within a cluster. They do not provide security isolation (that requires network policies and RBAC), but they prevent name collisions and make it easy to manage resources per-team or per-environment.
# Create a namespace
kubectl create namespace staging
# List all namespaces
kubectl get namespaces
# Set your default namespace (so you don't need -n every time)
kubectl config set-context --current --namespace=production

A typical namespace strategy:
production # Live traffic
staging # Pre-production testing
development # Developer environments
monitoring # Prometheus, Grafana, etc.
ingress-nginx # Ingress controller
cert-manager # Certificate management
Resource quotas prevent one namespace from hogging all cluster resources:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"

This caps the production namespace at 50 pods, 10 cores of CPU requests, and 20 GiB of memory requests. If someone tries to deploy a pod that would exceed these limits, Kubernetes rejects it. This is your safety net against resource exhaustion.
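A ResourceQuota pairs naturally with a LimitRange, which supplies per-container defaults so pods that forget to declare requests and limits get sensible values instead of being rejected by the quota. The numbers here are examples, not recommendations:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: production-defaults
  namespace: production
spec:
  limits:
    - type: Container
      default:            # applied as the limit when a container sets none
        cpu: 500m
        memory: 512Mi
      defaultRequest:     # applied as the request when a container sets none
        cpu: 100m
        memory: 128Mi
```

This does not replace setting explicit limits on your own workloads; it just stops a forgotten limit from becoming an unschedulable pod or an unbounded one.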
Avoid the latest Tag

# DO NOT DO THIS
image: registry.example.com/my-api:latest

The latest tag is mutable: it points to whatever was most recently pushed, so you cannot tell from the manifest which code a pod is actually running, and "rolling back to the previous latest" is meaningless. Worse, with the latest tag Kubernetes defaults to the Always image pull policy, which can cause failed deployments if the registry is down. Always use immutable tags: semantic versions (1.2.3), git SHAs (abc1234), or build numbers (build-567).
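In practice that means pinning the image and making the pull policy explicit rather than letting it be inferred from the tag (the version shown is just an example):

```yaml
containers:
  - name: my-api
    image: registry.example.com/my-api:1.2.4   # immutable tag: you know exactly what runs
    imagePullPolicy: IfNotPresent              # stated explicitly, not derived from the tag
```

Now every pod in the deployment runs identical code, and rolling back is as simple as redeploying the previous tag.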
Set resource requests and limits. Already covered above. Just do it. Every container. No exceptions. A five-minute task that prevents hours of debugging resource contention issues.
Add health probes. Also covered above. At minimum, add a liveness probe. Ideally, add all three probes. Your future self at 2 AM will thank you.
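As a quick refresher, a typical probe trio looks something like this. The /healthz and /ready paths and port 3000 are assumptions about your application, not Kubernetes conventions:

```yaml
livenessProbe:
  httpGet:
    path: /healthz       # restart the container if this starts failing
    port: 3000
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready         # remove the pod from Service endpoints until this passes
    port: 3000
  periodSeconds: 5
startupProbe:
  httpGet:
    path: /healthz
    port: 3000
  failureThreshold: 30   # up to 30 * 10s = 5 minutes for slow startups
  periodSeconds: 10
```

The startup probe suppresses the liveness probe until it succeeds, which stops Kubernetes from killing slow-starting applications before they ever come up.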
Set terminationGracePeriodSeconds to fit your workload. The default is 30 seconds, which is fine for most applications. But if your application processes long-running requests (file uploads, report generation, WebSocket connections), 30 seconds might not be enough. Set it to match your longest expected request:

spec:
  terminationGracePeriodSeconds: 120

A Pod Disruption Budget (PDB) tells Kubernetes how many pods can be down simultaneously during voluntary disruptions (node drains, cluster upgrades, etc.). Without a PDB, a node drain can kill all your pods at once.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-api-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-api

This keeps at least 2 pods running through voluntary disruptions, even during maintenance. If you have 3 replicas, Kubernetes can only drain one pod at a time.
Putting everything in the default namespace is fine for learning. For anything else, use namespaces. They cost nothing, make kubectl get pods readable, and enable per-team resource quotas.
Before applying changes, run kubectl diff to check what will change:

kubectl diff -f my-deployment.yaml

This shows you exactly what Kubernetes will modify, like a dry run. I have caught several near-misses with this — wrong namespace, wrong image tag, accidentally removed resource limits. It takes two seconds and can save you an outage.
By default, every pod can talk to every other pod in the cluster. If an attacker compromises your frontend pod, they can directly access your database pod. Network Policies restrict which pods can communicate:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-api-network-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: my-api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: my-frontend
      ports:
        - port: 3000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: my-database
      ports:
        - port: 5432

This says: my-api can only receive traffic from my-frontend on port 3000, and can only send traffic to my-database on port 5432. Everything else is blocked, including DNS lookups, so in practice you also need an egress rule allowing UDP port 53 to the cluster DNS service.
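A common companion is a namespace-wide default-deny policy, so that anything not explicitly allowed by a per-app policy is blocked:

```yaml
# Deny all ingress and egress for every pod in the namespace;
# per-app policies like the one above then punch specific holes.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}        # the empty selector matches all pods in the namespace
  policyTypes:
    - Ingress
    - Egress
```

Because network policies are additive, the deny-all baseline and the allow rules compose cleanly: a pod can communicate only if some policy explicitly permits it.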
This is the section that Kubernetes evangelists skip and that I think is the most important.
Kubernetes has significant operational overhead. Even with managed services like EKS, GKE, or AKS, you are still dealing with cluster upgrades, a sprawl of YAML to maintain, and a steep learning curve for every engineer who touches it.
You probably do not need Kubernetes if you are running fewer than ten services, your team is small, you are not doing multi-region or multi-cloud, and your scaling needs are modest and predictable.
For a single application or a small set of services, alternatives like a VPS with Docker Compose, a managed container platform, or a PaaS give you 80% of the benefit at 20% of the complexity.
I run the services behind this site on a single VPS with PM2. No containers, no orchestration, no YAML files. It handles the traffic fine, deploys in seconds, and I can debug it by SSH-ing in. For my use case, Kubernetes would be absurd over-engineering.
Kubernetes is brilliant infrastructure software that solves real problems for organizations running many services at scale. It provides self-healing, automated scaling, zero-downtime deployments, and a consistent API across every cloud provider. For teams that need those capabilities, there is nothing else that compares.
But the industry has a Kubernetes problem. It has become the default answer to every deployment question, regardless of whether the complexity is justified. I have watched startups with two developers and one API spend months setting up Kubernetes when a $20 VPS with a deploy script would have served them for years.
The decision framework I use: if you are running fewer than ten services, if your team is small, if you are not doing multi-region or multi-cloud, if your scaling needs are modest and predictable — start simple. A VPS with Docker Compose. A managed container platform. A PaaS. You can always migrate to Kubernetes later when the scale justifies it. Migrating from Kubernetes back to something simpler is much harder because by then your entire deployment pipeline, monitoring stack, and team knowledge are built around it.
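For the Node.js API + Postgres + Redis stack from the intro, that simpler starting point is a single Compose file. Service names, images, and credentials here are illustrative:

```yaml
# docker-compose.yml — the entire "deployment pipeline" for a small stack
services:
  api:
    image: registry.example.com/my-api:1.2.4
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
      REDIS_URL: redis://cache:6379
    depends_on: [db, cache]
    restart: unless-stopped   # poor man's self-healing
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data
  cache:
    image: redis:7
volumes:
  pgdata:
```

One `docker compose up -d` and you are deployed, until you need more than one machine, which is exactly the point at which the Kubernetes trade-off starts to make sense.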
If you are joining a team that already uses Kubernetes, the concepts in this post will get you productive fast. Focus on Deployments, Services, ConfigMaps, probes, resource limits, and the kubectl debugging workflow. That covers 90% of what an application developer needs to know day-to-day. Leave the cluster administration — node management, CNI plugins, storage classes, custom controllers — to the platform team.
And set your resource limits. Seriously. Do it now.