Redis Caching Strategies That Actually Work in Production

Everyone tells you to "just add Redis" when your API is slow. Nobody tells you what happens six months later, when your cache is serving stale data, your invalidation logic is scattered across 40 files, and a deploy triggers a cache stampede that takes down your database harder than if you had never cached at all.

I have been running Redis in production for years. Not as a toy, not in a tutorial, but in systems that handle real traffic, where getting caching wrong means a pager alert at 3 a.m. What follows is everything I have learned about doing it right.

Why cache?

Let's start with the obvious: databases are slow compared to memory. A PostgreSQL query that takes 15ms is fast by database standards. But if that query runs on every single API request and you are handling 1,000 requests per second, that is 15,000ms of cumulative database time every second, or roughly 15 connections busy at any given moment for this one query alone. Your connection pool is exhausted. Your p99 latency goes through the roof. Users stare at spinners.

Redis serves most reads in under 1ms. The same data, cached, turns a 15ms operation into a 0.3ms one. That is not a micro-optimization. It is the difference between needing 4 database replicas and needing none.
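To make that arithmetic concrete, here is a back-of-the-envelope sketch. The function name busyConnections and the 95% hit rate are illustrative assumptions, not measurements; the function only multiplies arrival rate by query latency to estimate how many database connections are occupied on average.

```typescript
// Rough estimate of the average number of busy DB connections:
// requests per second × seconds each query holds a connection.
// `missRate` is the fraction of requests that still reach the database.
function busyConnections(
  requestsPerSecond: number,
  queryMs: number,
  missRate = 1
): number {
  return (requestsPerSecond * missRate * queryMs) / 1000;
}

// No cache: every request runs the 15ms query.
const uncached = busyConnections(1000, 15);

// Hypothetical 95% hit rate: only 5% of requests reach the database.
const cached = busyConnections(1000, 15, 0.05);

console.log({ uncached, cached });
```

Under these assumptions the same traffic keeps about 15 connections busy without a cache and less than one with it.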

But caching is not free. It adds complexity, introduces consistency problems, and creates a whole new class of failure modes. Before you cache anything, check which of these two lists describes your data.

When caching helps:

- Data is read far more often than it is written (a 10:1 ratio or higher)
- The underlying query is expensive (joins, aggregations, external API calls)
- Slight staleness is acceptable (product catalog, user profiles, configuration)
- Access patterns are predictable (the same keys are requested over and over)

When caching hurts:

- Data changes constantly and must be fresh (real-time stock prices, live sports scores)
- Every request is unique (search queries with many parameters)
- Your dataset is tiny (if it all fits in your app's memory, skip Redis)
- You lack the operational maturity to monitor and debug cache problems

Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things. He was right about both, but cache invalidation is the one that wakes you up at night.

Setting up ioredis

Before getting into patterns, let's set up the connection. I use ioredis everywhere. In my experience it is the most mature Redis client for Node.js, with solid TypeScript support, cluster mode, Sentinel support, and Lua scripting.

typescript

import Redis from "ioredis";
 
const redis = new Redis({
  host: process.env.REDIS_HOST || "127.0.0.1",
  port: Number(process.env.REDIS_PORT) || 6379,
  password: process.env.REDIS_PASSWORD || undefined,
  db: Number(process.env.REDIS_DB) || 0,
  maxRetriesPerRequest: 3,
  retryStrategy(times) {
    const delay = Math.min(times * 200, 5000);
    return delay;
  },
  lazyConnect: true,
  enableReadyCheck: true,
  connectTimeout: 10000,
});
 
redis.on("error", (err) => {
  console.error("[Redis] Connection error:", err.message);
});
 
redis.on("connect", () => {
  console.log("[Redis] Connected");
});
 
export default redis;

A few things to note. lazyConnect: true means the connection is not established until you call connect() or run the first command, which is handy during testing and initialization. retryStrategy implements a linear backoff (200ms more per attempt) capped at 5 seconds; a capped backoff like this keeps a Redis outage from turning into a storm of reconnection attempts from your app. And maxRetriesPerRequest: 3 makes individual commands fail fast instead of hanging while the client keeps retrying.
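The retryStrategy above grows linearly, by 200ms per attempt. If you want true exponential backoff instead, a sketch might look like the following. The function name, the base delay, and the jitter range are my assumptions; the only contract relied on is that ioredis calls retryStrategy with the attempt number and treats a returned number as a delay in milliseconds.

```typescript
// Exponential backoff with a cap and jitter.
// `rand` is injectable so the function is deterministic in tests.
function exponentialBackoff(
  times: number,
  baseMs = 100,
  capMs = 5000,
  rand: () => number = Math.random
): number {
  // 100, 200, 400, 800, ... capped at capMs
  const exp = Math.min(capMs, baseMs * Math.pow(2, times - 1));
  // Jitter: pick a delay between 50% and 100% of the exponential value
  return Math.round(exp / 2 + rand() * (exp / 2));
}

// With rand fixed at 1 the jitter is at its maximum, exposing the raw curve
const delays = [1, 2, 3, 4, 5, 6, 7].map((t) =>
  exponentialBackoff(t, 100, 5000, () => 1)
);
console.log(delays);
```

It could be plugged in as retryStrategy: (times) => exponentialBackoff(times). The jitter keeps many app instances from reconnecting in lockstep after an outage.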

The Cache-Aside pattern

This is the pattern you will use 80% of the time. It is also called "lazy loading" or "look-aside". The flow is simple:

1. The application receives a request
2. It checks Redis for the cached value
3. If found (cache hit), it returns the value
4. If not found (cache miss), it queries the database
5. It stores the result in Redis
6. It returns the result

Here is a typed implementation:

typescript

import redis from "./redis";
 
interface CacheOptions {
  ttl?: number;       // seconds
  prefix?: string;
}
 
async function cacheAside<T>(
  key: string,
  fetcher: () => Promise<T>,
  options: CacheOptions = {}
): Promise<T> {
  const { ttl = 3600, prefix = "cache" } = options;
  const cacheKey = `${prefix}:${key}`;
 
  // Step 1: Try to read from cache
  const cached = await redis.get(cacheKey);
 
  if (cached !== null) {
    try {
      return JSON.parse(cached) as T;
    } catch {
      // Corrupted cache entry, delete it and fall through
      await redis.del(cacheKey);
    }
  }
 
  // Step 2: Cache miss — fetch from source
  const result = await fetcher();
 
  // Step 3: Store in cache (don't await — fire and forget)
  redis
    .set(cacheKey, JSON.stringify(result), "EX", ttl)
    .catch((err) => {
      console.error(`[Cache] Failed to set ${cacheKey}:`, err.message);
    });
 
  return result;
}

Usage looks like this:

typescript

interface User {
  id: string;
  name: string;
  email: string;
  plan: "free" | "pro" | "enterprise";
}
 
async function getUser(userId: string): Promise<User | null> {
  return cacheAside<User | null>(
    `user:${userId}`,
    async () => {
      const row = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
      return row[0] ?? null;
    },
    { ttl: 1800 } // 30 minutes
  );
}

Note that the redis.set call is fire-and-forget. That is intentional. If Redis is down or slow, the write does not hold up the request. The cache is an optimization, not a requirement. If the cache write fails, the next request simply hits the database again. No harm done. (The redis.get at the top is still awaited, so a fully unavailable Redis will surface there unless you catch that error too.)

There is a subtle bug in many cache-aside implementations that people miss: not caching null results. If a user does not exist and you do not cache that fact, every request for that user hits the database. An attacker can exploit this by hammering IDs that do not exist, turning your cache into a no-op. Always cache the negative result too, just with a shorter TTL.

typescript

async function getUserSafe(userId: string): Promise<User | null> {
  return cacheAside<User | null>(
    `user:${userId}`,
    async () => {
      const row = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
      return row[0] ?? null;
    },
    {
      // Shorter TTL for null results to limit memory usage
      // but long enough to absorb repeated misses
      ttl: row ? 1800 : 300,
    }
  );
}

That first attempt does not actually compile: row only exists inside the fetcher, so it is out of scope where the TTL is computed. Let me restructure the code so the dynamic TTL works:

typescript

async function getUserWithDynamicTTL(userId: string): Promise<User | null> {
  const cacheKey = `cache:user:${userId}`;
 
  const cached = await redis.get(cacheKey);
  if (cached !== null) {
    return JSON.parse(cached) as User | null;
  }
 
  const row = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
  const user: User | null = row[0] ?? null;
 
  // Cache existing users for 30 min, null results for 5 min
  const ttl = user ? 1800 : 300;
  await redis.set(cacheKey, JSON.stringify(user), "EX", ttl);
 
  return user;
}
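Negative caching works here because JSON.stringify(null) produces the string "null", which is distinct from the JavaScript null that redis.get returns when the key is absent. The sketch below isolates that distinction without touching Redis; the helper names and the CacheRead type are hypothetical.

```typescript
// What a cache read can tell us: nothing stored, or a stored value
// (which may itself be null, meaning "we looked and it does not exist").
type CacheRead<T> = { hit: false } | { hit: true; value: T | null };

function encodeForCache<T>(value: T | null): string {
  // undefined is not valid JSON, so normalize it to null before storing
  return JSON.stringify(value === undefined ? null : value);
}

function decodeFromCache<T>(raw: string | null): CacheRead<T> {
  if (raw === null) return { hit: false }; // key absent: a real cache miss
  return { hit: true, value: JSON.parse(raw) as T | null };
}

const miss = decodeFromCache<{ id: string }>(null);
const negative = decodeFromCache<{ id: string }>(encodeForCache(null));
const positive = decodeFromCache<{ id: string }>(encodeForCache({ id: "42" }));

console.log(miss, negative, positive);
```

Only the first case should trigger a database query; the second is a cached "not found".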

Write-Through and Write-Behind

Cache-aside works great for read-heavy workloads, but it has a consistency problem: if another service or process updates the database directly, your cache stays stale until the TTL expires. This is where the write-through and write-behind patterns come in.

Write-Through

With write-through, every write goes through the caching layer, which updates the database and the cache in the same operation. In the implementation below the database is written first and the cache immediately after, so a failed database write never leaves a value in the cache that was never persisted. This keeps the cache in step with the database, as long as all writes go through your application.

typescript

async function updateUser(
  userId: string,
  updates: Partial<User>
): Promise<User> {
  // Step 1: Update the database
  const updated = await db.query(
    "UPDATE users SET name = COALESCE($2, name), email = COALESCE($3, email) WHERE id = $1 RETURNING *",
    [userId, updates.name, updates.email]
  );
  const user: User = updated[0];
 
  // Step 2: Update the cache immediately
  const cacheKey = `cache:user:${userId}`;
  await redis.set(cacheKey, JSON.stringify(user), "EX", 1800);
 
  return user;
}

The key difference from cache-aside: we write to the cache on every write, not just on reads. That means the cache is always warm for recently updated data.

The trade-off: write latency goes up, because every write now touches both the database and Redis. If Redis is slow, your writes are slow. In most applications reads far outnumber writes, so the trade-off is worth it.

Write-Behind (Write-Back)

Write-behind flips the order: writes go to Redis first, and the database is updated asynchronously. This gives you extremely fast writes at the cost of potential data loss if Redis goes down before the data is persisted.

typescript

async function updateUserWriteBehind(
  userId: string,
  updates: Partial<User>
): Promise<User> {
  const cacheKey = `cache:user:${userId}`;
 
  // Read current state
  const current = await redis.get(cacheKey);
  const user = current ? JSON.parse(current) as User : null;
  if (!user) throw new Error("User not in cache");
 
  // Update cache immediately
  const updated = { ...user, ...updates };
  await redis.set(cacheKey, JSON.stringify(updated), "EX", 1800);
 
  // Queue database write for async processing
  await redis.rpush(
    "write_behind:users",
    JSON.stringify({ userId, updates, timestamp: Date.now() })
  );
 
  return updated;
}

You would then have a separate worker draining that queue:

typescript

async function processWriteBehindQueue(): Promise<void> {
  while (true) {
    const item = await redis.blpop("write_behind:users", 5);
 
    if (item) {
      const { userId, updates } = JSON.parse(item[1]);
      try {
        await db.query(
          "UPDATE users SET name = COALESCE($2, name), email = COALESCE($3, email) WHERE id = $1",
          [userId, updates.name, updates.email]
        );
      } catch (err) {
        // On failure, move the item to a dead-letter queue for inspection
        console.error("[WriteBehind] Failed:", err);
        await redis.rpush("write_behind:users:dlq", item[1]);
      }
    }
  }
}

I rarely use write-behind in practice. The risk of data loss is real: if Redis crashes before the worker processes the queue, those writes are gone. Use it only for data where eventual consistency and the occasional lost write are genuinely acceptable, such as view counters, analytics events, or non-critical user preferences.

TTL strategy

Getting TTLs right is more nuanced than it looks. A fixed 1-hour TTL on everything is easy to implement and almost always wrong.

Data volatility tiers

I sort data into four tiers, plus a separate short TTL for null results, and assign TTLs accordingly:

typescript

const TTL = {
  // Tier 1: Rarely changes, expensive to compute
  // Examples: product catalog, site config, feature flags
  STATIC: 86400,       // 24 hours
 
  // Tier 2: Changes occasionally, moderate cost
  // Examples: user profiles, team settings, permissions
  MODERATE: 1800,      // 30 minutes
 
  // Tier 3: Changes frequently, cheap to compute but called often
  // Examples: feed data, notification counts, session info
  VOLATILE: 300,       // 5 minutes
 
  // Tier 4: Ephemeral, used for rate limiting and locks
  EPHEMERAL: 60,       // 1 minute
 
  // Null results: always short-lived
  NOT_FOUND: 120,      // 2 minutes
} as const;

TTL jitter: preventing the thundering herd

Here is a scenario that has bitten me: you deploy your app, the cache is empty, and 10,000 requests populate 10,000 keys, all with the same 1-hour TTL. An hour later, all 10,000 keys expire at the same moment. All 10,000 requests hit the database at once. The database chokes. I have seen this take down a production Postgres instance.

The fix is jitter, that is, adding randomness to the TTL values:

typescript

function ttlWithJitter(baseTtl: number, jitterPercent = 0.1): number {
  const jitter = baseTtl * jitterPercent;
  const offset = Math.random() * jitter * 2 - jitter;
  return Math.max(1, Math.round(baseTtl + offset));
}
 
// Instead of: redis.set(key, value, "EX", 3600)
// Use:        redis.set(key, value, "EX", ttlWithJitter(3600))
 
// 3600 ± 10% = random value between 3240 and 3960

This spreads expirations across a window: instead of 10,000 keys expiring in the same second, they expire over a 12-minute window (3,600s ± 360s). The database sees a gradual rise in traffic, not a cliff.
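The 12-minute figure follows directly from the jitter bounds. Here is a small helper that computes the spread for any base TTL; the name jitterWindow is mine, and it mirrors the ± jitterPercent range used by ttlWithJitter.

```typescript
// Computes the earliest and latest possible expiry for a jittered TTL
// and the width of the window between them, all in seconds.
function jitterWindow(baseTtl: number, jitterPercent = 0.1) {
  const jitter = baseTtl * jitterPercent;
  const min = Math.max(1, Math.round(baseTtl - jitter));
  const max = Math.round(baseTtl + jitter);
  return { min, max, spreadSeconds: max - min };
}

const tenPercent = jitterWindow(3600);         // min 3240, max 3960
const twentyPercent = jitterWindow(3600, 0.2); // min 2880, max 4320

// Spread in minutes for each setting
console.log(tenPercent.spreadSeconds / 60, twentyPercent.spreadSeconds / 60);
```

Doubling the jitter percentage doubles the window, which halves the peak rate of expirations for the same number of keys.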

For critical paths, I go further and use 20% jitter:

typescript

const ttl = ttlWithJitter(3600, 0.2); // 2880–4320 seconds

Sliding expiry

For session-like data where the TTL should reset on every access, use GETEX (Redis 6.2+):

typescript

async function getWithSlidingExpiry<T>(
  key: string,
  ttl: number
): Promise<T | null> {
  // GETEX atomically gets the value AND resets the TTL
  const value = await redis.getex(key, "EX", ttl);
  if (value === null) return null;
  return JSON.parse(value) as T;
}

If you are on an older Redis version, use a pipeline (not atomic, but close enough for session-style data):

typescript

async function getWithSlidingExpiryCompat<T>(
  key: string,
  ttl: number
): Promise<T | null> {
  const pipeline = redis.pipeline();
  pipeline.get(key);
  pipeline.expire(key, ttl);
  const results = await pipeline.exec();
 
  if (!results || !results[0] || results[0][1] === null) return null;
  return JSON.parse(results[0][1] as string) as T;
}

Cache stampede (thundering herd)

TTL jitter helps with mass expirations, but it does not solve the single-key stampede: a popular key expires and hundreds of concurrent requests all try to regenerate it at once.

Imagine caching the homepage feed with a 5-minute TTL. It expires. Fifty concurrent requests see the cache miss. All fifty hit the database with the same expensive query. You have just DDoSed yourself.

Solution 1: mutex lock

Only one request regenerates the cache. All the others wait.

typescript

async function cacheAsideWithMutex<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl: number = 3600
): Promise<T | null> {
  const cacheKey = `cache:${key}`;
  const lockKey = `lock:${key}`;
 
  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
 
  // Try to acquire lock (NX = only if not exists, EX = auto-expire)
  const acquired = await redis.set(lockKey, "1", "EX", 10, "NX");
 
  if (acquired) {
    try {
      // We got the lock — fetch and cache
      const result = await fetcher();
      await redis.set(
        cacheKey,
        JSON.stringify(result),
        "EX",
        ttlWithJitter(ttl)
      );
      return result;
    } finally {
      // Release lock
      await redis.del(lockKey);
    }
  }
 
  // Another request holds the lock — wait and retry
  await sleep(100);
 
  const retried = await redis.get(cacheKey);
  if (retried !== null) {
    return JSON.parse(retried) as T;
  }
 
  // Still no cache — fall through to database
  // (this handles the case where the lock holder failed)
  return fetcher();
}
 
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

There is a subtle race condition in the lock release above. If the lock holder takes longer than 10 seconds (the lock TTL), the lock expires, another request acquires it, and then the first request deletes the second request's lock. The proper fix is to use a unique token:

typescript

import { randomUUID } from "crypto";
 
async function acquireLock(
  lockKey: string,
  ttl: number
): Promise<string | null> {
  const token = randomUUID();
  const acquired = await redis.set(lockKey, token, "EX", ttl, "NX");
  return acquired ? token : null;
}
 
async function releaseLock(lockKey: string, token: string): Promise<boolean> {
  // Lua script ensures atomic check-and-delete
  const script = `
    if redis.call("get", KEYS[1]) == ARGV[1] then
      return redis.call("del", KEYS[1])
    else
      return 0
    end
  `;
  const result = await redis.eval(script, 1, lockKey, token);
  return result === 1;
}

This is essentially a simplified Redlock. For a single Redis instance, it is enough. For Redis Cluster or Sentinel setups, look at the full Redlock algorithm. Honestly, though, for cache stampede prevention this simple version works fine.
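To see why the token check matters, here is a small in-memory simulation of the race. No Redis is involved: FakeLockStore is a hypothetical stand-in, and TTL expiry is simulated by deleting the key by hand.

```typescript
// In-memory stand-in for the lock key (no Redis, no timers)
class FakeLockStore {
  private locks = new Map<string, string>();

  // Mirrors SET key token NX: succeeds only if the key is absent
  acquire(key: string, token: string): boolean {
    if (this.locks.has(key)) return false;
    this.locks.set(key, token);
    return true;
  }

  // What Redis does when the lock TTL elapses
  expire(key: string): void {
    this.locks.delete(key);
  }

  // Naive release: unconditional delete
  releaseNaive(key: string): void {
    this.locks.delete(key);
  }

  // Token-checked release: delete only if we still own the lock
  releaseWithToken(key: string, token: string): boolean {
    if (this.locks.get(key) !== token) return false;
    this.locks.delete(key);
    return true;
  }

  holder(key: string): string | undefined {
    return this.locks.get(key);
  }
}

function simulate(useToken: boolean): string | undefined {
  const store = new FakeLockStore();
  store.acquire("lock:feed", "request-A"); // A takes the lock
  store.expire("lock:feed");               // A is slow; the lock TTL elapses
  store.acquire("lock:feed", "request-B"); // B takes the lock
  // A finally finishes and releases what it believes is its lock
  if (useToken) store.releaseWithToken("lock:feed", "request-A");
  else store.releaseNaive("lock:feed");
  return store.holder("lock:feed");
}

console.log(simulate(false)); // naive release: B's lock is gone
console.log(simulate(true));  // token check: B still holds the lock
```

With the naive release a third request could now acquire the lock while B is still regenerating, which is exactly the stampede the lock was meant to prevent.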

Solution 2: probabilistic early expiration

This is my favorite approach. Instead of waiting for the key to expire, regenerate it at random shortly before expiry. The idea comes from a paper by Vattani, Chierichetti, and Lowenstein.

typescript

interface CachedValue<T> {
  data: T;
  cachedAt: number;
  ttl: number;
}
 
async function cacheWithEarlyExpiration<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl: number = 3600
): Promise<T> {
  const cacheKey = `cache:${key}`;
  const cached = await redis.get(cacheKey);
 
  if (cached !== null) {
    const entry = JSON.parse(cached) as CachedValue<T>;
    const age = (Date.now() - entry.cachedAt) / 1000;
    const remaining = entry.ttl - age;
 
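    // XFetch algorithm: probabilistically regenerate as expiry approaches.
    // -beta * Math.log(Math.random()) is a positive, exponentially distributed
    // head start in seconds. Regeneration fires when it exceeds the time
    // remaining, which becomes more likely as `remaining` shrinks.
    // (The paper also scales this by the recompute time; here it is implicitly 1s.)
    const beta = 1; // tuning parameter; values above 1 favor earlier regeneration
    const shouldRegenerate =
      remaining + beta * Math.log(Math.random()) <= 0;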
 
    if (!shouldRegenerate) {
      return entry.data;
    }
 
    // Fall through to regenerate
    console.log(`[Cache] Early regeneration triggered for ${key}`);
  }
 
  const data = await fetcher();
  const entry: CachedValue<T> = {
    data,
    cachedAt: Date.now(),
    ttl,
  };
 
  // Set with extra buffer so Redis doesn't expire before we can regenerate
  await redis.set(
    cacheKey,
    JSON.stringify(entry),
    "EX",
    Math.round(ttl * 1.1)
  );
 
  return data;
}

The beauty of this approach: as the key's remaining TTL shrinks, the probability of regeneration rises. With 1,000 concurrent requests, maybe one or two trigger a regeneration while the rest keep serving cached data. No locks, no coordination, no waiting.
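How likely is an early regeneration at a given moment? For the check used above, a request regenerates when -beta × ln(U) exceeds the remaining seconds, with U uniform on (0, 1), which happens with probability exp(-remaining / beta). The sketch below computes that. The delta parameter is my addition, reflecting that the XFetch formulation in the paper also multiplies by the time a recomputation takes; with delta = 1 it matches the code above.

```typescript
// Probability that a single request triggers early regeneration,
// given the seconds left before the logical expiry.
function earlyRegenProbability(
  remainingSeconds: number,
  beta = 1,
  delta = 1
): number {
  if (remainingSeconds <= 0) return 1; // already past expiry
  return Math.exp(-remainingSeconds / (beta * delta));
}

for (const remaining of [60, 10, 5, 1]) {
  console.log(remaining, earlyRegenProbability(remaining).toExponential(2));
}

// With a slow fetcher (say delta = 5s), regeneration starts noticeably earlier
console.log(earlyRegenProbability(10, 1, 5).toFixed(3));
```

With beta = 1 and delta = 1, the probability is about 37% at one second remaining and under 1% at five seconds, so regeneration effectively happens in the last few seconds before expiry. If your fetcher is slower than that, scaling by its duration is worth considering.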

Solution 3: stale-while-revalidate

Serve the stale value while regenerating in the background. This gives the best latency, because no request waits for the fetcher except on a completely cold key.

typescript

async function staleWhileRevalidate<T>(
  key: string,
  fetcher: () => Promise<T>,
  options: {
    freshTtl: number;   // how long the data is "fresh"
    staleTtl: number;   // how long stale data can be served
  }
): Promise<T | null> {
  const cacheKey = `cache:${key}`;
  const metaKey = `meta:${key}`;
 
  const [cached, meta] = await redis.mget(cacheKey, metaKey);
 
  if (cached !== null) {
    const parsedMeta = meta ? JSON.parse(meta) : null;
    const isFresh =
      parsedMeta && Date.now() - parsedMeta.cachedAt < options.freshTtl * 1000;
 
    if (!isFresh) {
      // Data is stale — serve it but trigger background refresh
      revalidateInBackground(key, cacheKey, metaKey, fetcher, options);
    }
 
    return JSON.parse(cached) as T;
  }
 
  // Complete cache miss — must fetch synchronously
  return fetchAndCache(key, cacheKey, metaKey, fetcher, options);
}
 
async function fetchAndCache<T>(
  key: string,
  cacheKey: string,
  metaKey: string,
  fetcher: () => Promise<T>,
  options: { freshTtl: number; staleTtl: number }
): Promise<T> {
  const data = await fetcher();
  const totalTtl = options.freshTtl + options.staleTtl;
 
  const pipeline = redis.pipeline();
  pipeline.set(cacheKey, JSON.stringify(data), "EX", totalTtl);
  pipeline.set(
    metaKey,
    JSON.stringify({ cachedAt: Date.now() }),
    "EX",
    totalTtl
  );
  await pipeline.exec();
 
  return data;
}
 
function revalidateInBackground<T>(
  key: string,
  cacheKey: string,
  metaKey: string,
  fetcher: () => Promise<T>,
  options: { freshTtl: number; staleTtl: number }
): void {
  // Use a lock to prevent multiple background refreshes
  const lockKey = `revalidate_lock:${key}`;
 
  redis
    .set(lockKey, "1", "EX", 30, "NX")
    .then((acquired) => {
      if (!acquired) return;
 
      return fetchAndCache(key, cacheKey, metaKey, fetcher, options)
        .finally(() => redis.del(lockKey));
    })
    .catch((err) => {
      console.error(`[SWR] Background revalidation failed for ${key}:`, err);
    });
}

Usage:

typescript

const user = await staleWhileRevalidate<User>("user:123", fetchUserFromDB, {
  freshTtl: 300,     // fresh for 5 minutes
  staleTtl: 3600,    // then serve stale for up to 1 hour while revalidating
});

I use this pattern for anything user-facing where latency matters more than absolute freshness. Dashboard data, profile pages, product listings: all perfect candidates.
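The three states an entry passes through can be written down as a pure function, which is handy for unit-testing the thresholds without Redis. The names Freshness and classify are hypothetical; the freshTtl/staleTtl split is the same one used in the options above.

```typescript
type Freshness = "fresh" | "stale" | "expired";

// Classifies a cache entry by age. freshTtl and staleTtl are in seconds,
// timestamps are in milliseconds (as returned by Date.now()).
function classify(
  cachedAtMs: number,
  nowMs: number,
  freshTtl: number,
  staleTtl: number
): Freshness {
  const ageSeconds = (nowMs - cachedAtMs) / 1000;
  if (ageSeconds < freshTtl) return "fresh";            // serve as-is
  if (ageSeconds < freshTtl + staleTtl) return "stale"; // serve, refresh in background
  return "expired";                                     // Redis has dropped the key by now
}

const t0 = 0;
console.log(classify(t0, 60000, 300, 3600));   // 1 minute old
console.log(classify(t0, 600000, 300, 3600));  // 10 minutes old
console.log(classify(t0, 4000000, 300, 3600)); // about 67 minutes old
```

Only the "expired" case forces a request to wait for the fetcher.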

Cache invalidation

Phil Karlton was not joking. Invalidation is where caching goes from "easy optimization" to "distributed systems problem".

Simple key-based invalidation

The easy case: when you update a user, delete their cache key.

typescript

async function updateUserAndInvalidate(
  userId: string,
  updates: Partial<User>
): Promise<User> {
  const user = await db.query(
    "UPDATE users SET name = $2 WHERE id = $1 RETURNING *",
    [userId, updates.name]
  );
 
  // Invalidate the cache
  await redis.del(`cache:user:${userId}`);
 
  return user[0];
}

This works until the user's data shows up in other cached results. Maybe it is embedded in a team member list. Maybe it is in a search result. Maybe it is in 14 different cached API responses. Now you have to track which cache keys contain which entities.

Tag-based invalidation

Tag cache entries with the entities they contain, then invalidate by tag.

typescript

async function setWithTags<T>(
  key: string,
  value: T,
  ttl: number,
  tags: string[]
): Promise<void> {
  const pipeline = redis.pipeline();
 
  // Store the value
  pipeline.set(`cache:${key}`, JSON.stringify(value), "EX", ttl);
 
  // Add the key to each tag's set
  for (const tag of tags) {
    pipeline.sadd(`tag:${tag}`, `cache:${key}`);
    pipeline.expire(`tag:${tag}`, ttl + 3600); // Tag sets live longer than values
  }
 
  await pipeline.exec();
}
 
async function invalidateByTag(tag: string): Promise<number> {
  const keys = await redis.smembers(`tag:${tag}`);
 
  if (keys.length === 0) return 0;
 
  const pipeline = redis.pipeline();
  for (const key of keys) {
    pipeline.del(key);
  }
  pipeline.del(`tag:${tag}`);
 
  await pipeline.exec();
  return keys.length;
}

Usage:

typescript

// When caching team data, tag it with all member IDs
const team = await fetchTeam(teamId);
await setWithTags(
  `team:${teamId}`,
  team,
  1800,
  [
    `entity:team:${teamId}`,
    ...team.members.map((m) => `entity:user:${m.id}`),
  ]
);
 
// When user 42 updates their profile, invalidate everything that contains them
await invalidateByTag("entity:user:42");

Event-driven invalidation

For larger systems, use Redis Pub/Sub to broadcast invalidation events:

typescript

// Publisher (in your API service)
async function publishInvalidation(
  entityType: string,
  entityId: string
): Promise<void> {
  await redis.publish(
    "cache:invalidate",
    JSON.stringify({ entityType, entityId, timestamp: Date.now() })
  );
}
 
// Subscriber (in each app instance)
const subscriber = new Redis(/* same config */);
 
subscriber.subscribe("cache:invalidate", (err) => {
  if (err) console.error("[PubSub] Subscribe error:", err);
});
 
subscriber.on("message", async (_channel, message) => {
  const { entityType, entityId } = JSON.parse(message);
  await invalidateByTag(`entity:${entityType}:${entityId}`);
  console.log(`[Cache] Invalidated ${entityType}:${entityId}`);
});

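This is critical in multi-instance deployments. If you have 4 app servers behind a load balancer, an invalidation on server 1 has to reach all of them. Pub/Sub handles that fan-out for you, though delivery is fire-and-forget: an instance that is disconnected when the message is published never sees it, so keep TTLs as a backstop.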
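The broadcast matters most when each instance also keeps an in-process cache in front of Redis: a key deleted in shared Redis is gone for everyone, but a copy held in one server's memory is not. The sketch below models that case with a tiny in-memory bus standing in for the Pub/Sub channel. InvalidationBus and LocalCache are hypothetical names, and no Redis is involved.

```typescript
type Listener = (message: string) => void;

// Stand-in for a Pub/Sub channel: every subscriber receives every message
class InvalidationBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(message: string): void {
    for (const listener of this.listeners) listener(message);
  }
}

// Per-instance in-memory cache that drops entries when told to
class LocalCache {
  private entries = new Map<string, unknown>();

  constructor(bus: InvalidationBus) {
    bus.subscribe((message) => {
      const { entityType, entityId } = JSON.parse(message);
      this.entries.delete(`${entityType}:${entityId}`);
    });
  }

  set(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}

const bus = new InvalidationBus();
const server1 = new LocalCache(bus);
const server2 = new LocalCache(bus);
server1.set("user:42", { name: "old" });
server2.set("user:42", { name: "old" });

// Server 1 handles the update and broadcasts the invalidation
bus.publish(JSON.stringify({ entityType: "user", entityId: "42" }));

console.log(server1.has("user:42"), server2.has("user:42"));
```

Both local copies are dropped after a single publish, without server 1 knowing how many peers exist.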

Pattern-based invalidation (with caution)

Sometimes you need to invalidate every key matching a pattern. Never use KEYS in production. It blocks the Redis server while it scans the entire keyspace. With millions of keys it can block for seconds, an eternity in Redis terms.

Use SCAN instead:

typescript

async function invalidateByPattern(pattern: string): Promise<number> {
  let cursor = "0";
  let deletedCount = 0;
 
  do {
    const [nextCursor, keys] = await redis.scan(
      cursor,
      "MATCH",
      pattern,
      "COUNT",
      100
    );
    cursor = nextCursor;
 
    if (keys.length > 0) {
      await redis.del(...keys);
      deletedCount += keys.length;
    }
  } while (cursor !== "0");
 
  return deletedCount;
}
 
// Invalidate all cached data for a specific team
await invalidateByPattern("cache:team:42:*");

SCAN iterates incrementally, so no single call blocks the server for long. The COUNT option hints at how many keys to examine per call (it is a hint, not a guarantee). For large keyspaces, this is the only safe approach.

That said, pattern-based invalidation is a code smell. If you find yourself scanning frequently, redesign your key structure or use tags. A full SCAN pass is O(N) over the keyspace and is meant for maintenance operations, not hot paths.

Data structures beyond strings

Most developers treat Redis as a key-value store for JSON strings. That is like buying a Swiss Army knife and only using the bottle opener. Redis has rich data structures, and picking the right one can eliminate entire categories of complexity.

Hashes for objects

Instead of serializing a whole object as JSON, store it as a Redis Hash. That lets you read and update individual fields without deserializing the whole object.

typescript

// Store user as a hash
async function setUserHash(user: User): Promise<void> {
  const key = `user:${user.id}`;
  await redis.hset(key, {
    name: user.name,
    email: user.email,
    plan: user.plan,
    updatedAt: Date.now().toString(),
  });
  await redis.expire(key, 1800);
}
 
// Read specific fields
async function getUserPlan(userId: string): Promise<string | null> {
  return redis.hget(`user:${userId}`, "plan");
}
 
// Update a single field
async function upgradeUserPlan(
  userId: string,
  plan: string
): Promise<void> {
  await redis.hset(`user:${userId}`, "plan", plan);
}
 
// Read entire hash as object
async function getUserHash(userId: string): Promise<User | null> {
  const data = await redis.hgetall(`user:${userId}`);
  if (!data || Object.keys(data).length === 0) return null;
 
  return {
    id: userId,
    name: data.name,
    email: data.email,
    plan: data.plan as User["plan"],
  };
}

Hashes are memory-efficient for small objects (Redis uses a compact listpack encoding under the hood, ziplist before Redis 7.0) and avoid the serialization/deserialization overhead. The trade-off: you cannot store nested objects without flattening them first, and every field value comes back as a string.
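If you do need nesting, one option is to flatten into dot-separated field names before writing the hash. This is a minimal sketch under stated assumptions: keys contain no dots, leaves are strings, numbers, or booleans, and the helper names are hypothetical.

```typescript
type Flat = Record<string, string>;

// Flattens nested plain objects into dot-separated hash fields
function flattenForHash(obj: Record<string, unknown>, prefix = ""): Flat {
  const out: Flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const field = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(out, flattenForHash(value as Record<string, unknown>, field));
    } else {
      out[field] = String(value);
    }
  }
  return out;
}

// Rebuilds the nesting; every leaf stays a string, as it would after HGETALL
function unflattenFromHash(flat: Flat): Record<string, unknown> {
  const root: Record<string, unknown> = {};
  for (const [field, value] of Object.entries(flat)) {
    const parts = field.split(".");
    const leaf = parts.pop() as string;
    let node = root;
    for (const part of parts) {
      if (typeof node[part] !== "object" || node[part] === null) node[part] = {};
      node = node[part] as Record<string, unknown>;
    }
    node[leaf] = value;
  }
  return root;
}

const flatUser = flattenForHash({
  name: "Ada",
  preferences: { theme: "dark", notifications: true },
});

console.log(flatUser);
console.log(unflattenFromHash(flatUser));
```

The flat object can be handed to hset in the same object form used above. Note that the boolean comes back as the string "true", so type information has to be restored by the reader.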

Sorted Sets for leaderboards and rate limiting

Sorted Sets are the most underrated data structure in Redis. Every member has a score, and the set is always kept ordered by score. That makes them perfect for leaderboards, rankings, and sliding-window rate limiting.

typescript

// Leaderboard
async function addScore(
  leaderboard: string,
  userId: string,
  score: number
): Promise<void> {
  await redis.zadd(leaderboard, score, userId);
}
 
async function getTopPlayers(
  leaderboard: string,
  count: number = 10
): Promise<Array<{ userId: string; score: number }>> {
  const results = await redis.zrevrange(
    leaderboard,
    0,
    count - 1,
    "WITHSCORES"
  );
 
  const players: Array<{ userId: string; score: number }> = [];
  for (let i = 0; i < results.length; i += 2) {
    players.push({
      userId: results[i],
      score: parseFloat(results[i + 1]),
    });
  }
  return players;
}
 
async function getUserRank(
  leaderboard: string,
  userId: string
): Promise<number | null> {
  const rank = await redis.zrevrank(leaderboard, userId);
  return rank !== null ? rank + 1 : null; // 0-indexed to 1-indexed
}

For sliding-window rate limiting:

typescript

async function slidingWindowRateLimit(
  identifier: string,
  windowMs: number,
  maxRequests: number
): Promise<{ allowed: boolean; remaining: number }> {
  const key = `ratelimit:${identifier}`;
  const now = Date.now();
  const windowStart = now - windowMs;
 
  const pipeline = redis.pipeline();
 
  // Remove entries outside the window
  pipeline.zremrangebyscore(key, 0, windowStart);
 
  // Add current request
  pipeline.zadd(key, now, `${now}:${Math.random()}`);
 
  // Count requests in window
  pipeline.zcard(key);
 
  // Set expiry on the whole key
  pipeline.expire(key, Math.ceil(windowMs / 1000));
 
  const results = await pipeline.exec();
  const count = results?.[2]?.[1] as number;
 
  return {
    allowed: count <= maxRequests,
    remaining: Math.max(0, maxRequests - count),
  };
}

This is more accurate than the fixed-window counter approach, and it does not suffer from the boundary problem where a burst at the end of one window and the start of the next effectively doubles your rate limit.
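The boundary problem is easy to reproduce in memory. The sketch below compares a fixed-window counter with a sliding-window log on the same burst; both functions are simplified, hypothetical models (no Redis), and like the Redis version above the sliding log also records rejected requests.

```typescript
// Fixed window: counts requests per aligned window (0 to 59s, 60 to 119s, ...)
function fixedWindowAllowed(
  timestampsMs: number[],
  windowMs: number,
  max: number
): number {
  const counts = new Map<number, number>();
  let allowed = 0;
  for (const t of timestampsMs) {
    const bucket = Math.floor(t / windowMs);
    const count = (counts.get(bucket) ?? 0) + 1;
    counts.set(bucket, count);
    if (count <= max) allowed++;
  }
  return allowed;
}

// Sliding window log: counts requests in the windowMs before each request
function slidingWindowAllowed(
  timestampsMs: number[],
  windowMs: number,
  max: number
): number {
  const log: number[] = [];
  let allowed = 0;
  for (const t of timestampsMs) {
    // Drop entries that fell out of the window (same bound as ZREMRANGEBYSCORE)
    while (log.length > 0 && log[0] <= t - windowMs) log.shift();
    log.push(t);
    if (log.length <= max) allowed++;
  }
  return allowed;
}

// 10 requests at 0:59 and 10 more at 1:01, with a limit of 10 per minute
const burst = [
  ...Array.from({ length: 10 }, () => 59000),
  ...Array.from({ length: 10 }, () => 61000),
];

console.log(fixedWindowAllowed(burst, 60000, 10));   // all 20 pass
console.log(slidingWindowAllowed(burst, 60000, 10)); // only the first 10 pass
```

The fixed window lets 20 requests through in two seconds because the burst straddles a window boundary; the sliding log holds the line at 10.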

Lists for queues

Redis Lists with LPUSH/BRPOP make excellent lightweight job queues:

typescript

import { randomUUID } from "crypto";
 
interface Job {
  id: string;
  type: string;
  payload: Record<string, unknown>;
  createdAt: number;
}
 
// Producer
async function enqueueJob(
  queue: string,
  type: string,
  payload: Record<string, unknown>
): Promise<string> {
  const job: Job = {
    id: randomUUID(),
    type,
    payload,
    createdAt: Date.now(),
  };
 
  await redis.lpush(`queue:${queue}`, JSON.stringify(job));
  return job.id;
}
 
// Consumer (blocks until a job is available)
async function dequeueJob(
  queue: string,
  timeout: number = 5
): Promise<Job | null> {
  const result = await redis.brpop(`queue:${queue}`, timeout);
  if (!result) return null;
 
  return JSON.parse(result[1]) as Job;
}

For anything more complex than basic queuing (retries, dead-letter queues, priorities, delayed jobs), use BullMQ, which is built on Redis and handles all the edge cases.

Sets for unique tracking

Need to track unique visitors, deduplicate events, or check membership? Sets are O(1) for add, remove, and membership checks.

typescript

// Track unique visitors per day
async function trackVisitor(
  page: string,
  visitorId: string
): Promise<boolean> {
  const key = `visitors:${page}:${new Date().toISOString().split("T")[0]}`;
  const isNew = await redis.sadd(key, visitorId);
 
  // Auto-expire after 48 hours
  await redis.expire(key, 172800);
 
  return isNew === 1; // 1 = new member, 0 = already existed
}
 
// Get unique visitor count
async function getUniqueVisitors(page: string, date: string): Promise<number> {
  return redis.scard(`visitors:${page}:${date}`);
}
 
// Check if user has already performed an action
async function hasUserVoted(pollId: string, userId: string): Promise<boolean> {
  return (await redis.sismember(`votes:${pollId}`, userId)) === 1;
}

For very large sets (millions of members), consider HyperLogLog. It uses at most 12KB of memory regardless of cardinality, at the cost of a standard error of about 0.81%:

typescript

// HyperLogLog for approximate unique counts
async function trackVisitorApprox(
  page: string,
  visitorId: string
): Promise<void> {
  const key = `hll:visitors:${page}:${new Date().toISOString().split("T")[0]}`;
  await redis.pfadd(key, visitorId);
  await redis.expire(key, 172800);
}
 
async function getApproxUniqueVisitors(
  page: string,
  date: string
): Promise<number> {
  return redis.pfcount(`hll:visitors:${page}:${date}`);
}
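
The 12KB and 0.81% figures follow from how Redis sizes a HyperLogLog: 16,384 registers of 6 bits each in its dense encoding. A quick sketch of the arithmetic follows. The one-million-visitor count and the 36-byte ID length are illustrative assumptions, and the Set estimate counts raw payload only, ignoring per-element overhead:

```typescript
// Redis HyperLogLog (dense encoding): 2^14 registers, 6 bits each
const REGISTERS = 2 ** 14;
const BITS_PER_REGISTER = 6;

const hllBytes = (REGISTERS * BITS_PER_REGISTER) / 8;
const standardError = 1.04 / Math.sqrt(REGISTERS);

// Hypothetical comparison: raw payload of a Set holding UUID-sized visitor IDs
const VISITORS = 1_000_000; // assumption for illustration
const BYTES_PER_ID = 36; // assumption: length of a UUID string
const setPayloadBytes = VISITORS * BYTES_PER_ID;

console.log(`HLL size: ${hllBytes} bytes`); // HLL size: 12288 bytes
console.log(`Standard error: ${(standardError * 100).toFixed(2)}%`); // Standard error: 0.81%
console.log(`Set payload alone: ${(setPayloadBytes / 1024 / 1024).toFixed(1)} MB`);
```

If you need exact counts or per-member checks such as SISMEMBER, stay with a Set; HyperLogLog can only answer "roughly how many".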

Serialization: JSON vs MessagePack#

JSON is the default choice for Redis serialization. It is readable, universal, and good enough for most cases. But for high-throughput systems, the serialization/deserialization overhead adds up.

The problem with JSON#

typescript

const user = {
  id: "usr_abc123",
  name: "Ahmet Kousa",
  email: "ahmet@example.com",
  plan: "pro",
  preferences: {
    theme: "dark",
    language: "tr",
    notifications: true,
  },
};
 
// JSON: 149 bytes for this object
const jsonStr = JSON.stringify(user);
console.log(Buffer.byteLength(jsonStr)); // 149
 
// JSON.parse on a hot path: ~0.02ms per call
// At 10,000 requests/sec: 200ms total CPU time per second

MessagePack alternative#

MessagePack is a binary serialization format that is more compact than JSON and usually faster to encode and decode:

bash

npm install msgpackr

typescript

import { pack, unpack } from "msgpackr";
 
// MessagePack: noticeably smaller than the JSON string (roughly 20-25% for this object)
const packed = pack(user);
console.log(packed.length);
 
// Store as Buffer
await redis.set("user:123", packed);
 
// Read as Buffer
const raw = await redis.getBuffer("user:123");
if (raw) {
  const data = unpack(raw);
}

Note the use of getBuffer instead of get. This is critical: get decodes the reply as a UTF-8 string, which corrupts binary data.
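
To see why the string path is unsafe, here is a small sketch that needs no Redis at all. It round-trips a few bytes through a UTF-8 string, which is roughly what happens when a binary reply is decoded as text. The byte values are an arbitrary example, not real MessagePack output:

```typescript
// Bytes that are not valid UTF-8 (0x82 and 0xff cannot start a character)
const original = Buffer.from([0x82, 0xa2, 0x69, 0x64, 0xff]);

// Decoding to a string replaces each invalid sequence with U+FFFD
const asString = original.toString("utf8");
const roundTripped = Buffer.from(asString, "utf8");

console.log(original.equals(roundTripped)); // false
console.log(roundTripped.length > original.length); // true
```

Once the replacement characters are in, the original bytes cannot be recovered, so unpack would fail or return garbage.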

Compression for large values#

For large cached values (API responses with hundreds of items, rendered HTML), add compression:

typescript

import { promisify } from "util";
import { gzip, gunzip } from "zlib";
 
const gzipAsync = promisify(gzip);
const gunzipAsync = promisify(gunzip);
 
async function setCompressed<T>(
  key: string,
  value: T,
  ttl: number
): Promise<void> {
  const json = JSON.stringify(value);
 
  // Only compress if larger than 1KB (compression overhead isn't worth it for small values)
  if (Buffer.byteLength(json) > 1024) {
    const compressed = await gzipAsync(json);
    await redis.set(key, compressed, "EX", ttl);
  } else {
    await redis.set(key, json, "EX", ttl);
  }
}
 
async function getCompressed<T>(key: string): Promise<T | null> {
  const raw = await redis.getBuffer(key);
  if (!raw) return null;
 
  // gzip streams always start with the magic bytes 0x1f 0x8b and JSON text never does,
  // so we can tell the two apart without relying on a failed gunzip
  const isGzip = raw.length > 2 && raw[0] === 0x1f && raw[1] === 0x8b;
  const json = isGzip
    ? (await gunzipAsync(raw)).toString("utf8")
    : raw.toString("utf8");
 
  return JSON.parse(json) as T;
}

In my tests, gzip typically reduces JSON payload size by 70-85%. A 50KB API response becomes about 8KB. This matters when you pay for Redis memory: less memory per key means more keys in the same instance.

The tradeoff: compression adds 1-3ms of CPU time per operation. For most applications that is negligible. For ultra-low-latency paths, skip it.
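
The exact ratio depends heavily on how repetitive the payload is, so it is worth measuring on your own data. A minimal sketch using Node's built-in zlib; the product list is synthetic data invented for illustration, and real payloads will compress differently:

```typescript
import { gzipSync, gunzipSync } from "zlib";

// Synthetic payload: 500 similar objects, like a typical list response
const items = Array.from({ length: 500 }, (_, i) => ({
  id: i,
  name: `Product ${i}`,
  category: "electronics",
  inStock: i % 2 === 0,
}));

const json = JSON.stringify(items);
const compressed = gzipSync(json);

const originalBytes = Buffer.byteLength(json);
const saved = 1 - compressed.length / originalBytes;
console.log(`${originalBytes} bytes -> ${compressed.length} bytes (${(saved * 100).toFixed(0)}% smaller)`);

// Round trip to confirm nothing is lost
const restored = JSON.parse(gunzipSync(compressed).toString("utf8")) as typeof items;
console.log(restored.length); // 500
```

Running the same measurement on a sample of production values tells you whether the CPU cost is buying a meaningful memory saving.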

My recommendation#

Use JSON unless profiling shows it is a bottleneck. The readability and debuggability of JSON in Redis (you can run redis-cli GET key and actually read the value) outweigh MessagePack's performance gain for 95% of applications. Add compression only for values larger than 1KB.

Redis in Next.js#

Next.js has its own caching story (Data Cache, Full Route Cache, and so on), but Redis fills the gaps the built-in caching cannot cover, especially when you need to share the cache across multiple instances or keep it across deploys.

Caching API Route responses#

typescript

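// app/api/products/route.ts
// Assumes `db` (your database client) and `ttlWithJitter` (see the production
// cache module) are imported from your own modules.
import { NextResponse } from "next/server";
import redis from "@/lib/redis";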
 
export async function GET(request: Request) {
  const url = new URL(request.url);
  const category = url.searchParams.get("category") || "all";
  const cacheKey = `api:products:${category}`;
 
  // Check cache
  const cached = await redis.get(cacheKey);
  if (cached) {
    return NextResponse.json(JSON.parse(cached), {
      headers: {
        "X-Cache": "HIT",
        "Cache-Control": "public, s-maxage=60",
      },
    });
  }
 
  // Fetch from database
  const products = await db.products.findMany({
    where: category !== "all" ? { category } : undefined,
    orderBy: { createdAt: "desc" },
    take: 50,
  });
 
  // Cache for 5 minutes with jitter
  await redis.set(
    cacheKey,
    JSON.stringify(products),
    "EX",
    ttlWithJitter(300)
  );
 
  return NextResponse.json(products, {
    headers: {
      "X-Cache": "MISS",
      "Cache-Control": "public, s-maxage=60",
    },
  });
}

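The X-Cache header is invaluable for debugging. When latency spikes, a quick curl -I tells you whether the cache is working.

Session storage#

For stateful applications, Redis-backed sessions in Next.js beat JWTs: a session can be revoked server-side at any moment, while a signed JWT stays valid until it expires unless you add a revocation list.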

typescript

// lib/session.ts
import { randomUUID } from "crypto";
import redis from "./redis";
 
interface Session {
  userId: string;
  role: string;
  createdAt: number;
  data: Record<string, unknown>;
}
 
const SESSION_TTL = 86400; // 24 hours
const SESSION_PREFIX = "session:";
 
export async function createSession(
  userId: string,
  role: string
): Promise<string> {
  const sessionId = randomUUID();
  const session: Session = {
    userId,
    role,
    createdAt: Date.now(),
    data: {},
  };
 
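  // Store the session and record it in the per-user index in one transaction,
  // so destroyAllUserSessions can find it later. Ids of expired sessions linger
  // in the index until then; deleting an already-missing key is a harmless no-op.
  await redis
    .multi()
    .set(`${SESSION_PREFIX}${sessionId}`, JSON.stringify(session), "EX", SESSION_TTL)
    .sadd(`user_sessions:${userId}`, sessionId)
    .exec();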
 
  return sessionId;
}
 
export async function getSession(
  sessionId: string
): Promise<Session | null> {
  const key = `${SESSION_PREFIX}${sessionId}`;
 
  // Use GETEX (Redis 6.2+) to refresh TTL on every access (sliding expiry)
  const raw = await redis.getex(key, "EX", SESSION_TTL);
  if (!raw) return null;
 
  return JSON.parse(raw) as Session;
}
 
export async function destroySession(sessionId: string): Promise<void> {
  await redis.del(`${SESSION_PREFIX}${sessionId}`);
}
 
// Destroy all sessions for a user (useful for "logout everywhere")
export async function destroyAllUserSessions(
  userId: string
): Promise<void> {
  // This requires maintaining a user->sessions index
  const sessionIds = await redis.smembers(`user_sessions:${userId}`);
 
  if (sessionIds.length > 0) {
    const pipeline = redis.pipeline();
    for (const sid of sessionIds) {
      pipeline.del(`${SESSION_PREFIX}${sid}`);
    }
    pipeline.del(`user_sessions:${userId}`);
    await pipeline.exec();
  }
}

Rate limiting middleware#

typescript

// middleware.ts (or a helper used by middleware)
import redis from "@/lib/redis";
 
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number;
}
 
export async function rateLimit(
  identifier: string,
  limit: number = 60,
  windowSeconds: number = 60
): Promise<RateLimitResult> {
  const key = `rate:${identifier}`;
  const now = Math.floor(Date.now() / 1000);
  const windowStart = now - windowSeconds;
 
  // Lua script for atomic rate limiting
  const script = `
    redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, ARGV[1])
    redis.call('ZADD', KEYS[1], ARGV[2], ARGV[3])
    local count = redis.call('ZCARD', KEYS[1])
    redis.call('EXPIRE', KEYS[1], ARGV[4])
    return count
  `;
 
  const count = (await redis.eval(
    script,
    1,
    key,
    windowStart,
    now,
    `${now}:${Math.random()}`,
    windowSeconds
  )) as number;
 
  return {
    allowed: count <= limit,
    remaining: Math.max(0, limit - count),
    resetAt: now + windowSeconds,
  };
}

The Lua script matters here. Without it, the ZREMRANGEBYSCORE + ZADD + ZCARD sequence is not atomic, and under high concurrency the count can be inaccurate. Lua scripts run atomically in Redis: no other command can be interleaved with them.
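
To make the three steps concrete, here is an in-memory model of what the script does as a single unit. It is a teaching sketch, not a replacement for the Redis version: the class name and the sample identifier are made up, and a plain array stands in for the sorted set.

```typescript
// In-memory model of the sliding-window log; one array per key stands in for the sorted set
class SlidingWindowModel {
  private readonly log = new Map<string, number[]>();

  hit(key: string, nowSec: number, limit: number, windowSec: number) {
    const windowStart = nowSec - windowSec;

    // ZREMRANGEBYSCORE key 0 windowStart: drop entries at or before the window start
    const recent = (this.log.get(key) ?? []).filter((t) => t > windowStart);

    // ZADD: record this request (rejected requests are recorded too, as in the script)
    recent.push(nowSec);
    this.log.set(key, recent);

    // ZCARD: count what is left in the window
    const count = recent.length;
    return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
  }
}

const limiter = new SlidingWindowModel();
const results = [0, 1, 2, 3].map((t) => limiter.hit("ip:203.0.113.7", t, 3, 60).allowed);
console.log(results); // [ true, true, true, false ]
console.log(limiter.hit("ip:203.0.113.7", 64, 3, 60).allowed); // true
```

Because rejected requests are also added to the log, a client that keeps hammering the endpoint stays blocked until it backs off for a full window. If that is not what you want, only add the entry when the count is under the limit.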

Distributed locks for Next.js#

When you run multiple Next.js instances and need to make sure only one of them processes a task (such as sending a scheduled email or running a cleanup job):

typescript

// lib/distributed-lock.ts
import { randomUUID } from "crypto";
import redis from "./redis";
 
export async function withLock<T>(
  lockName: string,
  fn: () => Promise<T>,
  options: { ttl?: number; retryDelay?: number; maxRetries?: number } = {}
): Promise<T | null> {
  const { ttl = 30, retryDelay = 200, maxRetries = 10 } = options;
  const token = randomUUID();
  const lockKey = `dlock:${lockName}`;
 
  // Try to acquire lock
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const acquired = await redis.set(lockKey, token, "EX", ttl, "NX");
 
    if (acquired) {
      // Extend the lock periodically so it outlives long-running tasks
      const extendScript = `
        if redis.call("get", KEYS[1]) == ARGV[1] then
          return redis.call("expire", KEYS[1], ARGV[2])
        else
          return 0
        end
      `;
      const extender = setInterval(() => {
        redis.eval(extendScript, 1, lockKey, token, ttl).catch(console.error);
      }, (ttl * 1000) / 3);
 
      try {
        return await fn();
      } finally {
        // Always stop the extender, even when fn() throws
        clearInterval(extender);
 
        // Release lock only if we still own it
        const releaseScript = `
          if redis.call("get", KEYS[1]) == ARGV[1] then
            return redis.call("del", KEYS[1])
          else
            return 0
          end
        `;
        await redis.eval(releaseScript, 1, lockKey, token);
      }
    }
 
    // Wait before retrying
    await new Promise((r) => setTimeout(r, retryDelay));
  }
 
  // Could not acquire lock after all retries
  return null;
}

Usage:

typescript

// In a cron-triggered API route
export async function POST() {
  const result = await withLock("daily-report", async () => {
    // Only one instance runs this
    const report = await generateDailyReport();
    await sendReportEmail(report);
    return report;
  });
 
  if (result === null) {
    return NextResponse.json(
      { message: "Another instance is already processing" },
      { status: 409 }
    );
  }
 
  return NextResponse.json({ success: true });
}

Extending the lock every ttl/3 matters. Without it, if your task takes longer than the lock TTL, the lock expires and another instance grabs it. The extender keeps the lock alive for as long as the task is running, and renewing at a third of the TTL leaves room for a failed renewal or two before the lock lapses.
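
The effect of the extender is easy to check with a small in-memory model and a logical clock. This is only a sketch of the timing: LockModel and the instance names are hypothetical, and it stands in for SET NX EX plus the token-checked EXPIRE script.

```typescript
// In-memory stand-in for SET NX EX plus the token-checked EXPIRE script
class LockModel {
  private lock: { token: string; expiresAt: number } | null = null;

  acquire(token: string, ttl: number, now: number): boolean {
    if (this.lock && this.lock.expiresAt > now) return false; // still held
    this.lock = { token, expiresAt: now + ttl };
    return true;
  }

  extend(token: string, ttl: number, now: number): boolean {
    if (!this.lock || this.lock.expiresAt <= now || this.lock.token !== token) return false;
    this.lock.expiresAt = now + ttl;
    return true;
  }
}

const TTL_SECONDS = 30;

// Without an extender: a task still running at t=40 has lost its lock at t=30
const plain = new LockModel();
plain.acquire("instance-a", TTL_SECONDS, 0);
const stolen = plain.acquire("instance-b", TTL_SECONDS, 40);

// With an extender firing every ttl / 3 seconds while the task runs
const extended = new LockModel();
extended.acquire("instance-a", TTL_SECONDS, 0);
for (let t = TTL_SECONDS / 3; t <= 40; t += TTL_SECONDS / 3) {
  extended.extend("instance-a", TTL_SECONDS, t);
}
const stillExclusive = !extended.acquire("instance-b", TTL_SECONDS, 40);

console.log(stolen); // true
console.log(stillExclusive); // true
```

The token check in extend is what stops a slow instance from renewing a lock that has already passed to someone else.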

Monitoring and debugging#

Redis is fast until it isn't. When problems hit, you need visibility.

Cache hit ratio#

The single most important metric. Track it in your application:

typescript

// lib/cache-metrics.ts
import redis from "./redis";
 
const METRICS_KEY = "metrics:cache";
 
export async function recordCacheHit(): Promise<void> {
  await redis.hincrby(METRICS_KEY, "hits", 1);
}
 
export async function recordCacheMiss(): Promise<void> {
  await redis.hincrby(METRICS_KEY, "misses", 1);
}
 
export async function getCacheStats(): Promise<{
  hits: number;
  misses: number;
  hitRate: number;
}> {
  const stats = await redis.hgetall(METRICS_KEY);
  const hits = parseInt(stats.hits || "0", 10);
  const misses = parseInt(stats.misses || "0", 10);
  const total = hits + misses;
 
  return {
    hits,
    misses,
    hitRate: total > 0 ? hits / total : 0,
  };
}
 
// Reset metrics daily
export async function resetCacheStats(): Promise<void> {
  await redis.del(METRICS_KEY);
}

A healthy cache hit ratio is above 90%. If you are below 80%, either your TTLs are too short, your cache keys are too specific, or your access patterns are more random than you thought.

INFO command#

The INFO command is Redis's built-in health dashboard:

bash

redis-cli INFO memory

# Memory
used_memory:1234567
used_memory_human:1.18M
used_memory_peak:2345678
used_memory_peak_human:2.24M
maxmemory:0
maxmemory_policy:noeviction
mem_fragmentation_ratio:1.23

Key metrics to watch:

used_memory vs maxmemory: Are you approaching the limit?
mem_fragmentation_ratio: Above 1.5 means Redis is using significantly more RSS than logical memory. Consider enabling active defragmentation or scheduling a restart.
evicted_keys (reported under INFO stats): If this is non-zero and you did not intend evictions, you are out of memory.

bash

redis-cli INFO stats

Watch:

keyspace_hits / keyspace_misses: Server-level hit rate
total_commands_processed: Throughput since the server started
instantaneous_ops_per_sec: Current throughput
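
INFO replies are plain "field:value" lines grouped under "# Section" headers, so turning them into a server-level hit rate takes only a few lines. A minimal sketch; parseInfo and serverHitRate are hypothetical helper names, and the sample text uses made-up numbers rather than real server output:

```typescript
// Parse the "field:value" lines of an INFO reply into a flat object
function parseInfo(text: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of text.split(/\r?\n/)) {
    if (line === "" || line.startsWith("#")) continue; // skip blanks and section headers
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    fields[line.slice(0, separator)] = line.slice(separator + 1);
  }
  return fields;
}

// Hit rate as seen by the server; null when there has been no keyspace traffic yet
function serverHitRate(infoStats: string): number | null {
  const info = parseInfo(infoStats);
  const hits = Number(info["keyspace_hits"] ?? 0);
  const misses = Number(info["keyspace_misses"] ?? 0);
  const total = hits + misses;
  return total > 0 ? hits / total : null;
}

// Made-up sample in the shape `redis-cli INFO stats` returns
const sample = "# Stats\r\nkeyspace_hits:900\r\nkeyspace_misses:100\r\nevicted_keys:0\r\n";
console.log(serverHitRate(sample)); // 0.9
```

With ioredis you would pass in the string returned by `redis.info("stats")`. These counters are cumulative since the last restart or CONFIG RESETSTAT, so compare deltas between samples if you want a recent hit rate.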

MONITOR (use with extreme caution)#

MONITOR streams every command executed on the Redis server in real time. It is incredibly useful for debugging and incredibly dangerous in production.

bash

# NEVER leave this running in production
# It adds significant overhead and can log sensitive data
redis-cli MONITOR

1614556800.123456 [0 127.0.0.1:52340] "SET" "cache:user:123" "{\"name\":\"Ahmet\"}" "EX" "1800"
1614556800.234567 [0 127.0.0.1:52340] "GET" "cache:user:456"

I use MONITOR for exactly two things: debugging key naming issues during development, and verifying that a specific code path is actually hitting Redis as expected. Never for more than 30 seconds. Never in production unless you have already exhausted other debugging options.

Keyspace notifications#

Want to know when keys expire or get evicted? Redis can publish events:

bash

# Enable keyevent notifications for expired keys (use "Exe" to also receive evicted events)
redis-cli CONFIG SET notify-keyspace-events Ex

typescript

// Pub/Sub needs a dedicated connection: a subscribed client cannot run regular commands
const subscriber = new Redis(/* config */);
 
// Listen for key expiration events
subscriber.subscribe("__keyevent@0__:expired", (err) => {
  if (err) console.error("Subscribe error:", err);
});
 
subscriber.on("message", (_channel, expiredKey) => {
  console.log(`Key expired: ${expiredKey}`);
 
  // Proactively regenerate important keys
  if (expiredKey.startsWith("cache:homepage")) {
    regenerateHomepageCache().catch(console.error);
  }
});

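This is useful for proactive cache warming: instead of waiting for a user to trigger a cache miss, you regenerate critical entries as soon as they expire. Two caveats: the expired event fires when Redis actually deletes the key, which can lag behind the TTL, and Pub/Sub delivery is fire-and-forget, so events published while the subscriber is disconnected are lost.

Memory analysis#

When Redis memory grows unexpectedly, you need to find which keys consume the most: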

bash

# Scan the keyspace and report the largest key of each data type
redis-cli --bigkeys

# Scanning the entire keyspace to find biggest keys
[00.00%] Biggest string found so far '"cache:search:electronics"' with 524288 bytes
[25.00%] Biggest zset found so far '"leaderboard:global"' with 150000 members
[50.00%] Biggest hash found so far '"session:abc123"' with 45 fields

For a more detailed analysis:

bash

# Memory usage of a specific key (in bytes)
redis-cli MEMORY USAGE "cache:search:electronics"

typescript

// Programmatic memory analysis
async function analyzeMemory(pattern: string): Promise<void> {
  let cursor = "0";
  const stats: Array<{ key: string; bytes: number }> = [];
 
  do {
    const [nextCursor, keys] = await redis.scan(
      cursor,
      "MATCH",
      pattern,
      "COUNT",
      100
    );
    cursor = nextCursor;
 
    for (const key of keys) {
      const bytes = await redis.memory("USAGE", key);
      if (bytes) {
        stats.push({ key, bytes: bytes as number });
      }
    }
  } while (cursor !== "0");
 
  // Sort by size descending
  stats.sort((a, b) => b.bytes - a.bytes);
 
  console.log("Top 20 keys by memory usage:");
  for (const { key, bytes } of stats.slice(0, 20)) {
    const mb = (bytes / 1024 / 1024).toFixed(2);
    console.log(`  ${key}: ${mb} MB`);
  }
}

Eviction policies#

If your Redis instance has a maxmemory limit (it should), configure an eviction policy:

bash

# In redis.conf or via CONFIG SET
maxmemory 512mb
maxmemory-policy allkeys-lru

Available policies:

noeviction: Returns an error on writes when memory is full (the default, and the worst choice for caching)
allkeys-lru: Evicts the least recently used key (a solid general-purpose default for caching)
allkeys-lfu: Evicts the least frequently used key (better when a stable set of hot keys dominates traffic)
volatile-lru: Evicts only keys that have a TTL set (useful if you mix cache and persistent data)
allkeys-random: Random eviction (surprisingly decent, no bookkeeping overhead)

allkeys-lru, as in the config above, is a safe default. For pure caching workloads with a stable hot set, allkeys-lfu is often the better choice: it keeps frequently accessed keys in memory even if they have not been accessed recently.
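
The difference between the two is easiest to see on a tiny example. The sketch below implements exact LRU and LFU over a handful of keys; Redis itself uses sampled approximations of both, so treat this as an illustration of the idea rather than of Redis internals. The key names and access sequence are invented.

```typescript
type Policy = "lru" | "lfu";

// Exact LRU/LFU over a tiny keyspace; returns the keys left in the cache, sorted
function simulate(policy: Policy, capacity: number, accesses: string[]): string[] {
  const lastUsed = new Map<string, number>();
  const useCount = new Map<string, number>();

  accesses.forEach((key, tick) => {
    if (!lastUsed.has(key) && lastUsed.size >= capacity) {
      // Evict the key with the lowest score under the chosen policy
      let victim = "";
      let lowest = Infinity;
      lastUsed.forEach((usedAt, candidate) => {
        const score = policy === "lru" ? usedAt : (useCount.get(candidate) ?? 0);
        if (score < lowest) {
          lowest = score;
          victim = candidate;
        }
      });
      lastUsed.delete(victim);
      useCount.delete(victim);
    }
    lastUsed.set(key, tick);
    useCount.set(key, (useCount.get(key) ?? 0) + 1);
  });

  return Array.from(lastUsed.keys()).sort();
}

// A hot key followed by a burst of one-off keys, with room for two entries
const accesses = ["home", "home", "home", "search:a", "search:b"];
console.log(simulate("lru", 2, accesses)); // [ 'search:a', 'search:b' ]
console.log(simulate("lfu", 2, accesses)); // [ 'home', 'search:b' ]
```

LRU throws away the hot key as soon as two newer keys arrive, while LFU keeps it and sacrifices one of the one-off entries instead.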

Putting it all together: a production cache module#

Here is the complete cache module I use in production, combining everything discussed above:

typescript

// lib/cache.ts
import Redis from "ioredis";
 
const redis = new Redis({
  host: process.env.REDIS_HOST || "127.0.0.1",
  port: Number(process.env.REDIS_PORT) || 6379,
  password: process.env.REDIS_PASSWORD || undefined,
  maxRetriesPerRequest: 3,
  retryStrategy(times) {
    return Math.min(times * 200, 5000);
  },
});
 
// TTL tiers
const TTL = {
  STATIC: 86400,
  MODERATE: 1800,
  VOLATILE: 300,
  EPHEMERAL: 60,
  NOT_FOUND: 120,
} as const;
 
type TTLTier = keyof typeof TTL;
 
function ttlWithJitter(base: number, jitter = 0.1): number {
  const offset = base * jitter * (Math.random() * 2 - 1);
  return Math.max(1, Math.round(base + offset));
}
 
// Core cache-aside with stampede protection
async function get<T>(
  key: string,
  fetcher: () => Promise<T>,
  options: {
    tier?: TTLTier;
    ttl?: number;
    tags?: string[];
    swr?: { freshTtl: number; staleTtl: number };
  } = {}
): Promise<T> {
  const { tier = "MODERATE", tags } = options;
  const baseTtl = options.ttl ?? TTL[tier];
  const cacheKey = `c:${key}`;
 
  // Check cache
  const cached = await redis.get(cacheKey);
 
  if (cached !== null) {
    try {
      const parsed = JSON.parse(cached);
      recordHit();
      return parsed as T;
    } catch {
      await redis.del(cacheKey);
    }
  }
 
  recordMiss();
 
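  // Acquire lock to prevent stampede (unique token so we only release our own lock)
  const lockKey = `lock:${key}`;
  const token = `${Date.now()}:${Math.random().toString(36).slice(2)}`;
  const acquired = await redis.set(lockKey, token, "EX", 10, "NX");
 
  if (!acquired) {
    // Another process is fetching: wait briefly and retry cache
    await new Promise((r) => setTimeout(r, 150));
    const retried = await redis.get(cacheKey);
    if (retried !== null) {
      try {
        return JSON.parse(retried) as T;
      } catch {
        // Corrupt entry: fall through and fetch ourselves
      }
    }
  }
 
  try {
    const result = await fetcher();
    const ttl = ttlWithJitter(baseTtl);
 
    const pipeline = redis.pipeline();
    pipeline.set(cacheKey, JSON.stringify(result), "EX", ttl);
 
    // Store tag associations
    if (tags) {
      for (const tag of tags) {
        pipeline.sadd(`tag:${tag}`, cacheKey);
        pipeline.expire(`tag:${tag}`, ttl + 3600);
      }
    }
 
    await pipeline.exec();
    return result;
  } finally {
    // Release the lock only if this call acquired it and still owns it
    if (acquired) {
      await redis.eval(
        `if redis.call("get", KEYS[1]) == ARGV[1] then return redis.call("del", KEYS[1]) else return 0 end`,
        1,
        lockKey,
        token
      );
    }
  }
}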
 
// Invalidation
async function invalidate(...keys: string[]): Promise<void> {
  if (keys.length === 0) return;
  await redis.del(...keys.map((k) => `c:${k}`));
}
 
async function invalidateByTag(tag: string): Promise<number> {
  const keys = await redis.smembers(`tag:${tag}`);
  if (keys.length === 0) return 0;
 
  const pipeline = redis.pipeline();
  for (const key of keys) {
    pipeline.del(key);
  }
  pipeline.del(`tag:${tag}`);
  await pipeline.exec();
  return keys.length;
}
 
// Metrics
function recordHit(): void {
  redis.hincrby("metrics:cache", "hits", 1).catch(() => {});
}
 
function recordMiss(): void {
  redis.hincrby("metrics:cache", "misses", 1).catch(() => {});
}
 
async function stats(): Promise<{
  hits: number;
  misses: number;
  hitRate: string;
}> {
  const raw = await redis.hgetall("metrics:cache");
  const hits = parseInt(raw.hits || "0", 10);
  const misses = parseInt(raw.misses || "0", 10);
  const total = hits + misses;
 
  return {
    hits,
    misses,
    hitRate: total > 0 ? ((hits / total) * 100).toFixed(1) + "%" : "N/A",
  };
}
 
export const cache = {
  get,
  invalidate,
  invalidateByTag,
  stats,
  redis,
  TTL,
};

Usage across the application:

typescript

import { cache } from "@/lib/cache";
 
// Simple cache-aside
const products = await cache.get("products:featured", fetchFeaturedProducts, {
  tier: "VOLATILE",
  tags: ["entity:products"],
});
 
// With custom TTL
const config = await cache.get("app:config", fetchAppConfig, {
  ttl: 43200, // 12 hours
});
 
// After updating a product
await cache.invalidateByTag("entity:products");
 
// Check health
const metrics = await cache.stats();
console.log(`Cache hit rate: ${metrics.hitRate}`);

Common mistakes I've made (so you don't have to)#

1. Not setting maxmemory. Redis will happily use all available memory until the operating system kills it. Always set a limit.

2. Using KEYS in production. It blocks the server. Use SCAN. I learned this when a KEYS * call from a monitoring script caused 3 seconds of downtime.

3. Caching too aggressively. Not everything needs to be cached. If your database query takes 2ms and is called 10 times a minute, caching adds complexity for negligible benefit.

4. Ignoring serialization costs. I once cached a 2MB JSON blob and wondered why cache reads were slow. Transferring and deserializing it cost more than the database query it was supposed to save.

5. No graceful degradation. When Redis goes down, your app should still work, just more slowly. Wrap every cache call in a try/catch that falls back to the database. Never let a cache failure become a user-visible error.

typescript

async function resilientGet<T>(
  key: string,
  fetcher: () => Promise<T>
): Promise<T> {
  try {
    return await cache.get(key, fetcher);
  } catch (err) {
    console.error(`[Cache] Degraded mode for ${key}:`, err);
    return fetcher(); // Bypass cache entirely
  }
}
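
One subtlety with wrapping the whole call: if the fetcher itself throws inside cache.get, the catch block runs it a second time. A variant that keeps cache errors and fetcher errors apart might look like the sketch below. CacheLike and getOrFetch are hypothetical names, introduced so the example runs without a Redis connection:

```typescript
// Minimal cache contract so the sketch runs without Redis (hypothetical interface)
interface CacheLike {
  read(key: string): Promise<string | null>;
  write(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function getOrFetch<T>(
  cache: CacheLike,
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 300
): Promise<T> {
  try {
    const cached = await cache.read(key);
    if (cached !== null) return JSON.parse(cached) as T;
  } catch (err) {
    console.warn(`[Cache] read failed for ${key}, falling back to source`, err);
  }

  // Fetcher errors propagate to the caller; the fetcher is never retried here
  const value = await fetcher();

  try {
    await cache.write(key, JSON.stringify(value), ttlSeconds);
  } catch (err) {
    console.warn(`[Cache] write failed for ${key}`, err);
  }
  return value;
}

// A cache that always fails, standing in for a Redis outage
const brokenCache: CacheLike = {
  read: async () => { throw new Error("connection refused"); },
  write: async () => { throw new Error("connection refused"); },
};

let fetchCalls = 0;
const pending = getOrFetch(brokenCache, "user:1", async () => {
  fetchCalls += 1;
  return { id: 1 };
});
pending.then((user) => console.log(user.id, fetchCalls)); // 1 1
```

The request still succeeds during the simulated outage, and the source is hit exactly once.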

6. Not monitoring evictions. If Redis is evicting keys, you are either under-provisioned or caching too much. Either way, you need to know.

7. Sharing one Redis instance between caching and persistent data. Use separate instances. Separate logical databases on the same instance are not enough, because maxmemory and the eviction policy apply to the whole instance. If you really must share, use a volatile-* policy and never set a TTL on persistent data. A cache eviction policy that deletes your job queue entries is a bad day for everyone.

Conclusion#

Caching with Redis is not hard, but it is easy to get wrong. Start with cache-aside, add TTL jitter from day one, monitor your hit rate, and resist the temptation to cache everything.

The best caching strategy is the one you can reason about at 3 AM when something breaks. Keep it simple, keep it observable, and remember that every cached value is a lie you told your users about the state of your data. Your job is to keep that lie as small and short-lived as possible.
