The engineering reality behind AI agents in 2026. Tool orchestration, memory systems, planning loops, guardrails, cost control, and the architecture patterns that separate demo agents from production agents.
Every week, someone on Twitter posts a thread about their AI agent that "autonomously" does something impressive. Books a flight. Writes and deploys code. Manages their email. The demo looks magical. The replies are full of fire emojis and "the future is here."
What nobody shows you is the version that ran before the recording. The one that booked three flights to the wrong city. The one that deployed code with a SQL injection vulnerability. The one that replied to the CEO's email with a hallucinated quarterly report. AI agents are easy to demo and brutally hard to ship.
I have spent the last eight months building AI agents for production use — not toy demos, not proof of concepts, but agents that handle real tasks for real users with real consequences when they fail. What follows is everything I have learned about the architecture, the patterns, the failure modes, and the engineering discipline required to build agents that actually work.
This is not a tutorial where we build a chatbot and call it an agent. This is the hard stuff. The stuff that determines whether your agent is a product or a party trick.
Let me start with precision, because the industry has made "agent" meaningless through overuse. Every chatbot, every RAG pipeline, every wrapper around a single API call now calls itself an agent. That is marketing, not engineering.
An AI agent, in the engineering sense, has four properties: autonomy (it chooses its own next step instead of following a fixed script), tool use (it can act on the world through APIs and functions), planning (it decomposes a goal into a sequence of steps), and memory (it carries state forward across those steps).
A chatbot has none of these. A RAG pipeline has maybe one (tool use, if you squint). A function-calling LLM has tool use but not autonomy or planning. A true agent has all four, and the interaction between these properties is where all the complexity lives.
Here is the fundamental architecture:
┌─────────────────────────────────────────────┐
│                 AGENT LOOP                  │
│                                             │
│  ┌──────────┐   ┌──────────┐   ┌────────┐   │
│  │ OBSERVE  │──▶│  THINK   │──▶│  ACT   │   │
│  │          │   │  (Plan)  │   │ (Tool) │   │
│  └──────────┘   └──────────┘   └────────┘   │
│       ▲                             │       │
│       │         ┌──────────┐        │       │
│       └─────────│  MEMORY  │◀───────┘       │
│                 └──────────┘                │
│                                             │
│  ┌───────────────────────────────────────┐  │
│  │              GUARDRAILS               │  │
│  │   Budget · Safety · Scope · Timeout   │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
Observe → Think → Act → update Memory → repeat. That is the loop. Every agent, from the simplest to the most sophisticated, is a variation on this loop. The differences are in how you implement each step and — critically — how you constrain the loop so it does not run forever, spend all your money, or do something catastrophic.
Let me build this from the ground up. Here is the simplest possible agent loop in TypeScript:
// src/lib/agent/core.ts
import { generateText } from "ai";
import { getModel } from "@/lib/ai/provider";

// Typed error so callers can distinguish agent failures from other exceptions
class AgentError extends Error {}
type Tool = {
name: string;
description: string;
parameters: Record<string, unknown>;
execute: (params: Record<string, unknown>) => Promise<string>;
};
type AgentConfig = {
systemPrompt: string;
tools: Tool[];
maxIterations: number;
model: string;
};
type AgentStep = {
thought: string;
toolName: string | null;
toolInput: Record<string, unknown> | null;
toolOutput: string | null;
timestamp: number;
};
export async function runAgent(
config: AgentConfig,
userGoal: string
): Promise<{ result: string; steps: AgentStep[] }> {
const steps: AgentStep[] = [];
let iterations = 0;
const messages: Array<{ role: string; content: string }> = [
{ role: "system", content: config.systemPrompt },
{ role: "user", content: userGoal },
];
while (iterations < config.maxIterations) {
iterations++;
const response = await generateText({
model: getModel(config.model),
messages,
tools: formatToolsForLLM(config.tools),
toolChoice: "auto",
});
// If the model chose not to use a tool, it is done
if (!response.toolCalls || response.toolCalls.length === 0) {
return {
result: response.text,
steps,
};
}
// Execute the tool call
for (const toolCall of response.toolCalls) {
const tool = config.tools.find((t) => t.name === toolCall.toolName);
if (!tool) {
throw new AgentError(`Unknown tool: ${toolCall.toolName}`);
}
const output = await tool.execute(toolCall.args);
steps.push({
thought: response.text || "",
toolName: toolCall.toolName,
toolInput: toolCall.args,
toolOutput: output,
timestamp: Date.now(),
});
// Feed the result back into the conversation
messages.push({
role: "assistant",
content: response.text || "",
});
messages.push({
role: "tool",
content: output,
});
}
}
throw new AgentError(
`Agent exceeded maximum iterations (${config.maxIterations})`
);
}

This is maybe 80 lines. It works for demos. And it will absolutely destroy you in production. Let me count the ways.
Problem 1: No cost tracking. Every iteration burns tokens. A runaway agent can spend $50 in minutes if the context window keeps growing.
Problem 2: No timeout. If a tool call hangs, the entire agent hangs. If the LLM takes 30 seconds per iteration and you allow 20 iterations, that is 10 minutes of wall-clock time with no feedback.
Problem 3: No guardrails. The agent can call any tool with any parameters. If one of your tools is execute_sql, congratulations — you have given an LLM unrestricted database access.
Problem 4: No error recovery. One failed tool call crashes the entire run. Production systems fail constantly, and an agent that cannot handle a 503 from an API is useless.
Problem 5: Unbounded memory. The message array grows with every iteration. After 10 iterations with verbose tool outputs, you might exceed the context window.
Let me fix all of these.
// src/lib/agent/production-loop.ts
import { generateText } from "ai";
import { getModel } from "@/lib/ai/provider";
import { estimateTokens } from "@/lib/ai/tokens";
type AgentOptions = {
config: AgentConfig;
userGoal: string;
budget: TokenBudget;
timeout: number;
onStep?: (step: AgentStep) => void;
abortSignal?: AbortSignal;
};
type TokenBudget = {
maxInputTokens: number;
maxOutputTokens: number;
maxTotalCost: number; // in dollars
costPerInputToken: number;
costPerOutputToken: number;
};
type AgentResult = {
result: string;
steps: AgentStep[];
totalTokens: { input: number; output: number };
totalCost: number;
durationMs: number;
};
export async function runProductionAgent(
options: AgentOptions
): Promise<AgentResult> {
const { config, userGoal, budget, timeout, onStep, abortSignal } =
options;
const startTime = Date.now();
const steps: AgentStep[] = [];
let totalInputTokens = 0;
let totalOutputTokens = 0;
let iterations = 0;
// Working memory — we will compress this as needed
const messages: Message[] = [
{ role: "system", content: config.systemPrompt },
{ role: "user", content: userGoal },
];
while (iterations < config.maxIterations) {
// Guard 1: Timeout
const elapsed = Date.now() - startTime;
if (elapsed > timeout) {
return finalize("Agent timed out. Here is what I accomplished so far.",
steps, totalInputTokens, totalOutputTokens, budget, startTime);
}
// Guard 2: Abort signal (user cancelled)
if (abortSignal?.aborted) {
return finalize("Agent was cancelled.",
steps, totalInputTokens, totalOutputTokens, budget, startTime);
}
// Guard 3: Budget
const currentCost = calculateCost(
totalInputTokens, totalOutputTokens, budget
);
if (currentCost >= budget.maxTotalCost) {
return finalize(
"Budget limit reached. Here is what I accomplished so far.",
steps, totalInputTokens, totalOutputTokens, budget, startTime
);
}
// Guard 4: Context window management
const contextTokens = estimateTokens(
messages.map((m) => m.content).join("\n")
);
if (contextTokens > budget.maxInputTokens * 0.8) {
// Compress older messages to stay within budget
compressMessages(messages, budget.maxInputTokens * 0.5);
}
iterations++;
try {
const response = await generateTextWithTimeout(
{
model: getModel(config.model),
messages,
tools: formatToolsForLLM(config.tools),
toolChoice: iterations >= config.maxIterations - 1
? "none" // Force a final answer on last iteration
: "auto",
abortSignal,
},
30_000 // 30s timeout per LLM call
);
totalInputTokens += response.usage?.promptTokens ?? 0;
totalOutputTokens += response.usage?.completionTokens ?? 0;
// No tool call — agent is done
if (!response.toolCalls || response.toolCalls.length === 0) {
return finalize(response.text, steps,
totalInputTokens, totalOutputTokens, budget, startTime);
}
// Execute tool calls with individual error handling
for (const toolCall of response.toolCalls) {
const step = await executeToolSafely(
toolCall, config.tools, config.toolTimeout ?? 15_000
);
steps.push(step);
onStep?.(step);
messages.push({
role: "assistant",
content: response.text || `Using tool: ${toolCall.toolName}`,
});
messages.push({
role: "tool",
content: step.toolOutput ?? "Tool returned no output.",
});
}
} catch (error) {
// LLM call itself failed — inject guidance and retry instead of crashing
if (iterations < config.maxIterations) {
messages.push({
role: "system",
content: `Previous step failed: ${(error as Error).message}.
Please try a different approach.`,
});
continue;
}
throw error;
}
}
return finalize(
"Reached maximum iterations. Here is my best answer based on the work done.",
steps, totalInputTokens, totalOutputTokens, budget, startTime
);
}
async function executeToolSafely(
toolCall: ToolCall,
tools: Tool[],
timeoutMs: number
): Promise<AgentStep> {
const tool = tools.find((t) => t.name === toolCall.toolName);
const timestamp = Date.now();
if (!tool) {
return {
thought: "",
toolName: toolCall.toolName,
toolInput: toolCall.args,
toolOutput: `Error: Unknown tool "${toolCall.toolName}". Available tools: ${tools.map((t) => t.name).join(", ")}`,
timestamp,
status: "error",
};
}
try {
const output = await withTimeout(
tool.execute(toolCall.args),
timeoutMs
);
return {
thought: "",
toolName: toolCall.toolName,
toolInput: toolCall.args,
toolOutput: truncateOutput(output, 4000),
timestamp,
status: "success",
};
} catch (error) {
return {
thought: "",
toolName: toolCall.toolName,
toolInput: toolCall.args,
toolOutput: `Tool execution failed: ${(error as Error).message}`,
timestamp,
status: "error",
};
}
}

Let me walk through what changed and why each change matters.
The demo agent throws an exception when something goes wrong. The production agent returns a partial result. This distinction is fundamental. Users would rather get 70% of an answer than an error message. When the budget runs out, when the timeout hits, when the maximum iterations are reached — the agent summarizes what it accomplished so far and returns.
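The guards above lean on helpers the listing leaves undefined: calculateCost and withTimeout (finalize is assumed to assemble the AgentResult from the partial work). Minimal sketches, assuming the budget's prices are per-token dollar rates:

```typescript
type TokenBudget = {
  maxInputTokens: number;
  maxOutputTokens: number;
  maxTotalCost: number; // in dollars
  costPerInputToken: number;
  costPerOutputToken: number;
};

// Dollar cost so far, given per-token prices from the budget
function calculateCost(
  inputTokens: number,
  outputTokens: number,
  budget: TokenBudget
): number {
  return (
    inputTokens * budget.costPerInputToken +
    outputTokens * budget.costPerOutputToken
  );
}

// Reject if the wrapped promise does not settle within `ms` milliseconds
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms}ms`)),
      ms
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}
```

Note that withTimeout rejects but does not cancel the underlying work; pair it with an AbortSignal when the tool supports true cancellation.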
This is the piece that trips up almost everyone who builds agents. Each iteration adds messages to the context — the LLM's response, the tool output, sometimes multiple tool calls. After 8-10 iterations, a verbose agent can easily fill a 128K context window. And before you hit the hard limit, quality degrades because LLMs lose focus in very long contexts.
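The estimateTokens import used by the context guard is assumed to be a cheap heuristic rather than a real tokenizer; a common sketch is roughly four characters per token for English text:

```typescript
// Rough heuristic: ~4 characters per token for English prose.
// Good enough for guardrail thresholds; use a real tokenizer
// (e.g. the model vendor's) when accuracy matters for billing.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Because the guard fires at 80% of the input budget, a rough overestimate is fine here; precision only matters when you bill by the token.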
The compressMessages function summarizes older conversation turns:
function compressMessages(messages: Message[], targetTokens: number) {
// Keep the system prompt and the original user goal (first 2 messages)
// Keep the last 4 messages (most recent context)
// Summarize everything in between
const preserved = [
...messages.slice(0, 2), // system + user goal
...messages.slice(-4), // recent context
];
const middle = messages.slice(2, -4);
if (middle.length === 0) return;
const summary = summarizeSteps(middle);
messages.length = 0;
messages.push(
preserved[0], // system prompt
preserved[1], // user goal
{
role: "system",
content: `Summary of previous steps:\n${summary}`,
},
...preserved.slice(2), // recent messages
);
}

This is a simple compression strategy. More sophisticated approaches include using a smaller, cheaper model to generate the summary, or maintaining a separate "working memory" document that the agent can read and write to. I will cover advanced memory architectures later.
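The summarizeSteps helper is also left undefined above. The cheapest sketch is extractive: keep each message's role and a truncated slice of its content (swapping in a call to a small, cheap model is the natural upgrade):

```typescript
type Msg = { role: string; content: string };

// Extractive fallback summary: role plus the first 200 characters of each message
function summarizeSteps(messages: Msg[]): string {
  return messages
    .map((m) => {
      const body =
        m.content.length > 200 ? m.content.slice(0, 200) + " [...]" : m.content;
      return `${m.role}: ${body}`;
    })
    .join("\n");
}
```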
Notice this line:
toolChoice: iterations >= config.maxIterations - 1
? "none" // Force a final answer on last iteration
    : "auto",

On the second-to-last iteration, we tell the LLM it cannot use tools — it must give a final answer. Without this, agents have a maddening tendency to request "one more" tool call right at the iteration limit, leading to an abrupt cutoff with no useful output. Forcing a final answer ensures the user always gets a coherent response.
The tools you give an agent are more important than the model you use. I am not exaggerating. A mediocre model with well-designed tools will outperform a frontier model with poorly designed tools, every single time. Here is why.
When an LLM decides which tool to use, it reads the tool's name, description, and parameter schema. That is all the information it has. If your tool descriptions are vague, the LLM will misuse them. If your parameter schemas are ambiguous, the LLM will pass wrong arguments. If you have too many tools, the LLM will be paralyzed by choice.
Rule 1: Fewer tools, better descriptions. Every tool you add increases the decision space the LLM must navigate. I have found that agents work best with 5-12 tools. Below 5, the agent is too limited. Above 12, tool selection accuracy drops noticeably.
If you need more capabilities, compose them. Instead of 20 individual database tools, create one query_database tool with a well-typed operation parameter:
const databaseTool: Tool = {
name: "query_database",
description: `Execute a read-only database query. Supports SELECT
queries only. Returns up to 100 rows as JSON. Use this to look up
user data, check records, or gather information needed to complete
the task. NEVER use this for INSERT, UPDATE, or DELETE operations.`,
parameters: {
type: "object",
properties: {
query: {
type: "string",
description: "A PostgreSQL SELECT query. Must start with SELECT.",
},
explanation: {
type: "string",
description: "Brief explanation of why this query is needed.",
},
},
required: ["query", "explanation"],
},
execute: async (params) => {
const query = params.query as string;
// CRITICAL: validate the query before executing
if (!isReadOnlyQuery(query)) {
return "Error: Only SELECT queries are allowed.";
}
const result = await db.query(sanitizeQuery(query));
return JSON.stringify(result.rows.slice(0, 100));
},
};

Notice the explanation parameter. This is not used by the tool — it is a prompt engineering trick. Requiring the LLM to explain why it needs the query forces it to think about the query before writing it, which dramatically improves query quality. Think of it as chain-of-thought for tool calls.
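The isReadOnlyQuery guard is doing a lot of work in that tool, and it is left undefined above. A minimal sketch, which should be defense-in-depth on top of a read-only database role rather than the only line of defense:

```typescript
// Keywords that indicate a write or DDL statement anywhere in the query
const WRITE_KEYWORDS =
  /\b(insert|update|delete|drop|alter|truncate|create|grant|revoke)\b/i;

function isReadOnlyQuery(query: string): boolean {
  const trimmed = query.trim();
  // Must start with SELECT
  if (!/^select\b/i.test(trimmed)) return false;
  // No stacked statements ("SELECT 1; DROP TABLE users")
  const semicolon = trimmed.indexOf(";");
  if (semicolon !== -1 && semicolon !== trimmed.length - 1) return false;
  // No write keywords anywhere (catches subqueries and CTE tricks)
  if (WRITE_KEYWORDS.test(trimmed)) return false;
  return true;
}
```

Connecting the tool through a database role that only has SELECT grants makes this validator a second layer rather than the security boundary.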
Rule 2: Tool outputs should be concise and structured. The LLM reads the tool output and decides what to do next. If your tool returns a 50KB JSON blob, the LLM will either miss important details or consume your token budget on irrelevant data.
// Bad: returning raw API response
execute: async (params) => {
const response = await fetch(`https://api.example.com/users/${params.id}`);
return JSON.stringify(await response.json());
// Returns 200 fields, 15KB of data the agent doesn't need
};
// Good: returning exactly what the agent needs
execute: async (params) => {
const response = await fetch(`https://api.example.com/users/${params.id}`);
const user = await response.json();
return JSON.stringify({
id: user.id,
name: user.name,
email: user.email,
plan: user.subscription?.plan ?? "free",
createdAt: user.created_at,
});
// Returns 5 fields, ~200 bytes
};

Rule 3: Make errors informative. When a tool fails, the error message is the only information the LLM has to recover. "Error: 500" is useless. "Error: User with ID 12345 not found. The user may have been deleted or the ID may be incorrect." gives the agent enough context to try a different approach.
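A concrete way to apply this rule is to map raw HTTP failures to messages that tell the agent what to do next. The helper below is hypothetical; the status-to-advice mapping is the point:

```typescript
// Translate an HTTP failure into an error message the LLM can act on
function describeHttpError(status: number, resource: string): string {
  switch (status) {
    case 404:
      return `Error: ${resource} not found. It may have been deleted or the ID may be incorrect.`;
    case 429:
      return `Error: rate limited while fetching ${resource}. Wait before retrying, or batch your requests.`;
    case 500:
    case 502:
    case 503:
      return `Error: the service behind ${resource} is temporarily unavailable. Retry once, then try a different approach.`;
    default:
      return `Error: request for ${resource} failed with HTTP ${status}.`;
  }
}
```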
Rule 4: Sandbox everything. Every tool call is an LLM deciding to take an action. LLMs hallucinate. They misunderstand. They get creative in ways you did not anticipate. Every tool must validate its inputs, constrain its effects, and make destructive operations impossible or at least reversible.
// A safe file-reading tool
const readFileTool: Tool = {
name: "read_file",
description: "Read the contents of a file. Only files within the project directory can be read.",
parameters: {
type: "object",
properties: {
path: {
type: "string",
description: "Relative path from the project root.",
},
},
required: ["path"],
},
execute: async (params) => {
const filePath = params.path as string;
// Prevent path traversal
const resolved = path.resolve(PROJECT_ROOT, filePath);
// Compare against PROJECT_ROOT + separator so "/app" does not match "/app-evil"
if (resolved !== PROJECT_ROOT && !resolved.startsWith(PROJECT_ROOT + path.sep)) {
return "Error: Access denied. Path is outside the project directory.";
}
// Prevent reading sensitive files
const blocked = [".env", ".env.local", "credentials", "secrets"];
if (blocked.some((b) => filePath.includes(b))) {
return "Error: Cannot read sensitive files.";
}
// Size limit
const stats = await fs.stat(resolved);
if (stats.size > 100_000) {
return "Error: File is too large (>100KB). Try reading a specific section.";
}
const content = await fs.readFile(resolved, "utf-8");
return content;
},
};

Every check in that function exists because I have personally seen an LLM try the thing it prevents. Path traversal? Yes. Reading .env files? Constantly. Trying to read a 50MB binary file? More than once.
Memory is what separates a useful agent from one that forgets what it was doing three steps ago. And memory architecture is where the largest gap between research and production exists.
Layer 1: Conversation context (short-term). This is the message array. It persists for the duration of a single agent run. It is limited by the context window. We have already covered how to compress it.
Layer 2: Working memory (session). Information the agent has gathered during this session that should persist across context compressions. Think of it as the agent's scratchpad.
type WorkingMemory = {
goal: string;
plan: string[];
completedSteps: string[];
keyFindings: Record<string, string>;
constraints: string[];
};
// The agent has tools to read and update working memory
const updateMemoryTool: Tool = {
name: "update_memory",
description: `Update your working memory with important findings or
plan changes. Use this to remember key information across steps.
Your working memory persists even when older conversation messages
are summarized.`,
parameters: {
type: "object",
properties: {
key: {
type: "string",
description: "A short key for this memory (e.g., 'user_email', 'api_status').",
},
value: {
type: "string",
description: "The information to remember.",
},
},
required: ["key", "value"],
},
execute: async (params) => {
workingMemory.keyFindings[params.key as string] = params.value as string;
return `Remembered: ${params.key} = ${params.value}`;
},
};

The working memory is injected into the system prompt at each iteration, so the agent always has access to its accumulated knowledge even after the conversation history is compressed.
Layer 3: Persistent memory (cross-session). Information that should survive beyond a single agent run. User preferences, past interaction summaries, learned facts. This requires a database.
// src/lib/agent/persistent-memory.ts
import { db } from "@/lib/db";
export async function storeMemory(
agentId: string,
userId: string,
memory: {
key: string;
value: string;
category: "preference" | "fact" | "interaction" | "correction";
importance: number; // 0.0 to 1.0
}
) {
await db.agentMemory.upsert({
where: {
agentId_userId_key: {
agentId,
userId,
key: memory.key,
},
},
update: {
value: memory.value,
importance: memory.importance,
updatedAt: new Date(),
accessCount: { increment: 1 },
},
create: {
agentId,
userId,
...memory,
accessCount: 1,
},
});
}
export async function retrieveMemories(
agentId: string,
userId: string,
options: { category?: string; minImportance?: number; limit?: number }
) {
return db.agentMemory.findMany({
where: {
agentId,
userId,
...(options.category && { category: options.category }),
...(options.minImportance && {
importance: { gte: options.minImportance },
}),
},
orderBy: [
{ importance: "desc" },
{ accessCount: "desc" },
{ updatedAt: "desc" },
],
take: options.limit ?? 20,
});
};

The importance score is crucial. Not all memories are equally valuable, and you have limited space in the system prompt to inject historical context. An agent that remembers everything equally remembers nothing usefully.
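One way to act on that limited space is a greedy selection, sketched here under an assumed ~4-characters-per-token estimate: take memories in importance order until the prompt budget runs out.

```typescript
type StoredMemory = { key: string; value: string; importance: number };

// Greedy knapsack: highest-importance memories first, until the token budget is spent
function selectMemories(
  memories: StoredMemory[],
  maxTokens: number
): StoredMemory[] {
  const sorted = [...memories].sort((a, b) => b.importance - a.importance);
  const chosen: StoredMemory[] = [];
  let used = 0;
  for (const memory of sorted) {
    const cost = Math.ceil((memory.key.length + memory.value.length) / 4);
    if (used + cost > maxTokens) continue; // skip, but keep trying smaller ones
    used += cost;
    chosen.push(memory);
  }
  return chosen;
}
```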
Before each agent run, retrieve relevant memories and inject them:
function buildSystemPrompt(
basePrompt: string,
workingMemory: WorkingMemory,
persistentMemories: AgentMemory[]
): string {
const memorySection = persistentMemories.length > 0
? `\n\n## What you know about this user:\n${persistentMemories
.map((m) => `- [${m.category}] ${m.key}: ${m.value}`)
.join("\n")}`
: "";
const workingSection = Object.keys(workingMemory.keyFindings).length > 0
? `\n\n## Current working memory:\n${Object.entries(
workingMemory.keyFindings
)
.map(([k, v]) => `- ${k}: ${v}`)
.join("\n")}`
: "";
const planSection = workingMemory.plan.length > 0
? `\n\n## Current plan:\n${workingMemory.plan
.map((step, i) => {
const done = workingMemory.completedSteps.includes(step);
return `${done ? "✓" : `${i + 1}.`} ${step}`;
})
.join("\n")}`
: "";
return basePrompt + memorySection + workingSection + planSection;
}

Without planning, an agent is a random walk through tool calls. It might solve the problem. It might go in circles. It might solve a completely different problem than the one it was given. Planning transforms this chaos into directed action.
The most common agent planning pattern is ReAct (Reason + Act): the LLM thinks about what to do, takes an action, observes the result, and repeats. It is simple and works for straightforward tasks.
But ReAct fails on complex, multi-step problems. The agent thinks one step at a time and cannot anticipate downstream consequences. It is like navigating a city by only looking at the next intersection — you might reach your destination, but you will take a lot of wrong turns.
For complex tasks, I use a two-phase approach: first plan, then execute.
async function planAndExecute(
config: AgentConfig,
userGoal: string,
budget: TokenBudget
): Promise<AgentResult> {
// Phase 1: Generate a plan
const planResponse = await generateText({
model: getModel(config.planningModel ?? config.model),
messages: [
{
role: "system",
content: `You are a planning agent. Given a user's goal and
available tools, create a step-by-step plan to accomplish the
goal. Each step should specify which tool to use and what
information is needed. Output the plan as a JSON array of
steps.
Available tools: ${config.tools.map((t) =>
`${t.name}: ${t.description}`
).join("\n")}`,
},
{ role: "user", content: userGoal },
],
});
const plan = parsePlan(planResponse.text);
// Phase 2: Execute the plan, adapting as needed
return executeWithPlan(config, plan, userGoal, budget);
}
async function executeWithPlan(
config: AgentConfig,
plan: PlanStep[],
originalGoal: string,
budget: TokenBudget
): Promise<AgentResult> {
const steps: AgentStep[] = [];
const results: Record<string, string> = {};
for (let i = 0; i < plan.length; i++) {
const planStep = plan[i];
// Check if plan needs replanning based on results so far
if (planStep.dependsOn) {
const missingDependencies = planStep.dependsOn.filter(
(dep) => !(dep in results)
);
if (missingDependencies.length > 0) {
// A dependency failed — ask the LLM to replan
const newPlan = await replan(
config, originalGoal, plan, results, i
);
plan.splice(i, plan.length - i, ...newPlan);
continue;
}
}
const step = await executePlanStep(
config, planStep, results, budget
);
steps.push(step);
if (step.status === "success") {
results[planStep.id] = step.toolOutput!;
} else {
// Step failed — decide whether to retry, skip, or replan
const recovery = await decideRecovery(
config, planStep, step, results
);
if (recovery.action === "replan") {
const newPlan = await replan(
config, originalGoal, plan, results, i
);
plan.splice(i, plan.length - i, ...newPlan);
} else if (recovery.action === "skip") {
results[planStep.id] = "SKIPPED";
}
// "retry" falls through to the next iteration naturally
}
}
// Phase 3: Synthesize a final answer from all results
return synthesizeAnswer(config, originalGoal, steps, results);
}

The key innovation here is replanning. When a step fails or produces unexpected results, the agent does not blindly continue with the original plan. It pauses, assesses the new situation, and creates a revised plan. This is how humans work — we adjust our approach when things go differently than expected.
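The parsePlan helper and the PlanStep shape are assumed in the code above; a minimal sketch has to tolerate models wrapping JSON in markdown fences or surrounding prose:

```typescript
type PlanStep = {
  id: string;
  description: string;
  tool: string;
  dependsOn?: string[];
};

// Pull the first JSON array out of the model's response, fences and prose included
function parsePlan(text: string): PlanStep[] {
  const match = text.match(/\[[\s\S]*\]/);
  if (!match) {
    throw new Error("Planner did not return a JSON array.");
  }
  const parsed = JSON.parse(match[0]) as PlanStep[];
  // Drop malformed steps rather than failing the whole plan
  return parsed.filter(
    (step) => typeof step.id === "string" && typeof step.tool === "string"
  );
}
```

The greedy match grabs everything from the first `[` to the last `]`, which survives fenced output; where the provider offers structured output, use that instead and delete this parser.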
One pattern I have found remarkably effective is having the agent review its own work before returning a final answer:
async function synthesizeWithReview(
config: AgentConfig,
goal: string,
steps: AgentStep[],
results: Record<string, string>
): Promise<string> {
// Generate initial answer
const draft = await generateText({
model: getModel(config.model),
messages: [
{
role: "system",
content: "Synthesize a final answer from the work done.",
},
{
role: "user",
content: `Goal: ${goal}\n\nResults:\n${JSON.stringify(results, null, 2)}`,
},
],
});
// Self-review
const review = await generateText({
model: getModel(config.model),
messages: [
{
role: "system",
content: `Review this answer for accuracy, completeness, and
whether it actually addresses the user's goal. If you find
issues, provide a corrected version. If it is good, respond
with "APPROVED:" followed by the answer.`,
},
{
role: "user",
content: `Goal: ${goal}\n\nDraft answer: ${draft.text}`,
},
],
});
if (review.text.startsWith("APPROVED:")) {
return review.text.replace("APPROVED:", "").trim();
}
return review.text; // Return the corrected version
}

Yes, this costs an extra LLM call. It is worth it. In my testing, self-review catches errors in roughly 15-20% of agent outputs. That is a significant quality improvement for a marginal cost increase.
Guardrails are not a nice-to-have. They are the difference between an agent that helps users and an agent that becomes a headline. I have spent more engineering time on guardrails than on the agent loop itself, and I do not regret a single hour.
Before the agent even starts, validate the user's request:
async function validateInput(
userGoal: string,
config: AgentConfig
): Promise<{ valid: boolean; reason?: string }> {
// Length check
if (userGoal.length > 10_000) {
return { valid: false, reason: "Input too long." };
}
// Content policy check using a fast, cheap model
const check = await generateText({
model: getModel("fast"),
messages: [
{
role: "system",
content: `Classify this request as SAFE or UNSAFE. A request is
UNSAFE if it asks the agent to: access other users' data,
perform destructive operations, bypass security controls,
generate harmful content, or do anything outside the scope of
${config.scope}. Respond with only "SAFE" or "UNSAFE: reason".`,
},
{ role: "user", content: userGoal },
],
maxTokens: 50,
});
if (check.text.startsWith("UNSAFE")) {
return {
valid: false,
reason: check.text.replace("UNSAFE:", "").trim(),
};
}
return { valid: true };
}

After the agent finishes, validate its output before returning it to the user:
async function validateOutput(
output: string,
userGoal: string,
config: AgentConfig
): Promise<{ safe: boolean; sanitized: string }> {
// Check for PII leakage
const piiPatterns = [
/\b\d{3}-\d{2}-\d{4}\b/, // SSN
/\b\d{16}\b/, // Credit card
/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/i, // Email (check if it belongs to the user)
];
let sanitized = output;
for (const pattern of piiPatterns) {
sanitized = sanitized.replace(pattern, "[REDACTED]");
}
// Check for hallucinated confidence
const dangerousPhrases = [
"I can confirm",
"I have verified",
"This is guaranteed",
"I am certain",
];
for (const phrase of dangerousPhrases) {
if (sanitized.toLowerCase().includes(phrase.toLowerCase())) {
sanitized = sanitized.replace(
new RegExp(phrase, "gi"),
"Based on the available information"
);
}
}
return { safe: true, sanitized };
}

That dangerousPhrases check might seem extreme, but I added it after an agent told a user "I have verified your payment was processed" based on a misread API response. The payment had not been processed. The user lost money because they trusted the agent's confident language. Agents should never express certainty about real-world state.
Per-user, per-session, and per-day cost caps:
type AgentLimits = {
maxRequestsPerMinute: number;
maxRequestsPerDay: number;
maxCostPerSession: number; // dollars
maxCostPerDay: number; // dollars
maxConcurrentSessions: number;
};
const DEFAULT_LIMITS: AgentLimits = {
maxRequestsPerMinute: 5,
maxRequestsPerDay: 100,
maxCostPerSession: 0.50, // 50 cents per session
maxCostPerDay: 5.00, // $5 per day per user
maxConcurrentSessions: 2,
};

At $0.50 per session and a $5 daily cap, a single user can only cost you $5/day even if they hammer the agent all day. This is the kind of math that keeps your startup alive.
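Enforcing maxRequestsPerMinute can be sketched with an in-memory sliding window. This is a hypothetical helper with a pared-down limits type; production traffic spread across multiple servers needs a shared store such as Redis:

```typescript
type Limits = { maxRequestsPerMinute: number };

// userId -> timestamps (ms) of requests inside the current window
const requestLog = new Map<string, number[]>();

function allowRequest(
  userId: string,
  limits: Limits,
  now = Date.now()
): boolean {
  // Keep only requests from the last 60 seconds
  const recent = (requestLog.get(userId) ?? []).filter((t) => now - t < 60_000);
  if (recent.length >= limits.maxRequestsPerMinute) {
    requestLog.set(userId, recent);
    return false; // over the limit; do not record the rejected attempt
  }
  recent.push(now);
  requestLog.set(userId, recent);
  return true;
}
```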
Some tasks are too complex for a single agent. They require different expertise, different tool sets, or different levels of caution. This is where multi-agent systems come in.
The pattern that works best in production is manager-worker orchestration:
// src/lib/agent/orchestrator.ts
type SpecializedAgent = {
id: string;
name: string;
description: string;
config: AgentConfig;
budget: TokenBudget;
};
async function orchestrate(
manager: AgentConfig,
workers: SpecializedAgent[],
userGoal: string,
totalBudget: TokenBudget
): Promise<AgentResult> {
// The manager decides which workers to invoke and in what order
const plan = await generateText({
model: getModel(manager.model),
messages: [
{
role: "system",
content: `You are a manager agent. You have access to these
specialized workers:
${workers.map((w) => `- ${w.name}: ${w.description}`).join("\n")}
Decompose the user's goal into subtasks and assign each to
the most appropriate worker. Output a JSON array of assignments.`,
},
{ role: "user", content: userGoal },
],
});
const assignments = parseAssignments(plan.text);
const results: Record<string, AgentResult> = {};
// Execute assignments (some may be parallel, some sequential)
for (const group of assignments.executionGroups) {
const groupResults = await Promise.all(
group.map(async (assignment) => {
const worker = workers.find((w) => w.id === assignment.workerId);
if (!worker) throw new Error(`Unknown worker: ${assignment.workerId}`);
const result = await runProductionAgent({
config: worker.config,
userGoal: assignment.subtask,
budget: worker.budget,
timeout: 60_000,
});
return { id: assignment.id, result };
})
);
for (const { id, result } of groupResults) {
results[id] = result;
}
}
// Manager synthesizes the final answer
return synthesizeFromWorkers(manager, userGoal, results);
};

The manager agent has no tools — it only plans and delegates. The worker agents have specialized tool sets and focused system prompts. This separation is important because it keeps each agent's context clean and its tool set small.
A real example: I built an analysis agent that processes user requests. The manager receives the goal, then delegates to three specialists: a data collector that queries databases and APIs, an analyzer that crunches the collected data for patterns, and a writer that turns the findings into a report.
Each worker is tightly scoped. The data collector cannot write reports. The writer cannot query databases. The analyzer cannot collect new data. This minimizes the blast radius of any single agent going wrong.
Let me catalog the failure modes I have seen in production, because they will save you weeks of debugging.
Symptom: Agent keeps calling the same tool with the same parameters, getting the same result, and trying again.
Cause: The LLM is not receiving enough signal that its approach is not working.
Fix: Track tool call history and inject a message when a loop is detected:
function detectLoop(steps: AgentStep[]): boolean {
if (steps.length < 3) return false;
const recent = steps.slice(-3);
const allSame = recent.every(
(s) =>
s.toolName === recent[0].toolName &&
JSON.stringify(s.toolInput) === JSON.stringify(recent[0].toolInput)
);
return allSame;
}
// In the agent loop:
if (detectLoop(steps)) {
messages.push({
role: "system",
content: `WARNING: You have called the same tool with the same
parameters 3 times in a row. The result will not change. Try a
completely different approach or conclude with what you know.`,
});
}

Symptom: Agent gets slower and slower, then starts producing incoherent responses.
Cause: A tool returned a massive response (e.g., a full database dump) and the context window is now dominated by tool output, drowning out the original goal and instructions.
Fix: The truncateOutput function I showed earlier, plus context monitoring:
function truncateOutput(output: string, maxChars: number): string {
if (output.length <= maxChars) return output;
const truncated = output.slice(0, maxChars);
return (
truncated +
`\n\n[OUTPUT TRUNCATED: ${output.length} chars total, showing first ${maxChars}. ` +
`If you need more detail, try a more specific query.]`
);
}

Symptom: Agent returns a beautifully formatted, completely wrong answer with absolute confidence.
Cause: The LLM filled gaps in its knowledge with plausible-sounding fabrications. This happens most often when a tool returns an error and the agent "improvises" instead of admitting the failure.
Fix: The output validation layer I showed earlier, plus explicit instructions in the system prompt:
If a tool call fails or returns unexpected results, say so. Never
fabricate data. Never claim to have verified something you did not
verify. If you cannot complete a task, explain what you tried and
what blocked you. An honest "I could not do this" is always better
than a confident wrong answer.
Symptom: User asks agent to check their last order status. Agent decides to also update their shipping address, send a confirmation email, and apply a discount code.
Cause: The LLM's instinct to be helpful overrides its constraint to only do what was asked.
Fix: Explicit scope constraints in the system prompt, plus a scope checker:
async function checkScope(
toolCall: ToolCall,
originalGoal: string,
config: AgentConfig
): Promise<boolean> {
const check = await generateText({
model: getModel("fast"),
messages: [
{
role: "system",
content: `Is this tool call within the scope of the user's
original request? Respond "YES" or "NO: reason".
Original request: "${originalGoal}"
Tool: ${toolCall.toolName}
Arguments: ${JSON.stringify(toolCall.args)}`,
},
],
maxTokens: 30,
});
// Trim first: models sometimes emit leading whitespace or a newline
return check.text.trim().startsWith("YES");
}

This adds latency and cost. Use it judiciously — for high-risk tools (database writes, emails, payments), always. For read-only tools, it is usually unnecessary.
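One way to encode that judgment is a per-tool risk flag that gates the LLM scope check, so read-only tools skip it entirely. A sketch with illustrative tool names (not from the article's real tool set):

```typescript
// Per-tool risk classification. Only "write" tools pay for the LLM scope check.
// Tool names here are hypothetical examples.
const TOOL_RISK: Record<string, "read" | "write"> = {
  getOrderStatus: "read",
  searchProducts: "read",
  updateShippingAddress: "write",
  sendEmail: "write",
  applyDiscount: "write",
};

function needsScopeCheck(toolName: string): boolean {
  // Unknown tools default to high-risk: fail closed, not open.
  return (TOOL_RISK[toolName] ?? "write") === "write";
}
```

In the agent loop, you would call the scope checker only when `needsScopeCheck(toolCall.toolName)` is true, keeping the common read-only path fast and cheap.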
Let me get concrete about money, because this is where agent projects either survive or die.
A typical agent run on Claude Sonnet or GPT-4o looks like this:
| Component | Input Tokens | Output Tokens | Cost |
|---|---|---|---|
| System prompt | ~2,000 | — | $0.006 |
| Planning step | ~3,000 | ~500 | $0.012 |
| 5 tool iterations | ~25,000 | ~2,500 | $0.085 |
| Self-review | ~4,000 | ~800 | $0.016 |
| Total | ~34,000 | ~3,800 | ~$0.12 |
At $0.12 per run, if each user averages 10 agent runs per day, each active user costs you $1.20/day or about $36/month. If you are charging $29.99/month, you are losing money on every active user. This is a real problem that has killed several AI agent startups.
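That back-of-envelope math is worth making explicit, because it is the first thing to check for any pricing change. A minimal helper using the figures above:

```typescript
// Unit economics: monthly margin per active user given cost per agent run,
// average runs per day, and the subscription price.
function monthlyMarginPerUser(
  costPerRun: number,
  runsPerDay: number,
  monthlyPrice: number,
  daysPerMonth = 30
): number {
  return monthlyPrice - costPerRun * runsPerDay * daysPerMonth;
}
```

At the article's numbers, $0.12 × 10 runs × 30 days is $36 of cost against $29.99 of revenue, so the margin is negative; capping cost per run (or runs per day) is what flips the sign.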
Use smaller models where possible. Input validation, scope checking, memory retrieval — these tasks do not need a frontier model. Use the cheapest model that works.
Cache aggressively. If two users ask the same question, the second run should cost near-zero. Cache at the tool result level, the plan level, and the final answer level.
Minimize context size. Every token in the context is paid for on every iteration. Shorter system prompts, compressed histories, and truncated tool outputs compound savings across iterations.
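To know when context is becoming the problem, it helps to track it continuously. A rough monitor, assuming the common 4-characters-per-token heuristic (an approximation, not an exact tokenizer) and a simple message shape:

```typescript
// Rough context monitor: estimates token usage and flags when the window
// is filling up or tool output is crowding out everything else.
type Message = { role: string; content: string };

function estimateTokens(text: string): number {
  // ~4 chars per token is a coarse heuristic; use a real tokenizer for billing.
  return Math.ceil(text.length / 4);
}

function contextReport(messages: Message[], maxTokens: number) {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const toolTokens = messages
    .filter((m) => m.role === "tool")
    .reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return {
    total,
    utilization: total / maxTokens,
    toolShare: total > 0 ? toolTokens / total : 0,
    // Summarize or drop old history before the window fills completely.
    shouldCompress: total / maxTokens > 0.8,
  };
}
```

When `toolShare` climbs past roughly half the window, that is usually the context-bloat symptom described earlier, and the right fix is tighter truncation rather than a bigger model.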
Set hard per-session budgets. The math above assumes 5 iterations. But without a cap, some sessions will have 15 iterations and cost $0.40+. Hard caps prevent tail costs from destroying your unit economics.
const PRICING_TIERS = {
free: {
maxAgentRunsPerDay: 5,
budgetPerRun: 0.05, // $0.05 — ~2 iterations with a cheap model
model: "fast",
},
pro: {
maxAgentRunsPerDay: 50,
budgetPerRun: 0.25, // $0.25 — ~5 iterations with a good model
model: "quality",
},
business: {
maxAgentRunsPerDay: 200,
budgetPerRun: 1.00, // $1.00 — complex multi-step tasks
model: "quality",
},
};

You cannot improve what you cannot measure, and measuring agent quality is harder than measuring a traditional ML model. There is no single metric. Here is the framework I use.
1. Task Completion Rate. Did the agent successfully complete the user's goal? This requires defining "success" per task type, which is itself non-trivial. I use a combination of automated checks (did the tool calls succeed?) and LLM-as-judge evaluation (did the output address the user's goal?).
2. Step Efficiency. How many iterations did the agent take versus the theoretical minimum? An agent that takes 12 steps to accomplish a 3-step task is wasting tokens and user time.
3. Tool Accuracy. Did the agent choose the right tool on the first try? Measured as the percentage of tool calls that contributed to the final result versus total tool calls.
4. Cost Per Successful Task. Total spend divided by successful completions. This is the metric your finance team cares about.
5. User Satisfaction. The only metric that ultimately matters. Measured via explicit feedback (thumbs up/down) and implicit signals (did the user retry, did they accomplish their goal, did they come back).
// src/lib/agent/eval.ts
type EvalResult = {
taskCompleted: boolean;
stepsUsed: number;
theoreticalMinSteps: number;
efficiency: number;
toolAccuracy: number;
totalCost: number;
durationMs: number;
};
async function evaluateAgentRun(
run: AgentResult,
expectedOutcome: string
): Promise<EvalResult> {
// Use LLM-as-judge to check task completion
const judgment = await generateText({
model: getModel("quality"),
messages: [
{
role: "system",
content: `You are evaluating an AI agent's output. Did the
agent successfully accomplish the goal? Respond with a JSON
object: { "completed": boolean, "reason": string }`,
},
{
role: "user",
content: `Goal: ${expectedOutcome}\nAgent output: ${run.result}`,
},
],
});
// Note: guard this parse in production; judge models sometimes wrap JSON in code fences
const parsed = JSON.parse(judgment.text);
// Calculate tool accuracy
const usefulSteps = run.steps.filter((s) => s.status === "success");
const toolAccuracy = usefulSteps.length / Math.max(run.steps.length, 1);
return {
taskCompleted: parsed.completed,
stepsUsed: run.steps.length,
theoreticalMinSteps: estimateMinSteps(run),
efficiency: estimateMinSteps(run) / Math.max(run.steps.length, 1),
toolAccuracy,
totalCost: run.totalCost,
durationMs: run.durationMs,
};
}

Run these evaluations on a suite of test cases every time you change the agent's system prompt, tools, or model. Agent behavior is non-deterministic, so run each test case multiple times and look at distributions, not single results.
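A sketch of that distribution view, aggregating repeated runs over a few fields of the `EvalResult` shape above (the summary fields chosen here, like p95 cost, are one reasonable selection, not the article's canonical set):

```typescript
// Aggregate repeated eval runs into a distribution summary, since any single
// agent run is non-deterministic. Assumes a non-empty results array.
type RunStats = { taskCompleted: boolean; stepsUsed: number; totalCost: number };

type EvalSummary = {
  completionRate: number;
  meanSteps: number;
  p95CostUsd: number;
};

function summarize(results: RunStats[]): EvalSummary {
  const n = results.length;
  const completed = results.filter((r) => r.taskCompleted).length;
  const costs = results.map((r) => r.totalCost).sort((a, b) => a - b);
  const p95Index = Math.min(n - 1, Math.floor(0.95 * n));
  return {
    completionRate: completed / n,
    meanSteps: results.reduce((s, r) => s + r.stepsUsed, 0) / n,
    p95CostUsd: costs[p95Index], // tail cost matters more than the mean
  };
}
```

Comparing `completionRate` and `p95CostUsd` before and after a prompt change tells you far more than eyeballing one lucky run.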
Eight months of building production agents has taught me several things I wish I knew at the start:
Start with the tools, not the agent. Define and test your tools independently before connecting them to an agent loop. A tool that returns inconsistent results will make your agent inconsistent regardless of how good your loop is.
Use the cheapest model that works. I started with frontier models for everything and burned through budget. Most agent tasks — planning, tool selection, output formatting — work fine with smaller models. Save the expensive models for the final synthesis step.
Log everything. Every LLM call, every tool execution, every token count, every latency measurement. You will need this data to debug issues, optimize costs, and evaluate quality. Build comprehensive logging from day one.
Invest in guardrails early. The first time your agent does something unexpected in production, you will wish you had spent the extra week on safety checks. And "unexpected" is not a hypothetical — it is a certainty.
Talk to users before building. The agent I built in my head was much more complex than what users actually needed. Start with the simplest possible agent that does one thing well, then expand based on actual user feedback.
We are in the very early innings of AI agents. The models are getting better, the tools are getting standardized, and the patterns are solidifying. Here is what I expect in the next year:
Tool calling will become a first-class primitive. Today, tools are bolted onto chat APIs. Soon, they will be a core part of how we interact with models. Anthropic's MCP (Model Context Protocol) is leading this — it standardizes how agents discover and use tools, and it will be the USB of AI agents.
Agents will compose with other agents. The multi-agent pattern I described will become the default, with standardized protocols for inter-agent communication. Your agent will be able to call my agent as a tool, and vice versa.
Evaluation will mature. We will have better benchmarks, better metrics, and better tooling for evaluating agent quality. The current state — manual testing and vibes-based assessment — is not sustainable.
Cost will drop dramatically. Model costs have fallen 10x in the past 18 months and will continue falling. Tasks that are uneconomical today will be dirt cheap by 2027. Build the architecture now, even if the unit economics are tight.
The teams that are building agent infrastructure today — the tool systems, the memory layers, the evaluation pipelines, the guardrails — will have a massive advantage when the cost curve makes agents economically viable for every use case. The foundation you lay now is the product you ship later.
Build carefully. Test relentlessly. Ship with guardrails. And remember: the goal is not to build the most sophisticated agent. The goal is to build the agent that most reliably helps your users accomplish their goals. Reliability beats sophistication, every single time.