OpenAI Agents SDK vs Claude Agent SDK: Building Production Agents in TypeScript

Why TypeScript teams are comparing OpenAI Agents SDK and Claude Agent SDK in 2026

The agent SDK race has become one of the more consequential infrastructure bets for backend and platform engineers. Two opinionated, batteries-included libraries now dominate the conversation: OpenAI's Agents SDK (@openai/agents on npm, the TypeScript counterpart to the original Python SDK) and Anthropic's Claude Agent SDK (@anthropic-ai/claude-agent-sdk, the same engine that powers Claude Code, exposed as a programmable interface). Both are production-grade. Both ship strong ecosystems. The choice is less about raw model benchmarks and more about which orchestration model matches how your team designs systems.

Behind a Node.js API, the decision shows up in code review immediately: handoffs versus query() delegation, guardrails versus hooks, MCP as a property versus MCP as architecture. These are system design questions, not marketing slides.

This article focuses on building production agents in TypeScript. It synthesizes ground-level comparisons from practitioner write-ups—including the detailed breakdown at Developers Digest (https://www.developersdigest.tech/blog/openai-agents-sdk-vs-claude-agent-sdk)—and translates them into patterns you can ship behind Express, Fastify, or serverless handlers. The goal is not to crown a winner. It is to help you match SDK primitives to workload shape before you commit a quarter of platform work to the wrong abstraction.

What each SDK actually is

The OpenAI Agents SDK is a framework for defining agents, wiring tools, routing between agents via handoffs, and applying guardrails at input and output boundaries. It appeared first in Python; the TypeScript package follows the same concepts. Its conceptual model is explicit: an Agent object holds instructions, a model name, tools, and potential handoffs to other agents. The run() function (or the run() method on a Runner instance) drives the loop. Zod schemas define tool parameters and structured output types. The library is MIT-licensed and exposes a Model interface so you can point at non-OpenAI backends in theory, but defaults and hosted tools are tightly coupled to OpenAI infrastructure.

The Claude Agent SDK is different in character. It is the Claude Code engine made programmable. Where OpenAI asks you to construct agents declaratively, Claude's primary interface is query(): you describe a task, declare which tools are allowed, and the agent loop runs. The SDK ships built-in tools—Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch, Monitor, AskUserQuestion—that execute in-process as MCP servers. Subagents are AgentDefinition objects invoked by the orchestrating agent through the Agent tool. The TypeScript package bundles its own Claude Code binary; you do not need a separate CLI install for production.

Both SDKs sit above raw API clients. Single model calls belong in @anthropic-ai/sdk or the Responses API; autonomous loops with tools and safety contracts belong in SDK territory.

Orchestration: handoffs versus query() and subagents

This is where the two SDKs diverge most visibly—and where TypeScript ergonomics either accelerate or fight your team's mental model.

OpenAI uses handoffs as its primary multi-agent primitive. A handoff is a tool exposed to the model. If you hand off to an agent named Refund Agent, the model sees a tool called transfer_to_refund_agent. The model decides when to invoke it, transfers conversation state, and control shifts. The receiving agent takes over; the original agent steps back. That pattern suits conversation routing: billing escalations, tiered support, or specialist ownership of the next leg of a thread.

Claude uses subagents instead. You define named agents with AgentDefinition (description, system prompt, allowed tools) and include Agent in the parent's allowedTools. The orchestrating model spawns a subagent, delegates a task, and receives the result back in context. The parent retains control throughout. Subagent messages carry a parent_tool_use_id field for tracing. The mental model is closer to function calls that happen to be autonomous agents—delegation, not transfer of ownership.

Neither model is strictly superior. Handoffs fit specialist ownership of the next conversation leg; subagents fit coordinators that delegate and synthesize. The SDK you pick determines which pattern is native.
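The difference in control flow can be made concrete without either SDK. The following is a minimal sketch in plain TypeScript, not either library's API: ToyAgent, runWithHandoffs, and runWithDelegation are hypothetical names, and each "model turn" is replaced by a deterministic function.

```typescript
// Toy model of the two control-flow styles. No SDK involved; all names are hypothetical.
type Step =
  | { kind: 'answer'; text: string }
  | { kind: 'handoff'; to: string };

interface ToyAgent {
  name: string;
  // Stands in for a model turn: either answer or ask to transfer.
  step(input: string): Step;
}

// Handoff style: ownership moves, and whoever holds the thread last produces the answer.
function runWithHandoffs(
  agents: Record<string, ToyAgent>,
  start: string,
  input: string,
  maxHops = 5,
): { answeredBy: string; text: string; path: string[] } {
  let current = start;
  const path = [start];
  for (let hop = 0; hop <= maxHops; hop++) {
    const agent = agents[current];
    if (!agent) throw new Error(`Unknown agent: ${current}`);
    const step = agent.step(input);
    if (step.kind === 'answer') return { answeredBy: current, text: step.text, path };
    current = step.to;
    path.push(current);
  }
  throw new Error('Too many handoffs');
}

// Delegation style: the parent calls a child, gets a result back, and answers itself.
function runWithDelegation(
  parent: string,
  child: ToyAgent,
  input: string,
): { answeredBy: string; text: string } {
  const step = child.step(input);
  const childResult = step.kind === 'answer' ? step.text : '(child tried to hand off)';
  return { answeredBy: parent, text: `Summary of child result: ${childResult}` };
}

const triage: ToyAgent = { name: 'triage', step: () => ({ kind: 'handoff', to: 'refund' }) };
const refund: ToyAgent = {
  name: 'refund',
  step: (orderId) => ({ kind: 'answer', text: `Refund approved for ${orderId}` }),
};

const handedOff = runWithHandoffs({ triage, refund }, 'triage', 'ORD-4421');
const delegated = runWithDelegation('coordinator', refund, 'ORD-4421');
```

In the handoff run the refund agent ends up owning the reply; in the delegation run the coordinator stays the author and only folds the child's result into its own answer.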

OpenAI handoffs in TypeScript

OpenAI lets you customize handoffs with handoff(): set inputType for structured payloads on transfer, onHandoff callbacks, inputFilter to control what history the new agent sees, and isEnabled predicates to conditionally expose routes. Alternatively, compose agents as tools via agent.asTool(), where the subordinate returns a result to the caller instead of taking over the conversation. That hybrid is useful when you want specialist reasoning without losing the coordinator's thread.

import { Agent, handoff, run } from '@openai/agents';
import { z } from 'zod';

// Hypothetical audit sink; replace with your own logger.
const auditLog = {
  write: async (entry: Record<string, unknown>) => console.log(JSON.stringify(entry)),
};

const refundAgent = new Agent({
  name: 'Refund Agent',
  instructions: 'Process refund requests within policy limits.',
  tools: [/* billing tools */],
});

const triageAgent = new Agent({
  name: 'Triage Agent',
  instructions: 'Classify support tickets and route specialists.',
  handoffs: [
    handoff(refundAgent, {
      // Structured payload the model must supply when it transfers.
      inputType: z.object({ orderId: z.string(), reason: z.string() }),
      // Runs when the model invokes the transfer tool.
      onHandoff: async (ctx, input) => {
        await auditLog.write({ event: 'handoff', target: 'refund', ...input });
      },
    }),
  ],
});

const result = await run(triageAgent, 'Customer wants a refund for order ORD-4421');
console.log(result.finalOutput);

Routing is model-driven, so log which transfer tool fired and what payload crossed the boundary. Without that record, a misrouted conversation is hard to reconstruct after the fact.
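One way to make transfer logging systematic is to derive the expected tool name for each handoff target and match observed tool calls against it. This is a minimal sketch under an assumption: it follows the transfer_to_ naming described earlier and lower-cases the agent name, which may not match the SDK's own normalization exactly. Both helpers are hypothetical and not part of @openai/agents, so compare their output with the tool names you actually observe in traces.

```typescript
// Hypothetical helpers for audit logging; not part of @openai/agents.
function expectedTransferToolName(agentName: string): string {
  // Assumed convention: lower-case, runs of non-alphanumerics collapsed to one underscore.
  const slug = agentName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
  if (slug.length === 0) throw new Error('Agent name has no usable characters');
  return `transfer_to_${slug}`;
}

interface HandoffAuditRecord {
  event: 'handoff';
  tool: string;
  target: string;
  payload: unknown;
}

// Returns an audit record if the tool call is a known transfer, otherwise null.
function auditHandoff(
  toolName: string,
  payload: unknown,
  knownTargets: string[],
): HandoffAuditRecord | null {
  const target = knownTargets.find((name) => expectedTransferToolName(name) === toolName);
  return target ? { event: 'handoff', tool: toolName, target, payload } : null;
}

const record = auditHandoff(
  'transfer_to_refund_agent',
  { orderId: 'ORD-4421', reason: 'damaged item' },
  ['Refund Agent'],
);
```

Ordinary function-tool calls return null here, so the same helper can sit in a generic tool-call logger without extra branching.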

Claude query() and subagent delegation

Claude's entry point is query(), which takes a prompt and an options object (named ClaudeAgentOptions in the Python SDK): allowed tools, optional subagent definitions, permission modes, and hook registrations. The orchestrator stays in charge; subagents return results. That makes state reasoning simpler for backend engineers who think in request-response cycles, but you must be explicit about task decomposition in prompts and agent descriptions.

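import { query, type AgentDefinition } from '@anthropic-ai/claude-agent-sdk';

// Hypothetical notifier; replace with your own Slack or paging client.
async function notifySlack(text: string): Promise<void> {
  console.log('[slack]', text);
}

// The subagent's name is the key in `agents` below, not a field on the definition.
const refundSubagent: AgentDefinition = {
  description: 'Handles refund eligibility and policy checks',
  prompt: 'You process refunds. Never exceed policy limits.',
  tools: ['Read', 'WebFetch'],
};

for await (const message of query({
  prompt: 'Customer wants a refund for order ORD-4421. Triage and delegate.',
  options: {
    allowedTools: ['Agent', 'Grep', 'Read'], // 'Agent' lets the parent spawn subagents
    agents: { 'refund-agent': refundSubagent },
    permissionMode: 'default', // camelCase in the TypeScript SDK
  },
})) {
  // Only successful results carry a `result` string; error subtypes do not.
  if (message.type === 'result' && message.subtype === 'success') {
    await notifySlack(message.result);
  }
}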

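Because query() returns an async iterator, messages can be forwarded to an SSE endpoint or a React client as they arrive. The parent_tool_use_id on subagent messages lets you rebuild the delegation tree, for example as nested OpenTelemetry spans.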
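Rebuilding a delegation tree from parent_tool_use_id can be sketched as follows. The TracedMessage shape is a simplification assumed for illustration (real SDK messages carry more fields and nest tool calls inside content blocks), and buildDelegationTree is a hypothetical helper.

```typescript
// Simplified message shape; an assumption for illustration, not the SDK's message type.
interface TracedMessage {
  label: string;
  // Set when this message is itself a subagent-spawning tool call.
  tool_use_id?: string;
  // Null for the top-level agent, otherwise the tool call that spawned the sender.
  parent_tool_use_id: string | null;
}

interface DelegationNode {
  id: string; // 'root' or a tool_use_id
  messages: string[];
  children: DelegationNode[];
}

function buildDelegationTree(messages: TracedMessage[]): DelegationNode {
  const root: DelegationNode = { id: 'root', messages: [], children: [] };
  const nodes = new Map<string, DelegationNode>([['root', root]]);

  // Fetches the node for an id, creating it on first sight.
  const nodeFor = (id: string): DelegationNode => {
    let node = nodes.get(id);
    if (!node) {
      node = { id, messages: [], children: [] };
      nodes.set(id, node);
    }
    return node;
  };

  for (const msg of messages) {
    const owner = nodeFor(msg.parent_tool_use_id ?? 'root');
    owner.messages.push(msg.label);
    if (msg.tool_use_id) {
      // A spawning call opens a child node under whoever issued it.
      const child = nodeFor(msg.tool_use_id);
      if (!owner.children.includes(child)) owner.children.push(child);
    }
  }
  // Messages whose parent id was never seen as a spawn stay detached from the root.
  return root;
}

const tree = buildDelegationTree([
  { label: 'plan', parent_tool_use_id: null },
  { label: 'spawn refund-agent', tool_use_id: 'toolu_1', parent_tool_use_id: null },
  { label: 'read policy', parent_tool_use_id: 'toolu_1' },
  { label: 'final answer', parent_tool_use_id: null },
]);
```

Each node maps naturally onto a span: the root is the request span, and every child is a span whose parent is the delegating tool call.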

Safety layers: guardrails versus hooks

For tool validation and lifecycle control, the platforms make opposite choices. Understanding that split is essential before you promise compliance stakeholders a unified safety story.

OpenAI guardrails at input, output, and tool boundaries

OpenAI leans on guardrails—dedicated validation objects at execution boundaries. Input guardrails run on the first user message. Output guardrails run on the final agent response. Tool guardrails on individual tool() definitions run before and after each function-tool invocation. Each guardrail returns a GuardrailFunctionOutput that allows the request, rejects it with a message, or throws a tripwire to halt the entire run.

By default, input guardrails run in parallel with the agent's first model call; they can also be configured to run sequentially, finishing before the model call starts. The limitation: they do not intercept handoffs, hosted tools, or computerTool/shellTool. Cover those with router logic or infrastructure isolation.
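The tripwire semantics are easy to model in isolation. The sketch below is plain TypeScript with hypothetical names (ToyGuardrail, TripwireError, runInputGuardrails), not SDK exports. It shows the core idea only: several checks run concurrently, and any single tripwire halts the run by throwing.

```typescript
// Plain-TypeScript model of tripwire semantics; names are hypothetical, not SDK exports.
interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo: Record<string, unknown>;
}

interface ToyGuardrail {
  name: string;
  execute(input: string): Promise<GuardrailResult>;
}

class TripwireError extends Error {
  guardrail: string;
  info: Record<string, unknown>;

  constructor(guardrail: string, info: Record<string, unknown>) {
    super(`Guardrail "${guardrail}" tripped`);
    this.name = 'TripwireError';
    this.guardrail = guardrail;
    this.info = info;
  }
}

// Runs every guardrail concurrently and halts the run if any tripwire fires.
async function runInputGuardrails(guardrails: ToyGuardrail[], input: string): Promise<void> {
  const results = await Promise.all(
    guardrails.map(async (g) => ({ name: g.name, result: await g.execute(input) })),
  );
  const tripped = results.find((r) => r.result.tripwireTriggered);
  if (tripped) throw new TripwireError(tripped.name, tripped.result.outputInfo);
}

const blockKeys: ToyGuardrail = {
  name: 'block-secrets',
  execute: async (input) => {
    const hit = /sk-[a-zA-Z0-9]{20,}/.test(input);
    return { tripwireTriggered: hit, outputInfo: hit ? { reason: 'API key detected' } : {} };
  },
};

const maxLength: ToyGuardrail = {
  name: 'max-length',
  execute: async (input) => ({
    tripwireTriggered: input.length > 2000,
    outputInfo: { length: input.length },
  }),
};
```

A caller would wrap the run in try/catch and map a TripwireError to a 4xx response, which keeps rejected input from ever reaching a billable model call in the sequential configuration.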

import { Agent, tool, type InputGuardrail } from '@openai/agents';
import { z } from 'zod';

// Hypothetical data layer standing in for your order store.
const db = {
  orders: { findByEmail: async (email: string) => [{ id: 'ORD-4421', email }] },
};

// Input guardrail: trip the wire when the message appears to contain an API key.
const blockSecrets: InputGuardrail = {
  name: 'block-secrets',
  execute: async ({ input }) => {
    const text = typeof input === 'string' ? input : JSON.stringify(input);
    if (/sk-[a-zA-Z0-9]{20,}/.test(text)) {
      return { tripwireTriggered: true, outputInfo: { reason: 'API key detected' } };
    }
    return { tripwireTriggered: false, outputInfo: {} };
  },
};

const searchOrders = tool({
  name: 'search_orders',
  description: 'Look up orders by customer email.',
  parameters: z.object({ email: z.string().email() }),
  execute: async ({ email }) => db.orders.findByEmail(email),
  // Tool-level guardrail as described in this article; confirm the option name
  // and callback signature against the @openai/agents version you install.
  toolGuardrails: {
    pre: async (ctx) => {
      if (!ctx.context.tenantId) throw new Error('Missing tenant scope');
    },
  },
});

const agent = new Agent({
  name: 'Support',
  instructions: 'Help customers with order status.',
  tools: [searchOrders],
  inputGuardrails: [blockSecrets],
});

Claude hooks across built-in and MCP tools

Claude uses hooks—callbacks for lifecycle events: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, UserPromptSubmit, and others. A HookMatcher binds a hook to tool name patterns (for example, Edit|Write). Hooks can log, transform, block, or audit any tool invocation. Built-in tools like Bash and Edit flow through the same pipeline as custom MCP tools, giving consistent observability across everything the agent does.

The trade-off is that hooks are imperative code: there is no built-in tripwire abstraction, and a hook blocks a call by returning a specific decision object. Teams familiar with Node.js middleware often prefer this uniformity.

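import { query } from '@anthropic-ai/claude-agent-sdk';

for await (const message of query({
  prompt: 'Refactor src/billing/refund.ts for idempotency',
  options: {
    allowedTools: ['Read', 'Edit', 'Bash', 'Grep'],
    hooks: {
      PreToolUse: [
        {
          matcher: 'Edit|Write|Bash',
          hooks: [async (input) => {
            // Hook inputs are a union across events; narrow before reading tool fields.
            if (input.hook_event_name !== 'PreToolUse') return {};
            const { command } = (input.tool_input ?? {}) as { command?: string };
            // Substring checks are easy to evade; treat this as one layer, not the only one.
            if (command?.includes('rm -rf')) {
              return { decision: 'block', reason: 'Destructive command blocked' };
            }
            // An empty result defers to the normal permission flow instead of auto-approving.
            return {};
          }],
        },
      ],
    },
  },
})) {
  // stream to client
}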

In regulated environments, hooks shine when the same policy must apply to filesystem edits and database MCP tools alike. Guardrails shine when you want fast parallel validation on user input before any model spend occurs.
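The "same policy across filesystem and MCP tools" idea can be sketched as a tiny dispatcher. Everything here is hypothetical (ToyHook, applyPreToolHooks, the mcp__postgres__ server name, and the anchored-regex matching rule are assumptions of this sketch); the real SDK owns matching and dispatch.

```typescript
// Toy dispatcher showing how one matcher can cover built-in and MCP tool names.
// All names are hypothetical; the real SDK owns matching and dispatch.
type HookDecision = { decision: 'block'; reason: string } | { decision: 'continue' };

interface ToyHook {
  matcher: string; // regular expression source, e.g. 'Edit|Write'
  run(toolName: string, toolInput: Record<string, unknown>): HookDecision;
}

// Applies every matching hook in order; the first block wins.
function applyPreToolHooks(
  hooks: ToyHook[],
  toolName: string,
  toolInput: Record<string, unknown>,
): HookDecision {
  for (const hook of hooks) {
    // This sketch anchors the pattern so 'Edit' does not match 'NotebookEdit'.
    if (!new RegExp(`^(?:${hook.matcher})$`).test(toolName)) continue;
    const outcome = hook.run(toolName, toolInput);
    if (outcome.decision === 'block') return outcome;
  }
  return { decision: 'continue' };
}

// One tenant-scope policy applied to file reads, shell commands, and a database MCP server.
const tenantScope: ToyHook = {
  matcher: 'Read|Bash|mcp__postgres__.*',
  run: (_tool, input) => {
    const tenantId = input['tenantId'];
    return typeof tenantId === 'string' && tenantId.length > 0
      ? { decision: 'continue' }
      : { decision: 'block', reason: 'Missing tenant scope' };
  },
};

const blocked = applyPreToolHooks([tenantScope], 'mcp__postgres__query', { sql: 'select 1' });
const allowed = applyPreToolHooks([tenantScope], 'Bash', { tenantId: 't1', command: 'ls' });
const unmatched = applyPreToolHooks([tenantScope], 'Grep', {});
```

The design point is that the policy function never needs to know whether a tool is built in or served over MCP; only the matcher string changes.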

MCP integration patterns in production TypeScript services

Model Context Protocol has become the lingua franca for tool servers—Playwright browsers, Postgres connectors, internal admin APIs. Both SDKs support MCP, but with different emphasis, and that shapes how you structure monorepos versus sidecar deployments.

OpenAI mcpServers as one tool source among many

OpenAI added mcpServers as an agent-level property, letting you mount MCP-backed tools alongside native tool() definitions. Documentation treats MCP as one tool source among several—hosted tools, function tools, computer use, and handoffs coexist on the same agent. For teams that already defined Zod-typed function tools and only need a few MCP integrations, that compositional style feels natural.

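import { Agent, MCPServerStdio } from '@openai/agents';

// Stdio MCP servers are class instances and must be connected before the agent runs.
// Confirm option names against the @openai/agents version you install.
const warehouse = new MCPServerStdio({
  name: 'warehouse',
  command: 'node',
  args: ['./mcp/warehouse-server.js'],
});
await warehouse.connect();

const dataAgent = new Agent({
  name: 'Analytics Agent',
  instructions: 'Answer product questions using warehouse data.',
  mcpServers: [warehouse],
  tools: [/* native Zod tools */],
});

// Call `await warehouse.close()` during shutdown to stop the subprocess.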

Treat each MCP server as a subprocess with resource limits, secret injection, and version pinning.
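Two of those habits, secret injection and version pinning, can be enforced with a small pre-launch check. This is a sketch under stated assumptions: McpLaunchSpec, pickEnv, and isPinned are hypothetical helpers, not SDK APIs, and the pinning check only understands the simple npx form where the first non-flag argument is the package.

```typescript
// Hypothetical launch-spec checks for MCP subprocesses; not an SDK API.
interface McpLaunchSpec {
  name: string;
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Copies only allowlisted variables so the subprocess never sees unrelated secrets.
function pickEnv(
  source: Record<string, string | undefined>,
  allowlist: string[],
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of allowlist) {
    const value = source[key];
    if (value !== undefined) env[key] = value;
  }
  return env;
}

// For npx launches, checks that the package argument ends in an exact version (pkg@1.2.3).
// Other commands return true because this sketch cannot inspect them.
function isPinned(spec: McpLaunchSpec): boolean {
  if (spec.command !== 'npx') return true;
  const pkg = spec.args.find((arg) => !arg.startsWith('-'));
  return pkg !== undefined && /@\d+\.\d+\.\d+$/.test(pkg);
}

const pinnedSpec: McpLaunchSpec = {
  name: 'warehouse',
  command: 'npx',
  args: ['-y', '@acme/warehouse-mcp@1.4.2'], // hypothetical package name
  env: pickEnv({ WAREHOUSE_URL: 'postgres://example', AWS_SECRET: 'x' }, ['WAREHOUSE_URL']),
};

const floatingSpec: McpLaunchSpec = { ...pinnedSpec, args: ['-y', '@acme/warehouse-mcp'] };
```

Running a check like isPinned in CI catches a floating version before it reaches production. Resource limits are better applied outside the process, for example through container or cgroup settings.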

Claude SDK: MCP as architecture, not an add-on

The Claude Agent SDK is architecturally built around MCP. Built-in tools are implemented as in-process MCP servers. Connecting external MCP servers—Playwright, database connectors, custom internal tooling—is a first-class documented pattern. Claude Code's adoption has driven MCP ecosystem growth; teams with existing MCP investments often find Claude's SDK a natural host.

import { query } from '@anthropic-ai/claude-agent-sdk';

for await (const message of query({
  prompt: 'Run smoke tests in staging and summarize failures',
  options: {
    allowedTools: ['Read', 'Bash', 'mcp__playwright__*'],
    mcpServers: {
      playwright: {
        command: 'npx',
        args: ['@playwright/mcp'], // official Playwright MCP package; pin an exact version in production
      },
      incidents: {
        command: 'node',
        args: ['./mcp/incidents-server.js'],
      },
    },
  },
})) {
  // handle messages
}

If every tool surface is already standardized on MCP, the main benefit of Claude's SDK is its uniform hook pipeline. If the workload mixes hosted sandboxes with a few selective MCP integrations, OpenAI's compositional agent model is usually the better fit.

Sandboxing, hosted tools, and deployment reality

OpenAI offers sandbox agents—isolated filesystem workspaces where an agent reads, writes, and executes code without touching the host. This is hosted execution: a clean container, shellTool and computerTool for commands and GUI interaction, and applyPatchTool for structured edits. For coding agents or untrusted code execution, that removes substantial infrastructure work. The cost is platform coupling: the sandbox requires OpenAI infrastructure.

Claude does not ship an equivalent hosted sandbox in the SDK. Execution runs in the local environment where the SDK operates. You control isolation via permissionMode (permission_mode in the Python SDK; for example, acceptEdits for auto-approved edits) and by restricting allowedTools. Production deployments bring their own containerization: Docker, gVisor, Firecracker microVMs, or ephemeral CI runners. That means more flexibility and more setup.

OpenAI's hosted sandbox suits bursty untrusted code; Claude suits hardened CI runners where you already control the filesystem. Neither replaces gateway-level tenant isolation.

Head-to-head comparison for production TypeScript agents

Dimension	OpenAI Agents SDK	Claude Agent SDK
Primary language	Python-first, with a TypeScript SDK (@openai/agents)	Python and TypeScript parity
Core orchestration	Handoffs (agent-to-agent transfer)	Subagents via query() delegation
Entry point	Agent + run() / Runner.run()	query() async iterator
Tool definition	tool() + Zod schema	Built-in tools + MCP servers
Safety layer	Input / output / tool guardrails with tripwires	Hooks (PreToolUse, PostToolUse, etc.)
Sandboxed execution	Hosted sandbox with shellTool, computerTool	Bring-your-own containerization
MCP support	Agent-level mcpServers property	Architecture-level, in-process built-ins
Structured output	Zod schema on outputType	Prompt and model capabilities
Model portability	Custom Model interface (OpenAI-native defaults)	Bedrock, Vertex AI, Azure, Anthropic API
License	MIT	Proprietary SDK; usage metered

The table reflects ground-level comparisons from practitioner evaluations in mid-2026, including the Developers Digest analysis cited above. Your workload should break ties—not row counts.

Model portability, pricing signals, and lock-in

Both platforms exert gravitational pull toward their own models. OpenAI's Model interface allows substitution, but hosted tools—code interpreter, file search, computer use—require OpenAI infrastructure. If you need the sandbox, you are on OpenAI's platform. Defaults assume GPT-4.x or GPT-5.x class models with the Responses API.

Claude's SDK documents first-party auth paths for Bedrock, Vertex AI, Azure AI Foundry, and the Anthropic API. The underlying model remains Claude; there is no generic model substitution interface because the SDK wraps Claude Code's execution engine. Multi-cloud deployment is real; multi-model flexibility is not the SDK's goal.

Pricing changes frequently—verify at evaluation time. Anthropic's June 2026 docs note a separate Agent SDK credit pool on subscriptions; OpenAI charges tokens plus hosted tool usage. Instrument per-tenant spend from day one.
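Per-tenant instrumentation does not need to wait for a billing system. A minimal sketch, assuming you can read input and output token counts from each run's usage data: SpendLedger and PriceTable are hypothetical names, and the prices in the example are placeholders, not real rates.

```typescript
// Minimal per-tenant usage ledger. Prices are placeholders, not real rates.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface PriceTable {
  inputPerMillion: number; // currency units per 1M input tokens (assumed)
  outputPerMillion: number; // currency units per 1M output tokens (assumed)
}

class SpendLedger {
  private readonly totals = new Map<string, Usage>();

  // Adds one run's token usage to the tenant's running total.
  record(tenantId: string, usage: Usage): void {
    const prev = this.totals.get(tenantId) ?? { inputTokens: 0, outputTokens: 0 };
    this.totals.set(tenantId, {
      inputTokens: prev.inputTokens + usage.inputTokens,
      outputTokens: prev.outputTokens + usage.outputTokens,
    });
  }

  // Converts accumulated tokens to cost under the supplied price table.
  cost(tenantId: string, prices: PriceTable): number {
    const u = this.totals.get(tenantId) ?? { inputTokens: 0, outputTokens: 0 };
    return (u.inputTokens * prices.inputPerMillion + u.outputTokens * prices.outputPerMillion) / 1e6;
  }
}

const ledger = new SpendLedger();
ledger.record('acme', { inputTokens: 1000000, outputTokens: 500000 });
ledger.record('acme', { inputTokens: 1000000, outputTokens: 0 });

const placeholderPrices: PriceTable = { inputPerMillion: 2, outputPerMillion: 10 };
const acmeCost = ledger.cost('acme', placeholderPrices);
```

Keeping prices outside the ledger means a rate change only requires a new PriceTable, and historical token counts stay valid. Hosted tool charges would need a separate counter.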

Decision guide for backend and platform teams

Choose the OpenAI Agents SDK if your team is TypeScript-first and wants ergonomic Zod-typed tools and structured outputs. Choose it if you need hosted sandboxed code execution without managing containers. Choose it when multi-agent workflows are conversation-routing problems—handing off to a specialist fits better than delegating a subtask. Choose it if you are invested in OpenAI hosted tools like code interpreter and file search.

Choose the Claude Agent SDK if you build agents that work with codebases, filesystems, and shell commands—the built-in suite is a head start. Choose it if you have existing MCP servers or want the MCP ecosystem Claude Code accelerated. Choose it if you want fine-grained lifecycle control via hooks with a consistent interface across built-in and custom tools. Choose it for Bedrock, Vertex, or Azure first-party auth. Choose it when you want the same engine that powers Claude Code for developer workflow automation.

Consider LangGraph when provider flexibility or explicit graph control is non-negotiable. Evaluate both SDKs on tool shape, delegation style, and safety contracts—not brand preference.

Code example: the same support workflow in both SDKs

Imagine a support automation service in Node.js: classify intent, either answer from FAQ or delegate to a specialist, enforce tenant scope, and stream progress to a React admin UI. The two SDKs structure that workflow differently.

With OpenAI, you model specialists as handoff targets and enforce tenant scope in tool guardrails. The triage agent may transfer ownership to billing; guardrails run on the first message but not on the handoff itself—so your onHandoff callback should re-validate tenant context before the transfer completes.

With Claude, you model specialists as subagents and enforce scope in PreToolUse hooks that cover Read, MCP database tools, and Bash alike. The coordinator prompt explicitly instructs when to spawn the billing subagent; tracing uses parent_tool_use_id to build a delegation tree for your observability backend.
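Whichever SDK hosts the workflow, the tenant check itself can live in one place. The sketch below is SDK-independent and uses hypothetical names throughout (TenantContext, checkTenantScope, and the two adapters). Whether a thrown error inside an onHandoff callback actually aborts the transfer should be verified against the SDK version in use.

```typescript
// One policy function, two adapters. Names and shapes are illustrative assumptions.
interface TenantContext {
  tenantId?: string;
  allowedTenants: string[];
}

type PolicyResult = { ok: true } | { ok: false; reason: string };

function checkTenantScope(ctx: TenantContext): PolicyResult {
  if (!ctx.tenantId) return { ok: false, reason: 'Missing tenant scope' };
  if (!ctx.allowedTenants.includes(ctx.tenantId)) {
    return { ok: false, reason: `Tenant ${ctx.tenantId} not permitted` };
  }
  return { ok: true };
}

// Adapter for a handoff callback: throw so a failed check is never silent.
function assertScopeBeforeHandoff(ctx: TenantContext): void {
  const result = checkTenantScope(ctx);
  if (result.ok === false) throw new Error(`Handoff rejected: ${result.reason}`);
}

// Adapter for a pre-tool hook: translate the result into a block-or-empty shape.
function scopeHookDecision(ctx: TenantContext): { decision?: 'block'; reason?: string } {
  const result = checkTenantScope(ctx);
  return result.ok === false ? { decision: 'block', reason: result.reason } : {};
}

const missing = checkTenantScope({ allowedTenants: ['acme'] });
const passing = scopeHookDecision({ tenantId: 'acme', allowedTenants: ['acme'] });
```

Sharing the policy this way keeps the compliance story identical across both stacks even though the enforcement points differ.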

Hybrid architectures are common: OpenAI for customer chat with hosted sandboxes, Claude for internal repo automation over MCP databases, linked by shared Zod or OpenAPI contracts.

Summary

OpenAI Agents SDK and Claude Agent SDK are both production-grade paths to autonomous agents in TypeScript, but they optimize for different orchestration and safety models. OpenAI centers declarative agents, Zod-typed tools, handoffs for conversation transfer, and guardrails at input, output, and function-tool boundaries—with hosted sandbox execution as a major differentiator. Claude centers query(), subagent delegation with parent retention of control, hooks that uniformly wrap built-in and MCP tools, and architecture-level MCP integration rooted in the Claude Code engine.

Before committing, prototype your highest-risk workflow in both SDKs for a week, not an afternoon demo. Measure token cost, trace handoff versus delegation trees, and verify which safety primitives actually cover your hosted tools and MCP surfaces. The right choice is the one your platform team can operate with clear runbooks, not the one with the newest release notes.

FAQ

Can I use the OpenAI Agents SDK with non-OpenAI models?

The SDK exposes a Model interface for custom providers, so technically yes. In practice, hosted tools—sandbox execution, code interpreter, file search—require OpenAI infrastructure. Pure LLM-plus-function-tool agents can use the custom Model escape hatch; sandbox-dependent agents cannot.

Does the Claude Agent SDK require Claude Code installed separately?

For TypeScript, no—the package bundles its own Claude Code binary. For Python, the SDK calls the Claude Code engine under the hood without requiring a separate CLI in production. You need a valid ANTHROPIC_API_KEY or configured cloud credentials for Bedrock, Vertex, or Azure.

How do handoffs differ from subagents in practice?

Handoffs transfer conversation ownership; the receiving agent continues the thread. Subagents execute a delegated task and return a result while the orchestrator retains control. Handoffs model routing; subagents model task decomposition. Many pipelines need both concepts at different layers.

Which SDK is better for MCP-heavy agent architectures?

Claude's SDK treats MCP as architectural—built-in tools are in-process MCP servers, and external servers are first-class. OpenAI supports agent-level mcpServers alongside other tool types, which suits mixed workloads that also rely on hosted OpenAI tools. Teams standardized on MCP across all tool surfaces often lean Claude; teams mixing hosted sandboxes and selective MCP lean OpenAI.

Should I pick an SDK or use LangGraph for orchestration?

Pick a vendor SDK when you want batteries-included loops, tools, and safety primitives tied to a provider roadmap. Pick LangGraph when you need explicit graph control flow, provider flexibility, or deep LangChain ecosystem integration. SDKs optimize time-to-production; graphs optimize control and portability.

What safety model fits regulated production workloads?

OpenAI guardrails offer declarative tripwires at boundaries—strong for blocking bad input before spend—but do not cover handoffs or hosted execution tools. Claude hooks offer uniform tool interception—including Bash and Edit—but require imperative policy code. Regulated teams often combine either SDK with infrastructure controls: network policy, secret scanners, human approval queues, and audit logs at the API gateway regardless of SDK choice.