Claude Agent SDK: Building Production AI Agents — A Developer's Guide
Complete developer guide to Anthropic's Claude Agent SDK — installation, agent creation, tools, multi-agent patterns, error handling, and production deployment.
What Is the Claude Agent SDK?
The Claude Agent SDK is Anthropic's framework for building AI agents powered by Claude models. It provides a structured approach to creating agents that can reason through complex tasks, use tools, maintain conversation state, and collaborate with other agents through handoffs. Unlike generic orchestration frameworks, the Claude Agent SDK is optimized for Claude's unique strengths — extended thinking, long-context reasoning, and reliable tool use.
The SDK is designed for production use from day one, with built-in support for error handling, retry logic, streaming responses, and observability. Whether you are building a customer support agent, a data analysis assistant, or a complex multi-agent workflow, the Claude Agent SDK provides the primitives you need without the overhead you do not.
Installation and Setup
Install the SDK
pip install anthropic
The agent capabilities are built into the core Anthropic Python SDK. No separate package is needed.
Configure Your API Key
Set your Anthropic API key as an environment variable:
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
For production, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, or Kubernetes secrets) rather than environment variables in shell profiles.
Verify the setup:
from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
Creating Your First Agent
An agent in the Claude ecosystem is a Claude model instance configured with specific instructions, tools, and behavioral constraints. Here is a minimal agent:
from anthropic import Anthropic

client = Anthropic()

def run_agent(user_message: str) -> str:
    """Run a simple single-turn agent."""
    system_prompt = """You are a helpful research assistant.
    You help users find information and answer questions
    accurately. If you are unsure about something, say so
    rather than guessing."""
    messages = [{"role": "user", "content": user_message}]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        system=system_prompt,
        messages=messages,
    )
    return response.content[0].text
This is a single-turn agent. To make it agentic — capable of using tools and reasoning in loops — we need to add tool definitions and an execution loop.
Defining Tools
Tools are functions that the agent can call to interact with external systems. Claude's tool use protocol requires you to define tools with JSON schemas:
tools = [
    {
        "name": "search_database",
        "description": (
            "Search the product database by name or category. "
            "Returns matching products with prices and stock."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query for products",
                },
                "category": {
                    "type": "string",
                    "description": "Optional product category",
                    "enum": ["electronics", "clothing", "home", "sports"],
                },
                "max_results": {
                    "type": "integer",
                    "description": "Max results to return",
                    "default": 5,
                },
            },
            "required": ["query"],
        },
    },
    {
        "name": "get_order_status",
        "description": (
            "Look up the current status of a customer order "
            "by order ID. Returns status, tracking info, "
            "and estimated delivery date."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order ID (format: ORD-XXXXX)",
                },
            },
            "required": ["order_id"],
        },
    },
    {
        "name": "create_support_ticket",
        "description": (
            "Create a support ticket for issues that require "
            "human follow-up. Use when you cannot resolve the "
            "issue directly."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "subject": {
                    "type": "string",
                    "description": "Brief subject line",
                },
                "description": {
                    "type": "string",
                    "description": "Detailed issue description",
                },
                "priority": {
                    "type": "string",
                    "enum": ["low", "medium", "high", "urgent"],
                },
            },
            "required": ["subject", "description", "priority"],
        },
    },
]
The quality of your tool descriptions directly impacts the agent's ability to use them correctly. Write descriptions as if you are explaining the tool to a new team member — be specific about what it does, when to use it, and what it returns.
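Schema mistakes are just as damaging as weak descriptions, and they are easy to catch before registration. Below is a minimal sanity-check sketch (the checks and the `validate_tool` helper are illustrative, not part of the SDK):

```python
def validate_tool(tool: dict) -> list[str]:
    """Return a list of problems found in a tool definition.

    Illustrative checks only; a real validator might also verify
    JSON Schema types and enum values.
    """
    problems = []
    for key in ("name", "description", "input_schema"):
        if key not in tool:
            problems.append(f"missing '{key}'")
    schema = tool.get("input_schema", {})
    properties = schema.get("properties", {})
    # Every required field must actually be declared in properties
    for field in schema.get("required", []):
        if field not in properties:
            problems.append(f"required field '{field}' not in properties")
    # Empty descriptions lead to poor tool selection by the model
    if not tool.get("description", "").strip():
        problems.append("empty description")
    return problems
```

Running this over your tool list at startup turns silent schema drift into a loud failure.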
Building the Agent Loop
The agent loop is where single-turn chat becomes agentic behavior. The loop sends a message to Claude, checks if the response contains tool calls, executes those tools, feeds the results back, and repeats until Claude produces a final text response:
import json

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool and return the result as a string."""
    if tool_name == "search_database":
        # In production, this queries your actual database
        return json.dumps({
            "results": [
                {
                    "name": "Wireless Headphones",
                    "price": 79.99,
                    "in_stock": True,
                },
                {
                    "name": "Bluetooth Speaker",
                    "price": 49.99,
                    "in_stock": True,
                },
            ]
        })
    elif tool_name == "get_order_status":
        return json.dumps({
            "order_id": tool_input["order_id"],
            "status": "shipped",
            "tracking": "1Z999AA10123456784",
            "eta": "2026-03-18",
        })
    elif tool_name == "create_support_ticket":
        return json.dumps({
            "ticket_id": "TKT-8842",
            "status": "created",
        })
    return json.dumps({"error": f"Unknown tool: {tool_name}"})
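As the tool count grows, an if/elif chain gets unwieldy. One common alternative is a registry that maps tool names to handler functions; here is a sketch (the handler body is a stand-in for a real integration):

```python
import json

# Registry mapping tool names to handler functions
TOOL_HANDLERS = {}

def tool_handler(name: str):
    """Decorator that registers a function as the handler for a tool."""
    def register(func):
        TOOL_HANDLERS[name] = func
        return func
    return register

@tool_handler("get_order_status")
def _get_order_status(tool_input: dict) -> str:
    # In production this would query the order system
    return json.dumps({
        "order_id": tool_input["order_id"],
        "status": "shipped",
    })

def dispatch_tool(tool_name: str, tool_input: dict) -> str:
    """Look up and invoke the handler for a tool call."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    return handler(tool_input)
```

Adding a new tool then means writing one decorated function, with no dispatch code to touch.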
def run_agent_loop(
    user_message: str,
    system_prompt: str,
    max_iterations: int = 10,
) -> str:
    """Run the full agent loop with tool execution."""
    messages = [{"role": "user", "content": user_message}]

    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages,
        )

        # Check if the response contains tool use
        if response.stop_reason == "tool_use":
            # Add the assistant's response to the conversation
            messages.append({
                "role": "assistant",
                "content": response.content,
            })

            # Execute each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })

            # Add tool results to the conversation
            messages.append({
                "role": "user",
                "content": tool_results,
            })
        else:
            # Agent produced a final response
            final_text = ""
            for block in response.content:
                if hasattr(block, "text"):
                    final_text += block.text
            return final_text

    return "Agent reached maximum iterations."
This loop is the heart of every Claude agent. The pattern is always the same: send, check for tool use, execute, feed back, repeat.
Multi-Agent Patterns
Complex systems benefit from multiple specialized agents. Here is a pattern for implementing agent handoffs with Claude:
class AgentSystem:
    """Multi-agent system with handoff support."""

    def __init__(self):
        self.client = Anthropic()
        self.agents = {}
        self.active_agent = None
        self.messages = []

    def register_agent(
        self,
        name: str,
        system_prompt: str,
        agent_tools: list,
        handoff_targets: list[str],
    ):
        """Register an agent with the system."""
        # Add a handoff tool dynamically
        if handoff_targets:
            handoff_tool = {
                "name": "handoff",
                "description": (
                    "Hand off the conversation to another agent. "
                    f"Available targets: {', '.join(handoff_targets)}"
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "target_agent": {
                            "type": "string",
                            "enum": handoff_targets,
                        },
                        "reason": {
                            "type": "string",
                            "description": "Why the handoff is needed",
                        },
                    },
                    "required": ["target_agent", "reason"],
                },
            }
            agent_tools = agent_tools + [handoff_tool]

        self.agents[name] = {
            "system_prompt": system_prompt,
            "tools": agent_tools,
        }
        # The first registered agent becomes the default active agent
        if self.active_agent is None:
            self.active_agent = name

    def process(self, user_input: str) -> str:
        """Process a user message through the active agent."""
        self.messages.append({"role": "user", "content": user_input})

        for _ in range(15):  # max iterations
            agent = self.agents[self.active_agent]
            response = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=4096,
                system=agent["system_prompt"],
                tools=agent["tools"],
                messages=self.messages,
            )

            if response.stop_reason == "tool_use":
                self.messages.append({
                    "role": "assistant",
                    "content": response.content,
                })
                tool_results = []
                for block in response.content:
                    if block.type != "tool_use":
                        continue
                    if block.name == "handoff":
                        target = block.input["target_agent"]
                        self.active_agent = target
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": f"Handed off to {target}",
                        })
                    else:
                        result = execute_tool(block.name, block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result,
                        })
                self.messages.append({
                    "role": "user",
                    "content": tool_results,
                })
            else:
                text = ""
                for block in response.content:
                    if hasattr(block, "text"):
                        text += block.text
                self.messages.append({
                    "role": "assistant",
                    "content": text,
                })
                return text

        return "Maximum iterations reached."
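The handoff bookkeeping in `process` can be exercised without any API calls by isolating it into a pure function. A sketch, assuming tool_use blocks are represented as plain dicts for testing (the `apply_handoff` helper is illustrative, not part of the SDK):

```python
def apply_handoff(
    active_agent: str, tool_blocks: list[dict]
) -> tuple[str, list[dict]]:
    """Process tool_use blocks, switching agents on 'handoff' calls.

    Returns the (possibly updated) active agent and the tool_result
    payloads to feed back into the conversation.
    """
    results = []
    for block in tool_blocks:
        if block["name"] == "handoff":
            active_agent = block["input"]["target_agent"]
            content = f"Handed off to {active_agent}"
        else:
            content = "ok"  # placeholder for real tool execution
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": content,
        })
    return active_agent, results
```

Unit-testing this logic separately catches routing bugs long before they surface in live conversations.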
Error Handling in Production
Production agents must handle failures gracefully. Key error categories and their handling strategies:
API Errors
from anthropic import (
    APIError,
    APIConnectionError,
    RateLimitError,
)
import time

def call_claude_with_retry(
    messages, system, agent_tools, max_retries=3
):
    """Call Claude with exponential backoff retry."""
    for attempt in range(max_retries):
        try:
            return client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=4096,
                system=system,
                tools=agent_tools,
                messages=messages,
            )
        # Specific exceptions must be caught before the APIError base class
        except RateLimitError:
            time.sleep(2 ** attempt)
        except APIConnectionError:
            if attempt == max_retries - 1:
                raise
            time.sleep(1)
        except APIError:
            # Non-retryable errors: log and re-raise
            raise
    raise RuntimeError("Max retries exceeded")
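When many workers retry on the same schedule, their retries collide. Adding jitter to the backoff spreads them out; here is a small self-contained sketch (the cap and base values are illustrative defaults):

```python
import random

def backoff_delay(
    attempt: int, base: float = 1.0, cap: float = 30.0
) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds.

    Returns a random delay in [0, min(cap, base * 2**attempt)].
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Replacing `time.sleep(2 ** attempt)` with `time.sleep(backoff_delay(attempt))` keeps bursts of retries from synchronizing across processes.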
Tool Execution Errors
Always return errors as structured data rather than raising exceptions. The agent needs error information to decide its next action — retry, try a different approach, or inform the user:
def safe_execute_tool(
    tool_name: str, tool_input: dict
) -> str:
    """Execute a tool with error handling."""
    try:
        return execute_tool(tool_name, tool_input)
    except TimeoutError:
        return json.dumps({
            "error": "Tool timed out. Try again.",
            "tool": tool_name,
        })
    except Exception as e:
        return json.dumps({
            "error": f"Tool failed: {str(e)}",
            "tool": tool_name,
        })
Deployment Considerations
Streaming for Low Latency
In production, stream agent responses to reduce perceived latency. Claude's streaming API sends tokens as they are generated:
def stream_agent_response(system_prompt: str, messages: list):
    """Yield response text chunks as Claude generates them."""
    with client.messages.stream(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        system=system_prompt,
        messages=messages,
    ) as stream:
        for text in stream.text_stream:
            yield text  # Send to the client via SSE or WebSocket
Cost Management
Monitor and control costs with these strategies:
- Set max_tokens appropriately for each use case (do not default to 4096 for simple responses)
- Cache frequently used tool results in Redis
- Use Claude 3.5 Haiku for simple routing and classification tasks
- Implement conversation length limits to prevent runaway sessions
- Track per-conversation costs and alert on anomalies
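Per-conversation cost tracking starts with a per-call estimate. The sketch below uses placeholder per-million-token rates, not quoted pricing; check Anthropic's current rate card before relying on the numbers:

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: float = 3.0,
    output_rate: float = 15.0,
) -> float:
    """Estimate call cost in USD given per-million-token rates.

    The default rates are illustrative placeholders only.
    """
    return (
        input_tokens * input_rate + output_tokens * output_rate
    ) / 1_000_000
```

The actual token counts for each call are available on the API response as response.usage.input_tokens and response.usage.output_tokens, so accumulating a running total per conversation is straightforward.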
Conversation State Persistence
For production deployments, persist conversation state to a database. At CallSphere, we use PostgreSQL with a JSONB column for the messages array, which allows both efficient storage and flexible querying:
async def save_conversation(
    conversation_id: str,
    messages: list,
    active_agent: str,
):
    await db.execute(
        """INSERT INTO conversations
               (id, messages, active_agent, updated_at)
           VALUES ($1, $2, $3, NOW())
           ON CONFLICT (id) DO UPDATE SET
               messages = $2,
               active_agent = $3,
               updated_at = NOW()""",
        conversation_id,
        json.dumps(messages),
        active_agent,
    )
Frequently Asked Questions
How does the Claude Agent SDK differ from the OpenAI Agents SDK?
The Claude Agent SDK is tightly integrated with Claude's unique features — extended thinking for complex reasoning, 200K context windows for long documents, and Anthropic's safety-first design philosophy. The OpenAI Agents SDK provides a more opinionated framework with built-in abstractions for the agent loop, handoffs, and guardrails. Choose based on your primary model provider. If you use Claude as your main model, the Anthropic SDK gives you the most direct access to Claude's capabilities with the least abstraction overhead.
Can I use Claude with other agent frameworks like LangGraph?
Yes. Claude works with any framework that supports the Anthropic API or the OpenAI-compatible API format. LangGraph, CrewAI, and AutoGen all have Anthropic integrations. However, some Claude-specific features (like extended thinking) may not be fully exposed through third-party frameworks. If you need those features, use the Anthropic SDK directly for the agent loop and use the framework only for orchestration.
What model should I use for different agent tasks?
Use Claude 3.5 Sonnet for complex reasoning, multi-step tool use, and tasks that require high accuracy. Use Claude 3.5 Haiku for high-volume, latency-sensitive tasks like intent classification, simple Q&A, and routing. For tasks that benefit from deep analysis, enable extended thinking with Claude 3.5 Sonnet — it adds latency but significantly improves accuracy on complex problems.
How do I handle long conversations that exceed the context window?
Implement a sliding window strategy. Keep the system prompt, the first few messages (for context), and the most recent N messages. For the messages in between, generate a summary and include it as a system message. Alternatively, use Claude's 200K context window — most conversations fit comfortably. For truly long-running sessions (multi-day workflows), persist state in a database and load only the relevant context for each interaction.
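The keep-first-plus-keep-last part of the sliding window can be sketched as follows (the window sizes and the placeholder note are illustrative; a real implementation would also summarize the omitted span and preserve strict user/assistant alternation):

```python
def trim_messages(
    messages: list[dict], keep_first: int = 2, keep_last: int = 10
) -> list[dict]:
    """Keep the earliest and most recent messages, dropping the middle.

    If the conversation already fits in the window, return it unchanged.
    NOTE: real implementations must keep user/assistant roles alternating
    and should replace the placeholder note with a generated summary.
    """
    if len(messages) <= keep_first + keep_last:
        return messages
    note = {
        "role": "user",
        "content": "[Earlier messages omitted for length.]",
    }
    return messages[:keep_first] + [note] + messages[-keep_last:]
```

Calling trim_messages before each API request bounds the prompt size regardless of how long the session runs.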
What are the best practices for writing Claude agent system prompts?
Keep prompts structured and specific. Start with the agent's role and responsibilities, then list behavioral rules, then describe the tools and when to use each one. Use numbered lists for multi-step procedures. Include examples of good responses for ambiguous scenarios. Avoid vague instructions like "be helpful" — instead specify exactly what helpful means in your context. Test prompts against edge cases and adversarial inputs before deploying.
CallSphere Team
Expert insights on AI voice agents and customer communication automation.