Claude API JSON Mode and Structured Output Patterns
Complete guide to getting reliable structured output from Claude. Covers JSON mode, tool-use-as-schema, Pydantic validation, streaming structured data, and error recovery patterns for production applications.
The Structured Output Problem
Getting an LLM to return valid, parseable, schema-compliant JSON is one of the most common challenges in AI engineering. A model that returns beautiful prose cannot power a backend API that expects a specific data structure. Structured output is the bridge between natural language AI and deterministic software systems.
Claude provides multiple approaches to structured output, each with different reliability guarantees and trade-offs. This guide covers all of them with production-ready patterns.
Approach 1: Prompt-Based JSON Output
The simplest approach is to ask Claude to return JSON in your prompt. This works for prototypes but is the least reliable for production.
```python
import json

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": """Extract the key information from this job posting.

Job posting: "Senior Backend Engineer at TechCorp. 5+ years Python experience.
Remote-first. $180K-$220K. Must know PostgreSQL and Redis."

Respond with ONLY valid JSON matching this schema:
{
  "title": "string",
  "company": "string",
  "experience_years": number,
  "salary_min": number,
  "salary_max": number,
  "skills": ["string"],
  "remote": boolean
}"""
    }]
)

try:
    data = json.loads(response.content[0].text)
except json.JSONDecodeError:
    # Handle malformed JSON -- this happens ~5-10% of the time
    # with the prompt-only approach
    pass
```
Reliability: 90-95% valid JSON. The model sometimes adds markdown formatting, explanatory text, or trailing commas.
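If you must ship the prompt-only approach, a tolerant parsing step recovers most of these failure modes before giving up. A minimal sketch (the regex handling and fallbacks here are illustrative choices, not part of any SDK):

```python
import json
import re

def parse_loose_json(text: str) -> dict:
    """Best-effort JSON parse for prompt-only model output.

    Handles the common failure modes: markdown code fences,
    explanatory text around the object, and trailing commas.
    """
    # Strip markdown code fences like ```json ... ```
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Fall back to the outermost {...} span
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start:end + 1]
    # Remove trailing commas before } or ]
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)
```

Even with this cleanup, a small fraction of responses remain unrecoverable, which is why the tool-use approach below is preferred.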
Approach 2: Tool Use as Structured Output (Recommended)
The most reliable way to get structured output from Claude is to define a tool with your desired schema and instruct Claude to use it. When Claude calls a tool, it always produces valid JSON matching the tool's input schema.
```python
import anthropic

client = anthropic.Anthropic()

# Define your output schema as a tool
extract_tool = {
    "name": "save_job_posting",
    "description": "Save the extracted job posting information.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Job title"},
            "company": {"type": "string", "description": "Company name"},
            "experience_years": {
                "type": "integer",
                "description": "Minimum years of experience required"
            },
            "salary_min": {"type": "integer", "description": "Minimum salary in USD"},
            "salary_max": {"type": "integer", "description": "Maximum salary in USD"},
            "skills": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Required technical skills"
            },
            "remote": {
                "type": "boolean",
                "description": "Whether the position is remote"
            }
        },
        "required": [
            "title", "company", "experience_years",
            "salary_min", "salary_max", "skills", "remote"
        ]
    }
}

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[extract_tool],
    tool_choice={"type": "tool", "name": "save_job_posting"},
    messages=[{
        "role": "user",
        "content": 'Extract info: "Senior Backend Engineer at TechCorp. '
                   '5+ years Python. Remote. $180K-$220K. PostgreSQL, Redis."'
    }]
)

# Extract the structured data from the tool call
for block in response.content:
    if block.type == "tool_use":
        job_data = block.input  # Already a valid Python dict
        print(job_data)
```
Reliability: 99.9%+ valid JSON matching the schema. The tool_choice parameter forces Claude to call the specified tool, guaranteeing structured output.
Key detail: Setting tool_choice: {"type": "tool", "name": "save_job_posting"} forces Claude to use this specific tool. Without it, Claude might respond with text instead.
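In practice it helps to wrap this lookup in a small helper so every call site handles the no-tool-call case the same way. A sketch (the helper name is my own, not part of the SDK):

```python
def get_tool_input(response, tool_name: str) -> dict:
    """Return the input dict of the first tool_use block matching tool_name.

    Raises ValueError if Claude did not call the tool, which can happen
    when tool_choice is not forced.
    """
    for block in response.content:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    raise ValueError(f"No tool_use block for {tool_name!r} in response")
```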
Approach 3: Pydantic Validation Layer
For production systems, wrap the tool-use approach with Pydantic validation for type safety and business rule enforcement.
```python
from typing import Optional

from pydantic import BaseModel, Field, field_validator

class JobPosting(BaseModel):
    title: str = Field(min_length=2, max_length=200)
    company: str = Field(min_length=1, max_length=200)
    experience_years: int = Field(ge=0, le=50)
    salary_min: int = Field(ge=0)
    salary_max: int = Field(ge=0)
    skills: list[str] = Field(min_length=1)
    remote: bool
    location: Optional[str] = None

    @field_validator("salary_max")
    @classmethod
    def salary_max_gte_min(cls, v, info):
        if "salary_min" in info.data and v < info.data["salary_min"]:
            raise ValueError("salary_max must be >= salary_min")
        return v

    @field_validator("skills")
    @classmethod
    def deduplicate_skills(cls, v):
        return list(dict.fromkeys(v))  # Remove duplicates, preserve order

def extract_structured(text: str) -> JobPosting:
    """Extract structured data with full validation."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[extract_tool],
        tool_choice={"type": "tool", "name": "save_job_posting"},
        messages=[{"role": "user", "content": f"Extract info: {text}"}]
    )
    for block in response.content:
        if block.type == "tool_use":
            return JobPosting(**block.input)
    raise ValueError("No tool call in response")
```
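If you maintain the Pydantic model and the tool's JSON schema by hand, the two will eventually drift apart. With Pydantic v2 you can derive the tool definition from the model itself via model_json_schema(). A sketch, using a trimmed-down model for brevity (the tool_from_model helper is my own):

```python
from pydantic import BaseModel, Field

class JobPosting(BaseModel):
    title: str = Field(min_length=2, max_length=200, description="Job title")
    company: str = Field(description="Company name")
    remote: bool = Field(description="Whether the position is remote")

def tool_from_model(model: type[BaseModel], name: str, description: str) -> dict:
    """Build a Claude tool definition from a Pydantic model's JSON schema."""
    # model_json_schema() emits a standard JSON Schema object with
    # "type", "properties", and "required" keys, which is what the
    # tool's input_schema expects.
    return {
        "name": name,
        "description": description,
        "input_schema": model.model_json_schema(),
    }

extract_tool = tool_from_model(
    JobPosting, "save_job_posting", "Save the extracted job posting."
)
```

The Field descriptions flow through into the schema, so the prompt-quality guidance below about field descriptions still applies.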
Approach 4: Complex Nested Schemas
For deeply nested output structures, build your tool schema to match.
```python
analysis_tool = {
    "name": "save_analysis",
    "description": "Save the complete document analysis.",
    "input_schema": {
        "type": "object",
        "properties": {
            "summary": {"type": "string"},
            "sections": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "content_summary": {"type": "string"},
                        "key_points": {
                            "type": "array",
                            "items": {"type": "string"}
                        },
                        "risks": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "description": {"type": "string"},
                                    "severity": {
                                        "type": "string",
                                        "enum": ["low", "medium", "high", "critical"]
                                    },
                                    "mitigation": {"type": "string"}
                                },
                                "required": ["description", "severity"]
                            }
                        }
                    },
                    "required": ["title", "content_summary", "key_points"]
                }
            },
            "overall_risk_score": {
                "type": "number",
                "minimum": 0,
                "maximum": 10
            },
            "recommendation": {
                "type": "string",
                "enum": ["approve", "review_needed", "reject"]
            }
        },
        "required": ["summary", "sections", "overall_risk_score", "recommendation"]
    }
}
```
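On the validation side, a nested schema pairs naturally with nested Pydantic models, one per level of the structure. A sketch mirroring the tool above (the model names are my own):

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field

class Risk(BaseModel):
    description: str
    severity: Literal["low", "medium", "high", "critical"]
    mitigation: Optional[str] = None

class Section(BaseModel):
    title: str
    content_summary: str
    key_points: list[str]
    risks: list[Risk] = []

class Analysis(BaseModel):
    summary: str
    sections: list[Section]
    overall_risk_score: float = Field(ge=0, le=10)
    recommendation: Literal["approve", "review_needed", "reject"]
```

Passing the tool call's input dict to Analysis(**tool_input) then validates the entire tree in one step, including the enum constraints at every level.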
Error Recovery: Handling Validation Failures
Even with tool use, the extracted values might fail business validation. Implement a retry loop that feeds the error back to Claude.
```python
import anthropic
from pydantic import BaseModel, ValidationError

async_client = anthropic.AsyncAnthropic()

async def extract_with_retry(
    text: str,
    schema_model: type[BaseModel],
    tool_def: dict,
    max_retries: int = 2
) -> BaseModel:
    """Extract structured data with a validation retry loop."""
    messages = [{"role": "user", "content": f"Extract information: {text}"}]

    for attempt in range(max_retries + 1):
        response = await async_client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            tools=[tool_def],
            tool_choice={"type": "tool", "name": tool_def["name"]},
            messages=messages
        )

        tool_block = None
        for block in response.content:
            if block.type == "tool_use":
                tool_block = block
                break
        if tool_block is None:
            raise ValueError("No tool call in response")

        try:
            return schema_model(**tool_block.input)
        except ValidationError as e:
            if attempt < max_retries:
                # Feed the validation error back to Claude
                messages.append({
                    "role": "assistant",
                    "content": response.content
                })
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": tool_block.id,
                        "content": f"Validation error: {e}. Please fix and try again.",
                        "is_error": True
                    }]
                })
            else:
                raise
```
Streaming Structured Output
For large structured responses, you can stream the tool call and parse incrementally.
```python
import json

async def stream_structured_output(prompt: str, tool_def: dict) -> dict:
    """Stream a tool call and assemble the JSON incrementally."""
    json_chunks = []
    async with async_client.messages.stream(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        tools=[tool_def],
        tool_choice={"type": "tool", "name": tool_def["name"]},
        messages=[{"role": "user", "content": prompt}]
    ) as stream:
        async for event in stream:
            # Tool-call arguments arrive as partial JSON deltas
            if event.type == "content_block_delta":
                if hasattr(event.delta, "partial_json"):
                    json_chunks.append(event.delta.partial_json)

    full_json = "".join(json_chunks)
    return json.loads(full_json)
```
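If you want to surface fields to the user before the stream completes, a best-effort partial parser can close any open string and brackets after each chunk and attempt a parse. A heuristic sketch (my own approach, not an SDK feature; it returns None whenever the fragment still isn't parseable):

```python
import json

def try_parse_partial(buffer: str):
    """Best-effort parse of an incomplete JSON object.

    Walks the buffer tracking open brackets (outside of strings),
    closes an unterminated string, appends the missing closers,
    and attempts a parse. Returns None on failure.
    """
    stack = []
    in_string = False
    escaped = False
    for ch in buffer:
        if escaped:
            escaped = False
        elif ch == "\\" and in_string:
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch in "{[":
                stack.append("}" if ch == "{" else "]")
            elif ch in "}]" and stack:
                stack.pop()
    candidate = buffer + ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

Calling this after each accumulated chunk lets you render, say, the summary field as soon as it has streamed, long before the sections array finishes.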
Performance Tips
- Keep schemas flat when possible. Deeply nested schemas increase token usage and latency.
- Use enums for constrained fields. `"enum": ["low", "medium", "high"]` is more reliable than asking the model to choose from a list in the description.
- Provide clear field descriptions. The `description` in each property is part of the prompt Claude sees. Better descriptions produce better extractions.
- Use Haiku for simple extractions. For schemas with fewer than 10 flat fields, Haiku is nearly as accurate as Sonnet at a fraction of the cost.
- Batch related extractions. If you need to extract five different pieces of information from one document, define one tool with all five fields rather than making five separate calls.
Summary
Structured output from Claude is a solved problem when you use the right approach. For production systems, the tool-use pattern with tool_choice forcing is the gold standard: it provides 99.9%+ JSON validity, native schema compliance, and works with streaming. Layer Pydantic validation on top for business rule enforcement, and add a retry loop that feeds validation errors back to Claude for the remaining edge cases. This combination delivers reliable structured data extraction that you can build deterministic systems on top of.