
Multi-Step AI Workflows: Orchestrating Claude Across Complex Tasks

Learn patterns for orchestrating Claude across multi-step workflows including sequential chains, parallel fan-out, conditional branching, and human-in-the-loop checkpoints. Includes production-ready Python examples.

Why Single-Call AI Is Not Enough

Most AI integrations start as a single API call: user sends input, model returns output, done. But real business processes are multi-step. Reviewing a contract involves extracting clauses, checking against policy, flagging risks, and generating a summary. Onboarding a customer requires validating documents, running compliance checks, creating accounts, and sending notifications.

Orchestrating Claude across multi-step workflows is the difference between an "AI feature" and an "AI-powered system." The challenge is not making individual calls; it is managing state, handling failures, and coordinating parallel and sequential steps efficiently.

The Four Orchestration Patterns

Pattern 1: Sequential Chain

The simplest pattern. Each step's output feeds into the next step's input.

import anthropic
from dataclasses import dataclass

client = anthropic.Anthropic()

@dataclass
class StepResult:
    step_name: str
    output: str
    tokens_used: int
    model: str

def sequential_chain(document: str) -> list[StepResult]:
    """Process a document through a sequential analysis chain."""
    results = []

    # Step 1: Extract key information
    extraction = client.messages.create(
        model="claude-haiku-4-20250514",  # Fast model for extraction
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"Extract all dates, names, monetary amounts, and "
                       f"obligations from this document:\n\n{document}"
        }]
    )
    results.append(StepResult(
        step_name="extraction",
        output=extraction.content[0].text,
        tokens_used=extraction.usage.output_tokens,
        model="claude-haiku-4-20250514"
    ))

    # Step 2: Analyze risks (uses extraction output)
    risk_analysis = client.messages.create(
        model="claude-sonnet-4-20250514",  # Stronger model for analysis
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": f"Given these extracted elements:\n{extraction.content[0].text}"
                       f"\n\nIdentify potential risks, ambiguities, and "
                       f"missing clauses in this contract."
        }]
    )
    results.append(StepResult(
        step_name="risk_analysis",
        output=risk_analysis.content[0].text,
        tokens_used=risk_analysis.usage.output_tokens,
        model="claude-sonnet-4-20250514"
    ))

    # Step 3: Generate summary (uses both previous outputs)
    summary = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"Create an executive summary of this contract review."
                       f"\n\nExtracted elements:\n{extraction.content[0].text}"
                       f"\n\nRisk analysis:\n{risk_analysis.content[0].text}"
        }]
    )
    results.append(StepResult(
        step_name="summary",
        output=summary.content[0].text,
        tokens_used=summary.usage.output_tokens,
        model="claude-sonnet-4-20250514"
    ))

    return results

When to use: Tasks with clear linear dependencies where each step requires the previous step's output.
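Because each StepResult records its token usage, you can aggregate spend per model after a run. A minimal sketch, reusing the same dataclass shape as above (the token counts are illustrative, not real measurements):

```python
from collections import defaultdict
from dataclasses import dataclass

# Same shape as the StepResult dataclass defined above
@dataclass
class StepResult:
    step_name: str
    output: str
    tokens_used: int
    model: str

def summarize_usage(results: list[StepResult]) -> dict[str, int]:
    """Total output tokens per model across one chain run."""
    totals: dict[str, int] = defaultdict(int)
    for result in results:
        totals[result.model] += result.tokens_used
    return dict(totals)

# Illustrative numbers, not real measurements
run = [
    StepResult("extraction", "...", 900, "claude-haiku-4-20250514"),
    StepResult("risk_analysis", "...", 2100, "claude-sonnet-4-20250514"),
    StepResult("summary", "...", 700, "claude-sonnet-4-20250514"),
]
print(summarize_usage(run))
# {'claude-haiku-4-20250514': 900, 'claude-sonnet-4-20250514': 2800}
```

Feeding these totals into a dashboard or budget alert is usually the first piece of observability a multi-step workflow needs.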

Pattern 2: Parallel Fan-Out / Fan-In

When multiple independent analyses can run simultaneously, fan-out to parallel calls and fan-in to combine results.


import asyncio
from anthropic import AsyncAnthropic

async_client = AsyncAnthropic()

async def parallel_analysis(document: str) -> dict:
    """Run multiple independent analyses in parallel."""

    async def analyze(aspect: str, instructions: str) -> tuple[str, str]:
        response = await async_client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=[{
                "role": "user",
                "content": f"{instructions}\n\nDocument:\n{document}"
            }]
        )
        return aspect, response.content[0].text

    # Fan-out: run all analyses concurrently
    tasks = [
        analyze("legal", "Identify all legal obligations and liabilities."),
        analyze("financial", "Extract and analyze all financial terms."),
        analyze("compliance", "Check for regulatory compliance issues."),
        analyze("timeline", "Extract all deadlines and milestones."),
    ]
    results = await asyncio.gather(*tasks)

    # Fan-in: combine results
    analysis_map = dict(results)

    # Synthesis step: combine all parallel results
    synthesis = await async_client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                "Synthesize these analyses into a unified report:\n\n"
                + "\n\n".join(
                    f"## {k.title()} Analysis\n{v}"
                    for k, v in analysis_map.items()
                )
            )
        }]
    )

    return {
        "individual_analyses": analysis_map,
        "synthesis": synthesis.content[0].text
    }

When to use: Multiple independent analyses of the same input, where a final synthesis step combines the results.
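One practical caveat: an unbounded asyncio.gather can trip rate limits when a workflow fans out widely. A minimal sketch of capping concurrency with asyncio.Semaphore; `bounded_gather` and `fake_analyze` are hypothetical helpers, with `fake_analyze` standing in for the real API call:

```python
import asyncio

async def bounded_gather(coros, limit: int = 4):
    """Run coroutines concurrently, with at most `limit` in flight at a time."""
    sem = asyncio.Semaphore(limit)

    async def run_one(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run_one(c) for c in coros))

# Stand-in for a real messages.create call
async def fake_analyze(aspect: str) -> str:
    await asyncio.sleep(0.01)
    return f"{aspect}: done"

aspects = ["legal", "financial", "compliance", "timeline"]
results = asyncio.run(bounded_gather([fake_analyze(a) for a in aspects], limit=2))
print(results)  # order matches the input, regardless of completion order
```

Swapping `fake_analyze` for the `analyze` helper above gives you the same fan-out with a ceiling on simultaneous requests.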

Pattern 3: Conditional Branching

Different inputs require different processing paths. A routing step decides which branch to execute.

import json

async def conditional_workflow(user_request: str) -> dict:
    """Route and process requests based on AI classification."""

    # Step 1: Classify the request
    classification = await async_client.messages.create(
        model="claude-haiku-4-20250514",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"""Classify this request into exactly one category.
Categories: billing, technical_support, account_change, general_inquiry

Request: {user_request}

Respond with JSON: {{"category": "...", "confidence": 0.0-1.0}}"""
        }]
    )

    # Assumes the model returns bare JSON; harden this parsing in production
    route = json.loads(classification.content[0].text)

    # Step 2: Branch based on classification
    # (billing_tools, tech_support_tools, and account_tools are tool
    # definitions assumed to exist elsewhere in your codebase)
    branch_configs = {
        "billing": {
            "model": "claude-sonnet-4-20250514",
            "system": "You are a billing specialist. Access account data via tools.",
            "tools": billing_tools,
        },
        "technical_support": {
            "model": "claude-sonnet-4-20250514",
            "system": "You are a technical support engineer. Diagnose and resolve issues.",
            "tools": tech_support_tools,
        },
        "account_change": {
            "model": "claude-sonnet-4-20250514",
            "system": "You are an account manager. Process account modifications.",
            "tools": account_tools,
        },
        "general_inquiry": {
            "model": "claude-haiku-4-20250514",
            "system": "You are a helpful assistant. Answer general questions.",
            "tools": [],
        },
    }

    config = branch_configs.get(route["category"], branch_configs["general_inquiry"])

    # Step 3: Execute the appropriate branch
    response = await async_client.messages.create(
        model=config["model"],
        system=config["system"],
        max_tokens=4096,
        tools=config["tools"],
        messages=[{"role": "user", "content": user_request}]
    )

    return {
        "classification": route,
        "response": response.content[0].text,
        "branch_used": route["category"]
    }
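One fragile spot in the routing step is calling `json.loads` on raw model output, which may arrive wrapped in prose. A defensive parser sketch; `parse_route` is a hypothetical helper, not part of the SDK:

```python
import json
import re

def parse_route(raw: str, default: str = "general_inquiry") -> dict:
    """Pull the first JSON object out of a model reply, with a safe fallback."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            route = json.loads(match.group(0))
            if "category" in route:
                return route
        except json.JSONDecodeError:
            pass
    # Fall back to the catch-all branch rather than crashing the workflow
    return {"category": default, "confidence": 0.0}

print(parse_route('Sure! Here you go: {"category": "billing", "confidence": 0.92}'))
# {'category': 'billing', 'confidence': 0.92}
print(parse_route("I am unable to classify this request."))
# {'category': 'general_inquiry', 'confidence': 0.0}
```

Falling back to the general branch on a parse failure keeps the workflow moving instead of surfacing a JSON error to the user.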

Pattern 4: Human-in-the-Loop Checkpoint

For high-stakes workflows, insert approval gates where a human reviews the AI's work before proceeding.

from enum import Enum

class ApprovalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    MODIFIED = "modified"

async def workflow_with_checkpoints(task: str) -> dict:
    """Execute a workflow with human approval checkpoints."""

    # Step 1: AI generates a plan
    plan = await async_client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": f"Create a detailed execution plan for: {task}\n"
                       f"List each step with expected outcomes and risks."
        }]
    )

    # Checkpoint: save plan and wait for human approval
    # (save_checkpoint and wait_for_approval are application-specific helpers)
    checkpoint_id = await save_checkpoint(
        stage="plan_review",
        content=plan.content[0].text,
        requires_approval=True
    )

    # In production, this would be async (webhook, polling, queue)
    approval = await wait_for_approval(checkpoint_id)

    if approval.status == ApprovalStatus.REJECTED:
        return {"status": "rejected", "reason": approval.feedback}

    # Use the potentially modified plan
    approved_plan = approval.modified_content or plan.content[0].text

    # Step 2: Execute the approved plan
    execution = await async_client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=8192,
        messages=[{
            "role": "user",
            "content": f"Execute this approved plan:\n{approved_plan}"
        }]
    )

    return {"status": "completed", "result": execution.content[0].text}

Error Handling and Retry Strategies

Multi-step workflows need robust error handling because any step can fail.

import asyncio
from anthropic import APIError, RateLimitError

async def resilient_step(
    messages: list,
    model: str = "claude-sonnet-4-20250514",
    max_retries: int = 3,
    fallback_model: str = "claude-haiku-4-20250514"
) -> str:
    """Execute a step with retries and model fallback."""
    for attempt in range(max_retries):
        try:
            response = await async_client.messages.create(
                model=model,
                max_tokens=4096,
                messages=messages
            )
            return response.content[0].text

        except RateLimitError:
            wait_time = 2 ** attempt  # Exponential backoff
            await asyncio.sleep(wait_time)  # don't block the event loop

        except APIError as e:
            if attempt == max_retries - 1 and fallback_model:
                # Last resort: try a different model
                response = await async_client.messages.create(
                    model=fallback_model,
                    max_tokens=4096,
                    messages=messages
                )
                return response.content[0].text
            raise

    raise RuntimeError(f"Step failed after {max_retries} retries")

Cost Optimization: Model Routing Per Step

One of the biggest advantages of multi-step workflows is using the right model for each step. Not every step needs the most capable model.

| Step Type | Recommended Model | Why |
| --- | --- | --- |
| Classification / routing | Haiku | Fast, cheap, highly accurate for simple decisions |
| Data extraction | Haiku or Sonnet | Structured extraction is well-handled by smaller models |
| Complex analysis | Sonnet | Good balance of capability and cost |
| Critical decisions | Opus | Highest accuracy for high-stakes reasoning |
| Synthesis / writing | Sonnet | Strong writing quality at reasonable cost |

A typical workflow using model routing costs 40-60% less than using Sonnet for every step, with no measurable quality degradation.
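The actual savings depend on how tokens split across steps, so it is worth estimating for your own workflow. A back-of-envelope sketch; the per-million-token prices below are placeholders for illustration, not current pricing:

```python
# Placeholder per-million-output-token prices; check current pricing before relying on these
PRICE_PER_MTOK = {
    "claude-haiku-4-20250514": 4.00,    # assumed for illustration
    "claude-sonnet-4-20250514": 15.00,  # assumed for illustration
}

def workflow_cost(steps: list[tuple[str, int]]) -> float:
    """Estimate output-token cost for a list of (model, output_tokens) steps."""
    return sum(PRICE_PER_MTOK[model] * tokens / 1_000_000 for model, tokens in steps)

# Same workflow, with and without routing the cheap steps to Haiku
routed = [("claude-haiku-4-20250514", 900), ("claude-sonnet-4-20250514", 2800)]
all_sonnet = [("claude-sonnet-4-20250514", 900), ("claude-sonnet-4-20250514", 2800)]
print(f"routed: ${workflow_cost(routed):.4f} vs all-Sonnet: ${workflow_cost(all_sonnet):.4f}")
```

Plugging in your real per-step token counts and current prices turns the routing decision into a concrete number rather than a rule of thumb.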

Summary

Multi-step AI workflows transform Claude from a question-answering tool into a process automation engine. The four core patterns, sequential chains, parallel fan-out, conditional branching, and human-in-the-loop, can be combined to model almost any business process. The keys to production success are robust error handling with fallbacks, model routing for cost optimization, and checkpoint-based human oversight for high-stakes decisions.

Written by the CallSphere Team.