Plan-and-Execute Agents: Separating Planning from Execution for Complex Tasks

Why Separate Planning from Execution?

Most basic AI agents operate in a tight loop: observe, think, act, repeat. This works for simple tasks, but breaks down on complex multi-step problems. The agent gets lost in execution details and loses sight of the overall strategy.

Plan-and-execute agents solve this by introducing a clear separation of concerns. A planner agent creates a high-level plan, and an executor agent carries out each step. After each step, a replanner evaluates progress and adjusts the plan if needed.

This mirrors how experienced engineers work: you sketch out an architecture before writing code, and you revise the plan when you hit unexpected obstacles.

The Architecture

The system has three components:

Planner — takes the original task and produces an ordered list of steps
Executor — takes a single step and executes it using tools or reasoning
Replanner — reviews completed steps and remaining steps, then decides whether to continue, modify, or replace the plan

import json

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class Plan(BaseModel):
    steps: list[str]
    current_step: int = 0  # index of the next step to execute

class StepResult(BaseModel):
    step: str
    output: str  # model output on success, error text on failure
    success: bool

def create_plan(task: str) -> Plan:
    """Planner agent: decompose task into ordered steps."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a planning agent. Break the task into 3-7 "
                "concrete, sequential steps. Each step should be "
                "independently executable. Return a JSON object of the "
                'form {"steps": ["...", "..."]}.'
            )},
            {"role": "user", "content": f"Task: {task}"},
        ],
        # JSON mode returns a JSON object, so the prompt asks for a "steps" key.
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return Plan(steps=data["steps"])
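
The planner's reply is model output, so it may be worth validating before building a plan from it. The following is a minimal sketch, assuming the reply is a JSON object with a "steps" list; the names ParsedPlan and parse_plan are hypothetical and use a stdlib dataclass as a stand-in for the Plan model so the example runs without an API key.

```python
import json
from dataclasses import dataclass


@dataclass
class ParsedPlan:
    """Stand-in for the article's Plan model (hypothetical name)."""
    steps: list[str]
    current_step: int = 0


def parse_plan(raw: str, max_steps: int = 7) -> ParsedPlan:
    """Validate a planner reply of the form {"steps": ["...", ...]}."""
    data = json.loads(raw)  # malformed JSON raises json.JSONDecodeError
    steps = data.get("steps") if isinstance(data, dict) else None
    if not isinstance(steps, list):
        raise ValueError('expected a JSON object with a "steps" list')
    # Drop non-string and blank entries, and trim whitespace.
    cleaned = [s.strip() for s in steps if isinstance(s, str) and s.strip()]
    if not cleaned:
        raise ValueError("planner returned no usable steps")
    return ParsedPlan(steps=cleaned[:max_steps])
```

Raising on a malformed reply lets the caller decide whether to retry the planner call or give up, rather than executing an empty or truncated plan.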

The Executor

The executor focuses on a single step at a time, with access to tools and context from previous steps:

def execute_step(step: str, context: list[StepResult]) -> StepResult:
    """Executor agent: carry out a single step."""
    context_str = "\n".join(
        f"Step: {r.step} -> Result: {r.output}" for r in context
    )

    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": (
                    "You are an execution agent. Complete the given step "
                    "using the context from previous steps. Be precise "
                    "and thorough."
                )},
                {"role": "user", "content": (
                    f"Previous results:\n{context_str}\n\n"
                    f"Current step to execute: {step}"
                )},
            ],
        )
    except Exception as exc:
        # Report API errors as a failed step so the replanner can react.
        return StepResult(
            step=step, output=f"{type(exc).__name__}: {exc}", success=False
        )

    output = response.choices[0].message.content or ""
    # An empty reply counts as a failure; stricter checks (tool errors,
    # output validation) would go here.
    return StepResult(step=step, output=output, success=bool(output.strip()))
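
The executor above calls the model directly and does not wire in any tools. As one possible shape for the tool access mentioned earlier, here is a minimal sketch of a tool registry whose calls report failure instead of raising. The registry, the tool names, and run_tool are all hypothetical and not part of any library.

```python
from typing import Callable

# Hypothetical tool registry: names and functions are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


def run_tool(name: str, argument: str) -> tuple[str, bool]:
    """Run one tool and return (output, success) instead of raising."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"unknown tool: {name}", False
    try:
        return tool(argument), True
    except Exception as exc:  # report any tool error as a failed step
        return f"{type(exc).__name__}: {exc}", False
```

The (output, success) pair maps directly onto the output and success fields of a StepResult, which gives the replanner a concrete error message to work with.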

Replanning on Failure

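The real power of this architecture emerges when things go wrong. Instead of blindly continuing, the replanner can adapt. In this implementation it calls the model only when the most recent step failed; otherwise it returns the plan unchanged: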

def replan_if_needed(
    original_task: str,
    plan: Plan,
    results: list[StepResult],
) -> Plan:
    """Replanner: assess progress and adjust the plan."""
    completed = results[-1] if results else None

    if completed and not completed.success:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": (
                    "You are a replanning agent. The last step failed. "
                    "Analyze why and create a revised plan for the "
                    "remaining work. You may add, remove, or reorder steps. "
                    'Return a JSON object of the form {"steps": ["...", "..."]}.'
                )},
                {"role": "user", "content": (
                    f"Original task: {original_task}\n"
                    f"Failed step: {completed.step}\n"
                    f"Error: {completed.output}\n"
                    f"Remaining steps: {plan.steps[plan.current_step:]}"
                )},
            ],
            response_format={"type": "json_object"},
        )
        import json
        data = json.loads(response.choices[0].message.content)
        return Plan(steps=data["steps"])

    return plan  # no replanning needed
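
The replanner above returns a fresh plan that covers only the remaining work, so the step index restarts at zero. An alternative, sketched here under the assumption that you want the finished steps kept in the plan for logging, is to splice the revised steps onto the completed prefix. The helper name splice_plan is hypothetical.

```python
def splice_plan(
    steps: list[str], current_step: int, revised: list[str]
) -> list[str]:
    """Keep the steps before current_step and replace the rest."""
    return steps[:current_step] + revised


old_steps = ["collect sources", "scrape site", "write summary"]
# Step index 1 failed; the replanner proposed two replacement steps.
new_steps = splice_plan(old_steps, 1, ["use cached copy", "write summary"])
print(new_steps)
# ['collect sources', 'use cached copy', 'write summary']
```

With this variant the current step index stays where it was, because the completed prefix is unchanged.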

The Orchestration Loop

Tying it all together:

def plan_and_execute(task: str, max_replans: int = 3) -> list[StepResult]:
    plan = create_plan(task)
    results: list[StepResult] = []
    replans = 0

    while plan.current_step < len(plan.steps):
        step = plan.steps[plan.current_step]
        print(f"Executing step {plan.current_step + 1}: {step}")

        result = execute_step(step, results)
        results.append(result)

        if not result.success:
            if replans >= max_replans:
                # Replan budget exhausted: stop and return partial results.
                break
            plan = replan_if_needed(task, plan, results)
            replans += 1
            continue

        plan.current_step += 1

    return results
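
To check the control flow without calling the API, the loop can be exercised with stub functions. This is a minimal sketch, not the code above: run_loop, Result, fake_execute, and fake_replan are hypothetical names, and this version stops and returns partial results once the replan budget is spent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    step: str
    output: str
    success: bool


def run_loop(
    steps: list[str],
    execute: Callable[[str], Result],
    replan: Callable[[list[str]], list[str]],
    max_replans: int = 3,
) -> list[Result]:
    """Plan-and-execute control flow with injected executor and replanner."""
    results: list[Result] = []
    i = 0
    replans = 0
    while i < len(steps):
        result = execute(steps[i])
        results.append(result)
        if not result.success:
            if replans >= max_replans:
                break  # budget exhausted: return partial results
            steps = replan(steps[i:])  # revised plan covers remaining work
            i = 0
            replans += 1
            continue
        i += 1
    return results


def fake_execute(step: str) -> Result:
    """Stub executor: any step containing 'flaky' fails."""
    ok = "flaky" not in step
    return Result(step, "done" if ok else "simulated error", ok)


def fake_replan(remaining: list[str]) -> list[str]:
    """Stub replanner: swap the failed first step for a fallback."""
    return ["fallback fetch"] + remaining[1:]


results = run_loop(["flaky fetch", "summarize"], fake_execute, fake_replan)
print([(r.step, r.success) for r in results])
# [('flaky fetch', False), ('fallback fetch', True), ('summarize', True)]
```

The failed attempt stays in the results list, so later steps and any final report can see what was tried.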

When to Use Plan-and-Execute

This architecture shines for tasks like research reports (plan sections, write each, revise), data pipelines (plan transforms, execute sequentially), and code generation (plan modules, implement each). It adds overhead for simple tasks, so use a standard ReAct agent when the task requires fewer than three steps.

FAQ

How granular should the plan steps be?

Each step should be completable in a single LLM call with tool access. If a step requires sub-planning, it is too coarse. Aim for 3-7 steps for most tasks. The planner can always decompose further during replanning.

How does this compare to ReAct agents?

ReAct interleaves reasoning and action in a single loop. Plan-and-execute separates them explicitly. ReAct is better for exploratory tasks where the path is unclear. Plan-and-execute is better for structured tasks where you can outline the approach upfront.

What happens if replanning keeps failing?

Set a max_replans limit (typically 2-3). If the agent exhausts its replans, return partial results with a clear failure report. In production, this should trigger a human-in-the-loop escalation.
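
A failure report for the partial-results case could look like the following. This is a minimal sketch: Outcome is a stdlib stand-in for the StepResult model, and failure_report and the status strings are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    step: str
    output: str
    success: bool


def failure_report(task: str, results: list[Outcome], replans_used: int) -> str:
    """Summarize a run; flag it for review if the last step failed."""
    done = [r for r in results if r.success]
    lines = [
        f"Task: {task}",
        f"Replans used: {replans_used}",
        f"Completed steps: {len(done)}",
    ]
    lines += [f"  ok: {r.step}" for r in done]
    if results and not results[-1].success:
        last = results[-1]
        lines.append(f"Last failure: {last.step} ({last.output})")
        lines.append("Status: NEEDS_HUMAN_REVIEW")
    else:
        lines.append("Status: COMPLETE")
    return "\n".join(lines)
```

The status line gives downstream code a single string to check when deciding whether to escalate to a person.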
