Prompt Chaining: Breaking Complex Tasks into Sequential LLM Calls
Learn how to decompose complex AI tasks into sequential prompt chains — passing intermediate results between LLM calls, handling errors in pipelines, and building reliable multi-step workflows.
Why Single Prompts Are Not Enough
As tasks grow in complexity, single prompts become unreliable. Asking an LLM to simultaneously analyze data, generate a report, and format it as a structured document invites errors at every level. Prompt chaining solves this by decomposing complex tasks into a sequence of focused LLM calls, where each call handles one well-defined step and passes its output to the next.
This is analogous to Unix pipes — small, composable operations chained together to accomplish complex workflows.
Basic Chain Pattern
The simplest chain passes the output of one call as input to the next:
```python
from openai import OpenAI

client = OpenAI()


def llm_call(system: str, user: str, model: str = "gpt-4o") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content


def analyze_and_report(raw_data: str) -> dict:
    # Step 1: Extract key metrics
    metrics = llm_call(
        system="Extract numerical metrics from the data. Return as a bullet list of metric: value pairs.",
        user=raw_data,
    )

    # Step 2: Analyze trends
    analysis = llm_call(
        system="You are a data analyst. Analyze the metrics for trends, anomalies, and insights.",
        user=f"Metrics:\n{metrics}",
    )

    # Step 3: Generate executive summary
    summary = llm_call(
        system="Write a 3-sentence executive summary for a non-technical audience.",
        user=f"Analysis:\n{analysis}",
    )

    return {
        "metrics": metrics,
        "analysis": analysis,
        "summary": summary,
    }
```
Each step has a narrow, clearly defined task. The extraction step does not need to analyze. The analysis step does not need to format for executives. This separation produces better results at every stage.
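Because each step is just a function call, the wiring between steps is easy to exercise without touching an API: inject a stub in place of the real llm_call and trace how each output becomes the next input. A minimal sketch — `stub_llm_call`, `run_report_chain`, and the canned responses are hypothetical, for illustration only:

```python
# Stub standing in for llm_call, with hypothetical canned outputs
# so we can trace the data flow between steps.
def stub_llm_call(system: str, user: str) -> str:
    if "Extract" in system:
        return "- revenue: 120\n- churn: 4%"
    if "analyst" in system:
        return f"Trend analysis of:\n{user}"
    return f"Summary of:\n{user}"


def run_report_chain(raw_data: str, llm=stub_llm_call) -> dict:
    # Same three-step wiring as analyze_and_report, with the LLM injected
    metrics = llm("Extract numerical metrics from the data.", raw_data)
    analysis = llm("You are a data analyst.", f"Metrics:\n{metrics}")
    summary = llm("Write an executive summary.", f"Analysis:\n{analysis}")
    return {"metrics": metrics, "analysis": analysis, "summary": summary}


result = run_report_chain("Q3 raw data...")
```

Passing the LLM function as a parameter also makes the chain unit-testable: you can assert that each step's output actually flows into the next step's input.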
Building a Chain Pipeline Class
For production systems, formalize chains with a pipeline abstraction:
```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChainStep:
    name: str
    system_prompt: str
    input_formatter: Callable[[dict], str]
    output_key: str
    model: str = "gpt-4o"


class PromptChain:
    def __init__(self, steps: list[ChainStep]):
        self.steps = steps
        self.client = OpenAI()

    def run(self, initial_input: str) -> dict:
        context = {"initial_input": initial_input}
        for step in self.steps:
            user_message = step.input_formatter(context)
            response = self.client.chat.completions.create(
                model=step.model,
                messages=[
                    {"role": "system", "content": step.system_prompt},
                    {"role": "user", "content": user_message},
                ],
            )
            result = response.choices[0].message.content
            context[step.output_key] = result
            print(f"[{step.name}] completed -> {len(result)} chars")
        return context


# Define a review pipeline
review_chain = PromptChain([
    ChainStep(
        name="extract_code",
        system_prompt="Extract all code blocks from the pull request description. Return only the code.",
        input_formatter=lambda ctx: ctx["initial_input"],
        output_key="code",
    ),
    ChainStep(
        name="find_issues",
        system_prompt="Review the code for bugs, security issues, and performance problems. List each issue.",
        input_formatter=lambda ctx: ctx["code"],
        output_key="issues",
    ),
    ChainStep(
        name="format_review",
        system_prompt="Format the code review issues as a GitHub review comment with severity labels.",
        input_formatter=lambda ctx: f"Issues found:\n{ctx['issues']}",
        output_key="review",
    ),
])

# pr_description holds the raw pull request text to review
results = review_chain.run(pr_description)
print(results["review"])
```
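Because every input_formatter receives the whole context dict, a later step is not limited to the previous step's output: it can combine several earlier results. A small sketch of such a formatter — the step name, keys, and sample values here are hypothetical:

```python
# A formatter can read any earlier output from the shared context,
# so a late step can see both the extracted code and the issue list.
def summary_formatter(ctx: dict) -> str:
    return (
        f"Code under review:\n{ctx['code']}\n\n"
        f"Issues found:\n{ctx['issues']}"
    )


# Context as it might look after the first two steps have run
# (hypothetical sample values):
context = {
    "initial_input": "pull request text...",
    "code": "def f(x): return x * 2",
    "issues": "1. Missing type hints",
}
message = summary_formatter(context)
```

This is the main advantage of accumulating results in a context dict rather than passing a single string down the chain.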
Error Handling in Chains
A chain is only as strong as its weakest link. Build error handling into the pipeline:
```python
import logging

logger = logging.getLogger(__name__)


class ResilientChain:
    def __init__(self, steps: list[ChainStep], max_retries: int = 2):
        self.steps = steps
        self.max_retries = max_retries
        self.client = OpenAI()

    def _execute_step(self, step: ChainStep, user_message: str) -> str:
        for attempt in range(self.max_retries + 1):
            try:
                response = self.client.chat.completions.create(
                    model=step.model,
                    messages=[
                        {"role": "system", "content": step.system_prompt},
                        {"role": "user", "content": user_message},
                    ],
                )
                result = response.choices[0].message.content
                if not result or not result.strip():
                    raise ValueError("Empty response from LLM")
                return result
            except Exception as e:
                logger.warning(
                    f"Step '{step.name}' attempt {attempt + 1} failed: {e}"
                )
                if attempt == self.max_retries:
                    raise RuntimeError(
                        f"Step '{step.name}' failed after {self.max_retries + 1} attempts"
                    ) from e

    def run(self, initial_input: str) -> dict:
        context = {"initial_input": initial_input}
        for i, step in enumerate(self.steps):
            try:
                user_message = step.input_formatter(context)
                context[step.output_key] = self._execute_step(step, user_message)
            except RuntimeError as e:
                logger.error(f"Chain failed at step {i} ({step.name}): {e}")
                context["error"] = str(e)
                context["failed_step"] = step.name
                break
        return context
```
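The retry loop above retries immediately, which works poorly for rate-limit errors. One common refinement, sketched here around a generic callable rather than the API client (the flaky function is a stand-in for an LLM call), is exponential backoff between attempts:

```python
import time


def call_with_backoff(fn, max_retries: int = 2, base_delay: float = 1.0):
    """Retry fn(), sleeping base_delay * 2**attempt between failures."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))


# Stand-in for a flaky API call: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


result = call_with_backoff(flaky, max_retries=2, base_delay=0.01)
```

The same wrapper could be applied inside `_execute_step` so that each chain step backs off independently.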
Conditional Branching
Not all chains are linear. Sometimes you need to branch based on intermediate results:
```python
def classify_and_route(customer_message: str) -> str:
    # Step 1: Classify the intent
    intent = llm_call(
        system="Classify the customer message as: billing, technical, general, or urgent. Return only the category.",
        user=customer_message,
    ).strip().lower()

    # Step 2: Route to a specialized prompt based on the classification
    specialized_prompts = {
        "billing": "You are a billing specialist. Help resolve payment and subscription issues.",
        "technical": "You are a senior support engineer. Diagnose and solve technical problems.",
        "urgent": "You are an escalation handler. Acknowledge the urgency, gather details, and create a priority ticket.",
        "general": "You are a friendly support agent. Answer general questions about our product.",
    }
    # Fall back to the general prompt if the classifier returns something unexpected
    system = specialized_prompts.get(intent, specialized_prompts["general"])

    # Step 3: Generate the response with the specialized persona
    return llm_call(system=system, user=customer_message)
```
This pattern — classify first, then route — is fundamental to building agentic systems. Each branch can use a different model, temperature, or even a different prompt chain.
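If branches need different settings, the routing table can carry a model and temperature per intent rather than just a system prompt. A hypothetical sketch of that extension — `RouteConfig`, the per-intent settings, and `resolve_route` are not from the original, just one way to structure it:

```python
from dataclasses import dataclass


@dataclass
class RouteConfig:
    system_prompt: str
    model: str = "gpt-4o"
    temperature: float = 0.7


# Hypothetical per-intent settings: urgent gets a lower temperature
# for more deterministic output, billing a cheaper model.
routes = {
    "billing": RouteConfig("You are a billing specialist.", model="gpt-4o-mini"),
    "urgent": RouteConfig("You are an escalation handler.", temperature=0.2),
    "general": RouteConfig("You are a friendly support agent."),
}


def resolve_route(intent: str) -> RouteConfig:
    # Unknown intents fall back to the general route.
    return routes.get(intent, routes["general"])


cfg = resolve_route("urgent")
```

The response step then reads `cfg.model` and `cfg.temperature` instead of hard-coded values.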
FAQ
How many steps should a prompt chain have?
Keep chains to 2-5 steps. Each step adds latency and the risk of error compounding. If your chain has more than 5 steps, consider whether some steps can be combined or whether a single well-crafted prompt could replace part of the chain.
How do I debug a failing chain?
Log the full input and output of every step. When a chain produces bad results, inspect each step's output to find where quality degrades. Often the issue is in the input formatting between steps — the output of step N does not match what step N+1 expects.
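One lightweight way to get that per-step visibility (a sketch, not part of the pipeline classes above) is a wrapper that records every step's input and output in a trace list you can inspect afterwards:

```python
def traced(step_fn, trace: list):
    """Wrap a step function so each call's input/output is recorded."""
    def wrapper(user_message: str) -> str:
        output = step_fn(user_message)
        trace.append({"input": user_message, "output": output})
        return output
    return wrapper


# Stub steps standing in for LLM calls:
trace = []
extract = traced(lambda msg: msg.upper(), trace)
summarize = traced(lambda msg: msg[:10], trace)

result = summarize(extract("raw input text"))
# trace[0]["output"] should equal trace[1]["input"]; if it doesn't,
# the formatting between steps is where the chain breaks.
```

Comparing step N's recorded output against step N+1's recorded input pinpoints exactly the mismatch the answer above describes.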
Is prompt chaining the same as using agents with tools?
No. Prompt chaining is a predefined sequence of calls that you design. Agent tool use is dynamic — the model decides at runtime which tools to call and in what order. Chains are simpler, more predictable, and easier to debug. Use chains when the workflow is known; use agents when the workflow must be discovered.
CallSphere Team
Expert insights on AI voice agents and customer communication automation.