CrewAI Callbacks and Event Hooks: Monitoring Agent Progress in Real Time
Implement step callbacks, task callbacks, and custom event handlers in CrewAI to monitor agent reasoning in real time, log progress, and build observable multi-agent systems.
Why Observability Matters in Multi-Agent Systems
When a single LLM call produces unexpected output, you read the prompt and response. When a crew of five agents runs for three minutes and produces a poor result, debugging is exponentially harder. Which agent went off track? At which step? Did a tool return bad data? Did an agent misinterpret context from a previous task?
CrewAI's callback system solves this by giving you hooks into every step of agent execution. You can log progress, track costs, save intermediate results, send notifications, or halt execution — all without modifying your agent or task definitions.
Task Callbacks
The simplest callback is at the task level. It fires when a task completes and receives the task output:
from crewai import Agent, Task
import json
from datetime import datetime

def on_task_complete(output):
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "description": output.description[:80],
        "output_length": len(output.raw),
        "output_preview": output.raw[:200],
    }
    print(f"[TASK DONE] {json.dumps(log_entry, indent=2)}")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data",
    backstory="Expert researcher.",
)

task = Task(
    description="Research the top 5 AI startups funded in 2026.",
    expected_output="A numbered list with company name, funding amount, and focus area.",
    agent=researcher,
    callback=on_task_complete,
)
The callback receives a TaskOutput object with properties including raw (the string output), description (the task description), and agent (the agent that executed it). This is your primary tool for logging what each task produced.
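You can exercise the callback outside a full crew run by feeding it a stand-in object that carries the same fields it reads. The `SimpleNamespace` below is only a test double, not a CrewAI type, and a `return` is added so the entry is easy to inspect:

```python
import json
from datetime import datetime
from types import SimpleNamespace

def on_task_complete(output):
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "description": output.description[:80],
        "output_length": len(output.raw),
        "output_preview": output.raw[:200],
    }
    print(f"[TASK DONE] {json.dumps(log_entry, indent=2)}")
    return log_entry  # returned here only to make inspection easy

# Test double mimicking the TaskOutput fields the callback reads.
fake_output = SimpleNamespace(
    description="Research the top 5 AI startups funded in 2026.",
    raw="1. Example Corp -- $50M -- robotics",
)
entry = on_task_complete(fake_output)
```

This kind of dry run catches attribute typos in your callback before you pay for an LLM call.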
Step Callbacks
Step callbacks fire at each reasoning step within an agent's execution loop. They provide granular visibility into the agent's thought process, tool calls, and intermediate outputs:
from crewai import Agent

def on_agent_step(step_output):
    # The object passed here varies by CrewAI version (it may be an
    # AgentAction, AgentFinish, or ToolResult), so guard attribute access.
    print(f"[STEP] Step type: {type(step_output).__name__}")
    tool = getattr(step_output, "tool", None)
    if tool:
        print(f"[STEP] Tool used: {tool}")
        print(f"[STEP] Tool input: {getattr(step_output, 'tool_input', '')}")
    result = getattr(step_output, "result", None)
    if result:
        print(f"[STEP] Output: {str(result)[:150]}...")
    print("---")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data using web search",
    backstory="Expert online researcher.",
    step_callback=on_agent_step,
    verbose=True,
)
Step callbacks let you see exactly what the agent is thinking at each iteration. When an agent makes a bad tool call or misinterprets data, the step callback captures the exact moment things went wrong.
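Building on that, a callable object can record every step in memory so you can inspect the full trace after a run. The `getattr` defaults are deliberately defensive, since the step object's shape varies across CrewAI versions; pass the instance as `step_callback=recorder`:

```python
from types import SimpleNamespace

class StepRecorder:
    """Collects a snapshot of every step for post-run inspection."""

    def __init__(self):
        self.steps = []

    def __call__(self, step_output):
        # getattr with defaults keeps this working whatever the
        # concrete step type turns out to be.
        self.steps.append({
            "type": type(step_output).__name__,
            "tool": getattr(step_output, "tool", None),
            "preview": str(step_output)[:200],
        })

recorder = StepRecorder()

# Quick check with a stand-in step object (not a CrewAI type):
recorder(SimpleNamespace(tool="search", tool_input="AI startups"))
```

After `crew.kickoff()` completes, `recorder.steps` holds the whole trace in order, which is often easier to search than scrollback from `verbose=True`.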
Building a Structured Logger
For production systems, combine callbacks with a structured logging system:
import logging
import json
from datetime import datetime

logging.basicConfig(
    filename="crew_execution.log",
    level=logging.INFO,
    format="%(message)s",
)

class CrewLogger:
    def __init__(self, crew_name: str):
        self.crew_name = crew_name
        self.start_time = None
        self.task_count = 0

    def on_task_start(self):
        self.task_count += 1

    def on_task_complete(self, output):
        entry = {
            "crew": self.crew_name,
            "event": "task_complete",
            "task_number": self.task_count,
            "timestamp": datetime.now().isoformat(),
            "description": output.description[:100],
            "output_chars": len(output.raw),
        }
        logging.info(json.dumps(entry))

    def on_step(self, step_output):
        entry = {
            "crew": self.crew_name,
            "event": "agent_step",
            "task_number": self.task_count,
            "timestamp": datetime.now().isoformat(),
            # Step objects differ across CrewAI versions, so fall back
            # to the string form when there is no `tool` attribute.
            "action": str(getattr(step_output, "tool", None) or step_output)[:100],
        }
        logging.info(json.dumps(entry))

logger = CrewLogger("market_research")
Use the logger with your agents and tasks:
researcher = Agent(
    role="Researcher",
    goal="Find data",
    backstory="Expert researcher.",
    step_callback=logger.on_step,
)

task = Task(
    description="Research AI market trends.",
    expected_output="A summary of 5 trends.",
    agent=researcher,
    callback=logger.on_task_complete,
)
This produces a structured log file that can be ingested by any log aggregation system — ELK, Datadog, CloudWatch, or a simple script that parses JSON lines.
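Because each line is a single JSON object, downstream analysis needs nothing beyond the standard library. A minimal sketch of a summarizer for the log format above (the file name and `event` field match the examples in this post; the sample entries written here are just for demonstration):

```python
import json
from collections import Counter

def summarize_log(path="crew_execution.log"):
    """Count events per type from a JSON-lines crew log."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            counts[entry.get("event", "unknown")] += 1
    return dict(counts)

# Demo: write two sample entries, then summarize them.
with open("crew_execution.log", "w") as f:
    f.write(json.dumps({"event": "agent_step"}) + "\n")
    f.write(json.dumps({"event": "task_complete"}) + "\n")

counts = summarize_log()
print(counts)
```

The same few lines extend naturally to per-task step counts or latency histograms by reading the other fields the logger emits.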
Cost Tracking with Callbacks
One of the most practical uses of callbacks is tracking how much work each run performs. Counting steps and tool calls is a cheap proxy for token cost that you can collect live, without touching provider billing APIs:
class CostTracker:
    def __init__(self):
        self.total_steps = 0
        self.tool_calls = 0
        self.tasks_completed = 0

    def on_step(self, step_output):
        self.total_steps += 1
        # Not every step object carries a `tool` attribute, so use getattr.
        if getattr(step_output, "tool", None):
            self.tool_calls += 1

    def on_task_complete(self, output):
        self.tasks_completed += 1

    def summary(self):
        return {
            "total_steps": self.total_steps,
            "tool_calls": self.tool_calls,
            "tasks_completed": self.tasks_completed,
            "avg_steps_per_task": (
                self.total_steps / self.tasks_completed
                if self.tasks_completed > 0
                else 0
            ),
        }

tracker = CostTracker()
After a crew run, call tracker.summary() to understand how much work each execution required. Track this over time to identify optimization opportunities.
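To sanity-check the tracker without running a crew, drive it with stand-in step objects (the `SimpleNamespace` values are test doubles, not CrewAI types; the class is repeated so the snippet runs on its own):

```python
from types import SimpleNamespace

class CostTracker:
    def __init__(self):
        self.total_steps = 0
        self.tool_calls = 0
        self.tasks_completed = 0

    def on_step(self, step_output):
        self.total_steps += 1
        if getattr(step_output, "tool", None):
            self.tool_calls += 1

    def on_task_complete(self, output):
        self.tasks_completed += 1

    def summary(self):
        return {
            "total_steps": self.total_steps,
            "tool_calls": self.tool_calls,
            "tasks_completed": self.tasks_completed,
            "avg_steps_per_task": (
                self.total_steps / self.tasks_completed
                if self.tasks_completed > 0
                else 0
            ),
        }

tracker = CostTracker()

# Simulate one task: two tool-using steps and one final-answer step.
tracker.on_step(SimpleNamespace(tool="search"))
tracker.on_step(SimpleNamespace(tool="scraper"))
tracker.on_step(SimpleNamespace(tool=None))
tracker.on_task_complete(SimpleNamespace(raw="done"))

print(tracker.summary())
```

A run that suddenly averages far more steps per task than its historical baseline is usually the first visible symptom of a prompt or tool regression.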
Halting Execution from Callbacks
While CrewAI does not natively support halting execution from a callback, you can raise an exception to stop a run:
class SafetyGuard:
    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.step_count = 0

    def on_step(self, step_output):
        self.step_count += 1
        if self.step_count > self.max_steps:
            raise RuntimeError(
                f"Safety limit reached: {self.max_steps} steps exceeded. "
                "Agent may be in a loop."
            )
This prevents runaway agents from consuming unlimited tokens. Set the threshold based on your expected task complexity.
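A quick standalone check that the guard actually trips, using a deliberately low limit and dummy step payloads (the class from above is repeated so the snippet runs on its own):

```python
class SafetyGuard:
    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.step_count = 0

    def on_step(self, step_output):
        self.step_count += 1
        if self.step_count > self.max_steps:
            raise RuntimeError(
                f"Safety limit reached: {self.max_steps} steps exceeded. "
                "Agent may be in a loop."
            )

guard = SafetyGuard(max_steps=3)
tripped = False
try:
    for _ in range(5):
        guard.on_step(None)  # the step payload is irrelevant to the guard
except RuntimeError as exc:
    tripped = True
    print(exc)
```

Note that the exception fires on the step *after* the limit, so three steps pass and the fourth aborts the run.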
FAQ
Can I use async callbacks?
CrewAI's callback system currently expects synchronous functions. If you need to perform async operations (like writing to an async database), use a synchronous wrapper that schedules the async work or writes to a queue that an async consumer processes.
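One way to build that bridge with only the standard library: the synchronous callback just enqueues onto a `queue.Queue`, and a background thread (which could equally hand off to an asyncio task) drains it, so the agent loop never blocks on slow I/O. The entry shape and the `db.insert` mentioned in the comment are illustrative, not a real API:

```python
import queue
import threading
from types import SimpleNamespace

log_queue = queue.Queue()
processed = []

def on_step(step_output):
    # Synchronous and fast: enqueue and return immediately.
    log_queue.put({"tool": getattr(step_output, "tool", None)})

def consumer():
    # Stand-in for an async writer: drain until the sentinel arrives.
    while True:
        item = log_queue.get()
        if item is None:
            break
        processed.append(item)  # e.g. await db.insert(item) in real code

worker = threading.Thread(target=consumer)
worker.start()

# Simulate a few steps, then shut down cleanly via a sentinel value.
for tool in ["search", "scraper"]:
    on_step(SimpleNamespace(tool=tool))
log_queue.put(None)
worker.join()
```

`queue.Queue` is thread-safe, so the callback can run from the crew's execution thread while the consumer drains from another without extra locking.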
Do callbacks affect agent performance?
Callbacks add negligible overhead — they run between LLM calls, not during them. The LLM inference time dominates execution. A callback that takes 10 milliseconds is invisible when each LLM call takes 1 to 3 seconds.
Can I attach multiple callbacks to the same agent?
Not directly. The step_callback parameter accepts a single function. To run multiple handlers, create a dispatcher function that calls all your handlers sequentially within a single callback.
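A minimal dispatcher along those lines: wrap any number of handlers in one function and pass the result as `step_callback`. An error in one handler is caught so it cannot abort the others (or the run); the two handlers below are illustrative stand-ins:

```python
def combine_callbacks(*handlers):
    """Return a single callback that fans out to every handler."""
    def dispatch(step_output):
        for handler in handlers:
            try:
                handler(step_output)
            except Exception as exc:
                # A broken observer should not kill the crew run.
                print(f"callback {handler.__name__} failed: {exc}")
    return dispatch

calls = []

def log_handler(step_output):
    calls.append(("log", step_output))

def metrics_handler(step_output):
    calls.append(("metrics", step_output))

combined = combine_callbacks(log_handler, metrics_handler)
combined("step-1")  # stand-in for a real step object
```

Handlers run in the order you pass them, so put cheap, critical ones (like the safety guard) first.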
CallSphere Team
Expert insights on AI voice agents and customer communication automation.