CrewAI Callbacks and Event Hooks: Monitoring Agent Progress in Real Time
Implement step callbacks, task callbacks, and custom event handlers in CrewAI to monitor agent reasoning in real time, log progress, and build observable multi-agent systems.
Why Observability Matters in Multi-Agent Systems
When a single LLM call produces unexpected output, you read the prompt and response. When a crew of five agents runs for three minutes and produces a poor result, debugging is exponentially harder. Which agent went off track? At which step? Did a tool return bad data? Did an agent misinterpret context from a previous task?
CrewAI's callback system solves this by giving you hooks into every step of agent execution. You can log progress, track costs, save intermediate results, send notifications, or halt execution — all without modifying your agent or task definitions.
Task Callbacks
The simplest callback is at the task level. It fires when a task completes and receives the task output:
from crewai import Agent, Task
import json
from datetime import datetime

def on_task_complete(output):
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "description": output.description[:80],
        "output_length": len(output.raw),
        "output_preview": output.raw[:200],
    }
    print(f"[TASK DONE] {json.dumps(log_entry, indent=2)}")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data",
    backstory="Expert researcher.",
)

task = Task(
    description="Research the top 5 AI startups funded in 2026.",
    expected_output="A numbered list with company name, funding amount, and focus area.",
    agent=researcher,
    callback=on_task_complete,
)
The callback receives a TaskOutput object with properties including raw (the string output), description (the task description), and agent (the agent that executed it). This is your primary tool for logging what each task produced.
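You can exercise the callback outside a full crew run by feeding it a stand-in object that carries the same fields it reads. The `SimpleNamespace` below is only a test double, not a CrewAI type, and a `return` is added so the entry is easy to inspect:

```python
import json
from datetime import datetime
from types import SimpleNamespace

def on_task_complete(output):
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "description": output.description[:80],
        "output_length": len(output.raw),
        "output_preview": output.raw[:200],
    }
    print(f"[TASK DONE] {json.dumps(log_entry, indent=2)}")
    return log_entry  # returned here only to make inspection easy

# Test double mimicking the TaskOutput fields the callback reads.
fake_output = SimpleNamespace(
    description="Research the top 5 AI startups funded in 2026.",
    raw="1. Example Corp -- $50M -- robotics",
)
entry = on_task_complete(fake_output)
```

This kind of dry run catches attribute typos in your callback before you pay for an LLM call.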
Step Callbacks
Step callbacks fire at each reasoning step within an agent's execution loop. They provide granular visibility into the agent's thought process, tool calls, and intermediate outputs:
from crewai import Agent

def on_agent_step(step_output):
    # The object passed here varies by CrewAI version (it may be an
    # AgentAction, AgentFinish, or ToolResult), so guard attribute access.
    print(f"[STEP] Step type: {type(step_output).__name__}")
    tool = getattr(step_output, "tool", None)
    if tool:
        print(f"[STEP] Tool used: {tool}")
        print(f"[STEP] Tool input: {getattr(step_output, 'tool_input', '')}")
    result = getattr(step_output, "result", None)
    if result:
        print(f"[STEP] Output: {str(result)[:150]}...")
    print("---")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data using web search",
    backstory="Expert online researcher.",
    step_callback=on_agent_step,
    verbose=True,
)
Step callbacks let you see exactly what the agent is thinking at each iteration. When an agent makes a bad tool call or misinterprets data, the step callback captures the exact moment things went wrong.
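Building on that, a callable object can record every step in memory so you can inspect the full trace after a run. The `getattr` defaults are deliberately defensive, since the step object's shape varies across CrewAI versions; pass the instance as `step_callback=recorder`:

```python
from types import SimpleNamespace

class StepRecorder:
    """Collects a snapshot of every step for post-run inspection."""

    def __init__(self):
        self.steps = []

    def __call__(self, step_output):
        # getattr with defaults keeps this working whatever the
        # concrete step type turns out to be.
        self.steps.append({
            "type": type(step_output).__name__,
            "tool": getattr(step_output, "tool", None),
            "preview": str(step_output)[:200],
        })

recorder = StepRecorder()

# Quick check with a stand-in step object (not a CrewAI type):
recorder(SimpleNamespace(tool="search", tool_input="AI startups"))
```

After `crew.kickoff()` completes, `recorder.steps` holds the whole trace in order, which is often easier to search than scrollback from `verbose=True`.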
Building a Structured Logger
For production systems, combine callbacks with a structured logging system:
import logging
import json
from datetime import datetime

logging.basicConfig(
    filename="crew_execution.log",
    level=logging.INFO,
    format="%(message)s",
)

class CrewLogger:
    def __init__(self, crew_name: str):
        self.crew_name = crew_name
        self.start_time = None
        self.task_count = 0

    def on_task_start(self):
        self.task_count += 1

    def on_task_complete(self, output):
        entry = {
            "crew": self.crew_name,
            "event": "task_complete",
            "task_number": self.task_count,
            "timestamp": datetime.now().isoformat(),
            "description": output.description[:100],
            "output_chars": len(output.raw),
        }
        logging.info(json.dumps(entry))

    def on_step(self, step_output):
        entry = {
            "crew": self.crew_name,
            "event": "agent_step",
            "task_number": self.task_count,
            "timestamp": datetime.now().isoformat(),
            # Step objects differ across CrewAI versions, so fall back
            # to the string form when there is no `tool` attribute.
            "action": str(getattr(step_output, "tool", None) or step_output)[:100],
        }
        logging.info(json.dumps(entry))

logger = CrewLogger("market_research")
Use the logger with your agents and tasks:
researcher = Agent(
    role="Researcher",
    goal="Find data",
    backstory="Expert researcher.",
    step_callback=logger.on_step,
)

task = Task(
    description="Research AI market trends.",
    expected_output="A summary of 5 trends.",
    agent=researcher,
    callback=logger.on_task_complete,
)
This produces a structured log file that can be ingested by any log aggregation system — ELK, Datadog, CloudWatch, or a simple script that parses JSON lines.
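Because each line is a single JSON object, downstream analysis needs nothing beyond the standard library. A minimal sketch of a summarizer for the log format above (the file name and `event` field match the examples in this post; the sample entries written here are just for demonstration):

```python
import json
from collections import Counter

def summarize_log(path="crew_execution.log"):
    """Count events per type from a JSON-lines crew log."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            counts[entry.get("event", "unknown")] += 1
    return dict(counts)

# Demo: write two sample entries, then summarize them.
with open("crew_execution.log", "w") as f:
    f.write(json.dumps({"event": "agent_step"}) + "\n")
    f.write(json.dumps({"event": "task_complete"}) + "\n")

counts = summarize_log()
print(counts)
```

The same few lines extend naturally to per-task step counts or latency histograms by reading the other fields the logger emits.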
Cost Tracking with Callbacks
One of the most practical uses of callbacks is tracking how much work each run performs. Counting steps and tool calls is a cheap proxy for token cost that you can collect live, without touching provider billing APIs:
class CostTracker:
    def __init__(self):
        self.total_steps = 0
        self.tool_calls = 0
        self.tasks_completed = 0

    def on_step(self, step_output):
        self.total_steps += 1
        # Not every step object carries a `tool` attribute, so use getattr.
        if getattr(step_output, "tool", None):
            self.tool_calls += 1

    def on_task_complete(self, output):
        self.tasks_completed += 1

    def summary(self):
        return {
            "total_steps": self.total_steps,
            "tool_calls": self.tool_calls,
            "tasks_completed": self.tasks_completed,
            "avg_steps_per_task": (
                self.total_steps / self.tasks_completed
                if self.tasks_completed > 0
                else 0
            ),
        }

tracker = CostTracker()
After a crew run, call tracker.summary() to understand how much work each execution required. Track this over time to identify optimization opportunities.
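To sanity-check the tracker without running a crew, drive it with stand-in step objects (the `SimpleNamespace` values are test doubles, not CrewAI types; the class is repeated so the snippet runs on its own):

```python
from types import SimpleNamespace

class CostTracker:
    def __init__(self):
        self.total_steps = 0
        self.tool_calls = 0
        self.tasks_completed = 0

    def on_step(self, step_output):
        self.total_steps += 1
        if getattr(step_output, "tool", None):
            self.tool_calls += 1

    def on_task_complete(self, output):
        self.tasks_completed += 1

    def summary(self):
        return {
            "total_steps": self.total_steps,
            "tool_calls": self.tool_calls,
            "tasks_completed": self.tasks_completed,
            "avg_steps_per_task": (
                self.total_steps / self.tasks_completed
                if self.tasks_completed > 0
                else 0
            ),
        }

tracker = CostTracker()

# Simulate one task: two tool-using steps and one final-answer step.
tracker.on_step(SimpleNamespace(tool="search"))
tracker.on_step(SimpleNamespace(tool="scraper"))
tracker.on_step(SimpleNamespace(tool=None))
tracker.on_task_complete(SimpleNamespace(raw="done"))

print(tracker.summary())
```

A run that suddenly averages far more steps per task than its historical baseline is usually the first visible symptom of a prompt or tool regression.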
Halting Execution from Callbacks
While CrewAI does not natively support halting execution from a callback, you can raise an exception to stop a run:
class SafetyGuard:
    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.step_count = 0

    def on_step(self, step_output):
        self.step_count += 1
        if self.step_count > self.max_steps:
            raise RuntimeError(
                f"Safety limit reached: {self.max_steps} steps exceeded. "
                "Agent may be in a loop."
            )
This prevents runaway agents from consuming unlimited tokens. Set the threshold based on your expected task complexity.
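A quick standalone check that the guard actually trips, using a deliberately low limit and dummy step payloads (the class from above is repeated so the snippet runs on its own):

```python
class SafetyGuard:
    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.step_count = 0

    def on_step(self, step_output):
        self.step_count += 1
        if self.step_count > self.max_steps:
            raise RuntimeError(
                f"Safety limit reached: {self.max_steps} steps exceeded. "
                "Agent may be in a loop."
            )

guard = SafetyGuard(max_steps=3)
tripped = False
try:
    for _ in range(5):
        guard.on_step(None)  # the step payload is irrelevant to the guard
except RuntimeError as exc:
    tripped = True
    print(exc)
```

Note that the exception fires on the step *after* the limit, so three steps pass and the fourth aborts the run.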
FAQ
Can I use async callbacks?
CrewAI's callback system currently expects synchronous functions. If you need to perform async operations (like writing to an async database), use a synchronous wrapper that schedules the async work or writes to a queue that an async consumer processes.
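One way to build that bridge with only the standard library: the synchronous callback just enqueues onto a `queue.Queue`, and a background thread (which could equally hand off to an asyncio task) drains it, so the agent loop never blocks on slow I/O. The entry shape and the `db.insert` mentioned in the comment are illustrative, not a real API:

```python
import queue
import threading
from types import SimpleNamespace

log_queue = queue.Queue()
processed = []

def on_step(step_output):
    # Synchronous and fast: enqueue and return immediately.
    log_queue.put({"tool": getattr(step_output, "tool", None)})

def consumer():
    # Stand-in for an async writer: drain until the sentinel arrives.
    while True:
        item = log_queue.get()
        if item is None:
            break
        processed.append(item)  # e.g. await db.insert(item) in real code

worker = threading.Thread(target=consumer)
worker.start()

# Simulate a few steps, then shut down cleanly via a sentinel value.
for tool in ["search", "scraper"]:
    on_step(SimpleNamespace(tool=tool))
log_queue.put(None)
worker.join()
```

`queue.Queue` is thread-safe, so the callback can run from the crew's execution thread while the consumer drains from another without extra locking.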
Do callbacks affect agent performance?
Callbacks add negligible overhead — they run between LLM calls, not during them. The LLM inference time dominates execution. A callback that takes 10 milliseconds is invisible when each LLM call takes 1 to 3 seconds.
Can I attach multiple callbacks to the same agent?
Not directly. The step_callback parameter accepts a single function. To run multiple handlers, create a dispatcher function that calls all your handlers sequentially within a single callback.
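A minimal dispatcher along those lines: wrap any number of handlers in one function and pass the result as `step_callback`. An error in one handler is caught so it cannot abort the others (or the run); the two handlers below are illustrative stand-ins:

```python
def combine_callbacks(*handlers):
    """Return a single callback that fans out to every handler."""
    def dispatch(step_output):
        for handler in handlers:
            try:
                handler(step_output)
            except Exception as exc:
                # A broken observer should not kill the crew run.
                print(f"callback {handler.__name__} failed: {exc}")
    return dispatch

calls = []

def log_handler(step_output):
    calls.append(("log", step_output))

def metrics_handler(step_output):
    calls.append(("metrics", step_output))

combined = combine_callbacks(log_handler, metrics_handler)
combined("step-1")  # stand-in for a real step object
```

Handlers run in the order you pass them, so put cheap, critical ones (like the safety guard) first.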
CallSphere Team
Expert insights on AI voice agents and customer communication automation.