Building a Research Agent with Web Search and Report Generation: Complete Tutorial
Build a research agent that searches the web, extracts and synthesizes data, and generates formatted reports using OpenAI Agents SDK and web search tools.
The Research Agent Use Case
Research is inherently agentic work. A human researcher formulates queries, searches multiple sources, evaluates credibility, extracts key findings, synthesizes information across sources, and produces a coherent report. An AI research agent follows the same workflow but executes it in seconds rather than hours.
In this tutorial, you will build a research agent that accepts a topic, searches the web for relevant information, extracts and validates data from multiple sources, and generates a structured Markdown report. The agent uses a multi-step reasoning loop — it does not just search once and summarize. It iteratively refines its queries based on what it learns.
System Architecture
The research agent uses a three-phase architecture:
Phase 1: Query Expansion
Topic → Generate 3-5 search queries → Prioritize by specificity
Phase 2: Search and Extract
For each query → Web search → Extract key claims → Score source credibility
Phase 3: Synthesis and Report
Deduplicate findings → Cross-reference claims → Generate Markdown report
The agent orchestrates all three phases autonomously, deciding when it has enough information to write the report or when additional searches are needed.
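The three-phase loop above can be sketched as plain Python (illustrative only; `research`, `search`, `extract`, and `write` are placeholder names here, with the real tools built in the steps below):

```python
def research(topic, search, extract, write):
    """Minimal control-flow sketch of the three-phase architecture."""
    # Phase 1: expand the topic into prioritized queries
    queries = [f"{topic} overview", f"{topic} statistics", f"{topic} criticism"]
    # Phase 2: search each query and extract claims from each hit
    findings = {}
    for query in queries:
        for hit in search(query):
            findings[hit["url"]] = extract(hit["url"])  # dedupe by URL
    # Phase 3: synthesize the deduplicated findings into a report
    return write(topic, findings)
```

The real agent makes these decisions dynamically through its reasoning loop rather than a fixed `for` loop, but the data flow is the same.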
Prerequisites
- Python 3.11+
- OpenAI API key
- Tavily API key for web search (free tier includes 1000 searches/month)
Step 1: Install Dependencies
pip install openai-agents tavily-python httpx beautifulsoup4 markdownify pydantic python-dotenv
Step 2: Build the Web Search Tool
The search tool wraps the Tavily API, which provides clean, structured search results optimized for AI agents:
```python
# tools/web_search.py
from agents import function_tool
from tavily import TavilyClient
import os

tavily = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))

@function_tool
def web_search(query: str, max_results: int = 5) -> str:
    """Search the web for information on a given query. Returns titles,
    URLs, and content snippets from the top results. Use specific,
    detailed queries for better results."""
    try:
        response = tavily.search(
            query=query,
            max_results=max_results,
            include_raw_content=False,
            search_depth="advanced",
        )
        results = []
        for r in response.get("results", []):
            results.append(
                f"**{r['title']}**\n"
                f"URL: {r['url']}\n"
                f"Score: {r['score']:.2f}\n"
                f"Content: {r['content'][:500]}"
            )
        return "\n\n---\n\n".join(results) if results else "No results found."
    except Exception as e:
        return f"Search error: {str(e)}"
```
Step 3: Build the Content Extraction Tool
For deeper analysis, the agent needs to extract full content from specific pages:
```python
# tools/extract_content.py
from agents import function_tool
import httpx
from bs4 import BeautifulSoup
from markdownify import markdownify

@function_tool
def extract_page_content(url: str) -> str:
    """Extract and clean the main content from a web page. Use this when
    you need more detail from a search result. Returns clean text content."""
    try:
        headers = {"User-Agent": "ResearchAgent/1.0"}
        response = httpx.get(url, headers=headers, timeout=15, follow_redirects=True)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        # Remove noise elements
        for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
            tag.decompose()
        # Try to find main content
        main = soup.find("main") or soup.find("article") or soup.find("body")
        if not main:
            return "Could not extract content from this page."
        text = markdownify(str(main), heading_style="ATX")
        # Truncate to reasonable length
        if len(text) > 4000:
            text = text[:4000] + "\n\n[Content truncated...]"
        return f"Content from {url}:\n\n{text}"
    except Exception as e:
        return f"Extraction error for {url}: {str(e)}"
```
Step 4: Build the Report Writer Tool
The report writer formats the agent's findings into a structured Markdown document:
```python
# tools/report_writer.py
from agents import function_tool
from datetime import datetime
import os

@function_tool
def write_report(
    title: str,
    executive_summary: str,
    sections: str,
    sources: str,
    output_filename: str = "report.md",
) -> str:
    """Write a formatted Markdown research report to disk. The sections
    parameter should be the full Markdown body. Sources should be a
    numbered list of URLs with titles."""
    report = f"""# {title}

**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M')}
**Agent:** Research Agent v1.0

## Executive Summary

{executive_summary}

{sections}

## Sources

{sources}

---

*This report was generated by an AI research agent. All claims should be
independently verified before use in decision-making.*
"""
    output_dir = os.getenv("REPORT_OUTPUT_DIR", "./reports")
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, output_filename)
    with open(path, "w") as f:
        f.write(report)
    return f"Report written to {path} ({len(report)} characters, {report.count(chr(10))} lines)"
```
Step 5: Build the Query Expansion Tool
This tool helps the agent generate diverse search queries to cover the topic comprehensively:
```python
# tools/query_expander.py
from agents import function_tool

@function_tool
def expand_research_queries(topic: str, num_queries: int = 5) -> str:
    """Generate multiple search queries for a research topic. This tool
    creates diverse queries covering different aspects: definitions,
    recent developments, expert opinions, statistics, and comparisons.
    The agent should use these queries with web_search."""
    aspects = [
        f"{topic} definition overview explained",
        f"{topic} latest developments 2026",
        f"{topic} expert analysis criticism",
        f"{topic} statistics data market size",
        f"{topic} vs alternatives comparison",
        f"{topic} case studies real world examples",
        f"{topic} future predictions trends",
    ]
    queries = aspects[:num_queries]
    return "Suggested search queries:\n" + "\n".join(
        f"{i+1}. {q}" for i, q in enumerate(queries)
    )
```
Step 6: Assemble the Research Agent
```python
# agent.py
from agents import Agent
from tools.web_search import web_search
from tools.extract_content import extract_page_content
from tools.report_writer import write_report
from tools.query_expander import expand_research_queries

research_agent = Agent(
    name="Research Agent",
    instructions="""You are an expert research agent. When given a topic, you
conduct thorough research by following this methodology:

1. PLAN: Use expand_research_queries to generate diverse search queries.
2. SEARCH: Execute each query using web_search. Evaluate result quality.
3. DEEP DIVE: For the most promising results, use extract_page_content
   to get full details.
4. VALIDATE: Cross-reference claims across multiple sources. Note
   disagreements or conflicting data.
5. SYNTHESIZE: Organize findings into logical sections.
6. REPORT: Use write_report to generate a formatted Markdown report.

QUALITY STANDARDS:
- Every factual claim must be attributable to a source
- Note confidence levels: high (3+ sources agree), medium (1-2 sources),
  low (single unverified source)
- Include data and statistics when available
- Flag any conflicting information between sources
- Aim for 1000-2000 words in the final report
""",
    tools=[web_search, extract_page_content, write_report, expand_research_queries],
    model="gpt-4o",
)
```
Step 7: Create the Runner Script
```python
# run_research.py
import asyncio
import sys

from dotenv import load_dotenv

load_dotenv()  # load API keys before the tool modules read them at import time

from agents import Runner
from agent import research_agent

async def main():
    topic = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "AI agent frameworks comparison 2026"
    print(f"Researching: {topic}")
    print("=" * 60)
    result = await Runner.run(
        research_agent,
        f"Research the following topic and produce a comprehensive report: {topic}",
    )
    print("\nAgent trace:")
    for item in result.new_items:  # run items carry a descriptive .type
        print(f"  - {item.type}")
    print(f"\nFinal output:\n{result.final_output}")

if __name__ == "__main__":
    asyncio.run(main())
```

Note that `load_dotenv()` must run before `agent` is imported: the Tavily client in `tools/web_search.py` reads `TAVILY_API_KEY` at import time.
Run it:
python run_research.py "impact of agentic AI on enterprise software development"
Extending the Agent
The modular tool architecture makes it easy to add capabilities:
- Academic search — Add a tool that queries the Semantic Scholar or arXiv APIs for peer-reviewed papers
- Data visualization — Add a tool that generates charts using matplotlib and embeds them in the report
- Source credibility scoring — Add a tool that checks domain authority and publication date
- Citation formatting — Add a tool that formats sources in APA, MLA, or Chicago style
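As a starting point for the academic-search extension, a helper can build the query URL for arXiv's public Atom API (a sketch; `arxiv_query_url` is a name introduced here, and fetching and parsing the resulting feed is left to the tool body):

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def arxiv_query_url(topic: str, max_results: int = 5) -> str:
    """Build an arXiv API query URL; the response is an Atom XML feed."""
    params = {
        "search_query": f"all:{topic}",   # search across all fields
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",        # newest papers first
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```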
Performance Optimization
For production use, consider these optimizations:
```python
# Run searches concurrently instead of sequentially
import asyncio

from tools.web_search import tavily  # reuse the client created in Step 2

async def parallel_search(queries: list[str]):
    tasks = [
        asyncio.to_thread(tavily.search, query=q, max_results=3)
        for q in queries
    ]
    return await asyncio.gather(*tasks)
```
Cache search results to avoid redundant API calls:
```python
from functools import lru_cache

from tools.web_search import tavily  # reuse the client created in Step 2

@lru_cache(maxsize=100)
def cached_search(query: str) -> dict:
    return tavily.search(query=query, max_results=5)
```
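One caveat: `lru_cache` never expires entries, so results can go stale over a long-running process. A minimal time-aware alternative (illustrative; `cached_search_ttl` and `TTL_SECONDS` are names introduced here, and `search_fn` stands in for any provider call):

```python
import time

_cache: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 3600  # results for the same query rarely change within an hour

def cached_search_ttl(query: str, search_fn) -> dict:
    """Return a cached result only while it is still fresh."""
    now = time.time()
    hit = _cache.get(query)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]        # fresh cache hit: skip the API call
    result = search_fn(query)
    _cache[query] = (now, result)
    return result
```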
FAQ
How does the agent decide when it has enough information?
The agent uses its built-in reasoning capabilities to evaluate source coverage. The instructions tell it to aim for cross-referenced claims with multiple sources. In practice, it typically performs 5-8 searches before deciding it has sufficient coverage. You can tune this by adjusting the instructions to require a minimum number of sources per claim.
Can I use a different search provider instead of Tavily?
Yes. The search tool is a thin wrapper that can be swapped for any search API. Alternatives include SerpAPI, Brave Search API, or Bing Web Search. Simply replace the Tavily client calls in the web_search tool with your preferred provider's API.
How do I handle rate limits on the search API?
Add exponential backoff to the search tool. Tavily's free tier allows 1000 searches per month. For higher volume, use their paid tier or distribute searches across multiple providers. You can also cache results aggressively since search results for the same query rarely change within a few hours.
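A backoff wrapper along these lines is one way to do it (a sketch; `search_with_backoff` is a name introduced here, and `search_fn` stands in for any provider call such as `tavily.search`):

```python
import random
import time

def search_with_backoff(search_fn, query: str, max_retries: int = 4,
                        base_delay: float = 1.0):
    """Retry a search call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return search_fn(query)
        except Exception:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # delays of base, 2*base, 4*base, ... plus random jitter
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

In production you would catch the provider's specific rate-limit exception rather than bare `Exception`, so genuine bugs still surface immediately.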
What is the typical cost per research report?
A typical report requires 5-8 web searches (approximately $0.005 each on Tavily) and 3-5 page extractions (free, just HTTP requests). The OpenAI API cost for the agent reasoning loop is typically $0.10-0.30 depending on the complexity. Total cost per report is usually under $0.50.