# Building a Research Agent with the Claude API
Build an autonomous research agent that searches the web, reads documents, synthesizes findings, and produces structured reports. Covers architecture, tool integration, source verification, and iterative deepening strategies.
## What a Research Agent Does
A research agent autonomously investigates a topic by searching for information, reading sources, evaluating credibility, and synthesizing findings into a coherent report. Unlike a simple search-and-summarize pipeline, a research agent iterates: it reads initial sources, identifies gaps or follow-up questions, searches again, and progressively deepens its understanding.
This is one of the most practical and immediately valuable applications of the Claude API. Analysts, journalists, product managers, and investors spend hours manually doing what a well-built research agent can accomplish in minutes.
## Architecture

```
User Query
     |
     v
[Query Planner]     -- decompose into sub-questions
     |
     v
[Search Agent]      -- find relevant sources (loop)
     |
     v
[Reader Agent]      -- extract key information from each source
     |
     v
[Evaluator Agent]   -- assess source credibility and consistency
     |
     v
[Synthesizer Agent] -- produce final report with citations
```
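Each stage passes structured data to the next. Before diving into the implementation (which uses plain dicts for simplicity), it can help to sketch those contracts as dataclasses; the names here are illustrative, not part of any SDK:

```python
from dataclasses import dataclass, field


@dataclass
class SubQuestion:
    question: str
    search_queries: list[str]
    priority: int  # 1 = highest


@dataclass
class SourceFindings:
    url: str
    title: str
    relevant_facts: list[str] = field(default_factory=list)
    key_quotes: list[str] = field(default_factory=list)
    credibility_score: int = 0  # 1-5; 0 = source could not be read
    gaps: list[str] = field(default_factory=list)


@dataclass
class ResearchState:
    main_topic: str
    sub_questions: list[SubQuestion]
    findings: list[SourceFindings] = field(default_factory=list)


state = ResearchState(
    "LLM deployment",
    [SubQuestion("How are LLM apps monitored?", ["llm monitoring"], priority=1)],
)
print(state.sub_questions[0].priority)  # 1
```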
## Step 1: Query Planning

The first step transforms a broad query into specific, searchable sub-questions:

```python
import json

from anthropic import Anthropic

client = Anthropic()

PLANNER_PROMPT = """You are a research planning agent. Given a research query:
1. Identify the key aspects that need investigation
2. Generate 3-5 specific sub-questions that together would provide a comprehensive answer
3. For each sub-question, suggest search queries that would find relevant information
4. Prioritize sub-questions by importance

Return JSON with this structure:
{
  "main_topic": "...",
  "sub_questions": [
    {
      "question": "...",
      "search_queries": ["...", "..."],
      "priority": 1
    }
  ]
}"""


def parse_json(text: str) -> dict:
    """Parse a JSON object from a model response, tolerating code fences."""
    text = text.strip()
    if text.startswith("```"):
        text = text.split("```")[1].removeprefix("json")
    return json.loads(text)


def plan_research(query: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-5-20250514",
        max_tokens=2048,
        system=PLANNER_PROMPT,
        messages=[{"role": "user", "content": query}],
    )
    return parse_json(response.content[0].text)
```
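Downstream stages consume the plan in priority order. Using a hypothetical plan dict (shaped like the planner's JSON contract, with made-up contents for illustration):

```python
# A hypothetical plan, shaped like the planner's JSON contract
plan = {
    "main_topic": "LLM deployment best practices",
    "sub_questions": [
        {"question": "How are LLM apps monitored?",
         "search_queries": ["llm observability tools"], "priority": 2},
        {"question": "How is LLM output evaluated?",
         "search_queries": ["llm eval harness", "llm regression testing"], "priority": 1},
    ],
}

# Work through sub-questions in priority order (1 = most important)
ordered = sorted(plan["sub_questions"], key=lambda q: q["priority"])
queries = [sq for q in ordered for sq in q["search_queries"]]
print(queries[0])  # "llm eval harness"
```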
## Step 2: Web Search Integration

Connect the agent to a search API. Here we use a generic search function that you would implement with your preferred search provider (Brave, Google, Bing, or Tavily):

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse

import httpx

# API key for whichever search provider you choose
SEARCH_API_KEY = os.environ["SEARCH_API_KEY"]


@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str
    source: str


def extract_domain(url: str) -> str:
    """Return the domain of a URL, e.g. 'example.com'."""
    return urlparse(url).netloc.removeprefix("www.")


async def search_web(query: str, num_results: int = 5) -> list[SearchResult]:
    """Search the web using your preferred search API."""
    # Example with a generic search API -- adapt the endpoint and
    # response shape to your provider
    async with httpx.AsyncClient() as http:
        response = await http.get(
            "https://api.search-provider.com/search",
            params={"q": query, "count": num_results},
            headers={"Authorization": f"Bearer {SEARCH_API_KEY}"},
        )
        response.raise_for_status()
        data = response.json()
    return [
        SearchResult(
            title=r["title"],
            url=r["url"],
            snippet=r["snippet"],
            source=extract_domain(r["url"]),
        )
        for r in data["results"]
    ]
```
## Step 3: Source Reader

For each search result, fetch the page content and extract the relevant information:

```python
import httpx
from bs4 import BeautifulSoup


async def fetch_and_extract(url: str) -> str:
    """Fetch a URL and extract clean text content."""
    try:
        async with httpx.AsyncClient(follow_redirects=True, timeout=10.0) as http:
            response = await http.get(url)
            response.raise_for_status()
    except (httpx.HTTPError, httpx.TimeoutException):
        return ""
    soup = BeautifulSoup(response.text, "html.parser")
    # Remove scripts, styles, nav, footer
    for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
        tag.decompose()
    text = soup.get_text(separator="\n", strip=True)
    # Truncate to avoid exceeding context limits
    max_chars = 10_000
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[Content truncated]"
    return text
```
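The stripping logic can be sanity-checked on an inline HTML snippet, with no network access required:

```python
from bs4 import BeautifulSoup

html = """
<html>
  <head><style>p { color: red; }</style></head>
  <body>
    <nav>Home | About</nav>
    <p>Key finding: deployment latency fell 40% after caching.</p>
    <footer>Copyright 2025</footer>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")
for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
    tag.decompose()
text = soup.get_text(separator="\n", strip=True)
print(text)  # only the paragraph content survives
```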
```python
READER_PROMPT = """You are a research reader agent. Given a source document and a
specific question, extract all relevant information that helps answer the question.

Rules:
- Only extract information that is directly relevant
- Note specific facts, statistics, dates, and quotes
- Identify the author and publication if available
- Rate the source credibility (1-5): 1=unverified blog, 5=peer-reviewed/official
- Flag any claims that seem unsupported or contradictory

Return JSON:
{
  "relevant_facts": ["...", "..."],
  "key_quotes": ["...", "..."],
  "credibility_score": 4,
  "credibility_notes": "...",
  "gaps": ["Questions this source does not answer"]
}"""


async def read_source(url: str, question: str) -> dict:
    content = await fetch_and_extract(url)
    if not content:
        return {"relevant_facts": [], "credibility_score": 0}
    response = client.messages.create(
        model="claude-haiku-4-5-20250514",  # Haiku is sufficient for extraction
        max_tokens=1024,
        system=READER_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Question: {question}\n\nSource URL: {url}\n\nContent:\n{content}",
        }],
    )
    return parse_json(response.content[0].text)
```
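Reading sources one at a time is slow when a search round returns many URLs. A semaphore-bounded `asyncio.gather` keeps things fast without hammering sites or your API rate limits; `read_source_stub` below is a stand-in for the real `read_source` so the pattern runs without network access:

```python
import asyncio


async def read_source_stub(url: str, question: str) -> dict:
    # Stand-in for read_source so the pattern runs offline
    await asyncio.sleep(0.01)
    return {"url": url, "relevant_facts": [f"fact from {url}"]}


async def read_all(urls: list[str], question: str, max_concurrent: int = 4) -> list[dict]:
    """Read many sources concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(url: str) -> dict:
        async with sem:
            return await read_source_stub(url, question)

    # gather preserves input order, so results line up with urls
    return await asyncio.gather(*(bounded(u) for u in urls))


results = asyncio.run(read_all([f"https://example.com/{i}" for i in range(8)], "q"))
print(len(results))  # 8
```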
## Step 4: Iterative Deepening

The key differentiator of a research agent versus a simple search pipeline is iteration. After the first round of research, the agent identifies gaps and searches again:

```python
async def research_loop(
    query: str,
    max_iterations: int = 3,
    min_sources: int = 5,
) -> dict:
    """Iterative research loop that deepens understanding."""
    plan = plan_research(query)
    all_findings: list[dict] = []
    searched_urls: set[str] = set()

    async def investigate(url: str, title: str, question: str) -> None:
        findings = await read_source(url, question)
        findings.update({"url": url, "title": title, "question": question})
        all_findings.append(findings)

    # Iteration 1: search every planned sub-question
    for sub_q in plan["sub_questions"]:
        for search_query in sub_q["search_queries"]:
            for result in await search_web(search_query):
                if result.url in searched_urls:
                    continue
                searched_urls.add(result.url)
                await investigate(result.url, result.title, sub_q["question"])

    # Iterations 2..N: check for gaps and do follow-up searches
    iteration = 1
    while iteration < max_iterations:
        gaps = identify_gaps(all_findings, plan)
        if not gaps and len(searched_urls) >= min_sources:
            break  # nothing left to chase and enough sources consulted
        for gap in gaps[:3]:  # limit follow-up searches per iteration
            for result in await search_web(gap):
                if result.url in searched_urls:
                    continue
                searched_urls.add(result.url)
                await investigate(result.url, result.title, gap)
        iteration += 1

    return {
        "plan": plan,
        "findings": all_findings,
        "sources_consulted": len(searched_urls),
        "iterations": iteration,
    }


def identify_gaps(findings: list[dict], plan: dict) -> list[str]:
    """Identify unanswered questions from the research so far."""
    all_gaps = []
    for finding in findings:
        all_gaps.extend(finding.get("gaps", []))
    return list(dict.fromkeys(all_gaps))[:5]  # deduplicate (order-preserving) and limit
```
## Step 5: Report Synthesis

The final step synthesizes all findings into a coherent, cited report:

```python
import json

SYNTHESIZER_PROMPT = """You are a research synthesis agent. Given a collection of
findings from multiple sources, produce a comprehensive research report.

Report requirements:
1. Start with an executive summary (2-3 sentences)
2. Organize findings by theme, not by source
3. Cite sources using [Source N] notation
4. Highlight areas of consensus and disagreement between sources
5. Note limitations and areas where more research is needed
6. Include a source bibliography at the end

Quality standards:
- Every factual claim must have a citation
- Clearly distinguish between well-established facts and uncertain claims
- Present multiple perspectives when sources disagree
- Use precise language and avoid hedging unless genuinely uncertain"""


async def synthesize_report(research_data: dict) -> str:
    findings_text = ""
    for i, finding in enumerate(research_data["findings"]):
        findings_text += f"""
Source [{i + 1}]: {finding.get('title', 'Unknown')}
URL: {finding['url']}
Credibility: {finding.get('credibility_score', 'N/A')}/5
Question investigated: {finding['question']}
Key facts: {json.dumps(finding.get('relevant_facts', []))}
Key quotes: {json.dumps(finding.get('key_quotes', []))}
---"""
    response = client.messages.create(
        model="claude-sonnet-4-5-20250514",
        max_tokens=8192,
        system=SYNTHESIZER_PROMPT,
        messages=[{
            "role": "user",
            "content": f"""Research topic: {research_data['plan']['main_topic']}
Sources consulted: {research_data['sources_consulted']}

Findings:
{findings_text}

Produce a comprehensive research report.""",
        }],
    )
    return response.content[0].text
```
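Because the synthesizer is instructed to cite with [Source N] notation, it is worth checking mechanically that every citation in the report maps to a consulted source. A minimal post-processing sketch (the report string here is a made-up stand-in):

```python
import re


def invalid_citations(report: str, num_sources: int) -> list[int]:
    """Return citation numbers that do not map to a consulted source."""
    cited = {int(n) for n in re.findall(r"\[Source (\d+)\]", report)}
    return sorted(n for n in cited if n < 1 or n > num_sources)


report = "Latency fell 40% [Source 2], though one vendor disagrees [Source 7]."
print(invalid_citations(report, num_sources=5))  # [7]
```

Flagged citations can be stripped or sent back to the synthesizer for correction.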
## Complete Pipeline

```python
import asyncio


async def run_research(query: str) -> str:
    """Run the complete research pipeline."""
    print(f"Researching: {query}")

    # Phase 1: plan and gather
    print("Planning research...")
    research_data = await research_loop(query, max_iterations=3, min_sources=5)
    print(f"Consulted {research_data['sources_consulted']} sources")

    # Phase 2: synthesize
    print("Synthesizing report...")
    report = await synthesize_report(research_data)
    return report


# Usage
report = asyncio.run(run_research(
    "What are the current best practices for deploying LLM applications in production?"
))
print(report)
```
## Cost Breakdown

For a typical research task consulting 10 sources:

| Component | Model | Calls | Avg Tokens | Cost |
|---|---|---|---|---|
| Query planner | Sonnet | 1 | 1,500 | $0.03 |
| Source readers | Haiku | 10 | 3,000 each | $0.04 |
| Gap analysis | Sonnet | 1 | 2,000 | $0.04 |
| Report synthesis | Sonnet | 1 | 8,000 | $0.15 |
| **Total** | | 13 | ~41,500 | $0.26 |
A comprehensive research report for under $0.30 -- compared to 2-4 hours of manual research at analyst rates.
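To track actual spend rather than estimates, accumulate the token counts that each API response reports (the Anthropic SDK exposes them on `response.usage`). A sketch with illustrative per-million-token rates -- these are assumptions, so check current pricing before relying on them:

```python
# Illustrative per-million-token rates (assumptions -- check current pricing)
PRICES = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 1.00, "output": 5.00},
}


class CostTracker:
    """Accumulate token usage per model tier and estimate total spend."""

    def __init__(self) -> None:
        self.usage: dict[str, dict[str, int]] = {}

    def record(self, tier: str, input_tokens: int, output_tokens: int) -> None:
        # In practice these numbers come from response.usage on each call
        entry = self.usage.setdefault(tier, {"input": 0, "output": 0})
        entry["input"] += input_tokens
        entry["output"] += output_tokens

    def total_cost(self) -> float:
        return sum(
            tokens["input"] / 1e6 * PRICES[tier]["input"]
            + tokens["output"] / 1e6 * PRICES[tier]["output"]
            for tier, tokens in self.usage.items()
        )


tracker = CostTracker()
tracker.record("sonnet", 10_000, 2_000)  # e.g. planner + synthesis calls
tracker.record("haiku", 30_000, 5_000)   # e.g. ten reader calls
print(f"${tracker.total_cost():.4f}")
```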
## Improving Quality
- Source diversity: Ensure you are not just reading results from the same domain. Explicitly search for opposing viewpoints
- Fact verification: Cross-reference key claims across multiple sources before including them in the report
- Recency bias: Weight recent sources higher for rapidly evolving topics (technology, policy) but not for established knowledge
- Hallucination prevention: The reader agent extracts facts from actual sources; the synthesizer cites those extracted facts. This chain-of-evidence approach significantly reduces fabrication compared to asking Claude to research from its training data alone
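The fact-verification idea above can be sketched as a cross-referencing pass over the reader findings: keep only facts supported by multiple distinct sources. Exact-match grouping is a naive stand-in -- a production system would cluster near-duplicate claims with embeddings or an LLM judge:

```python
from collections import defaultdict


def corroborated_facts(findings: list[dict], min_sources: int = 2) -> dict[str, int]:
    """Map each fact to its distinct-source count, keeping facts with >= min_sources.

    Groups by lowercased exact text -- a naive stand-in for semantic matching.
    """
    support: dict[str, set[str]] = defaultdict(set)
    for finding in findings:
        for fact in finding.get("relevant_facts", []):
            support[fact.strip().lower()].add(finding["url"])
    return {f: len(urls) for f, urls in support.items() if len(urls) >= min_sources}


findings = [
    {"url": "https://a.example", "relevant_facts": ["Caching cuts latency"]},
    {"url": "https://b.example", "relevant_facts": ["caching cuts latency", "GPUs are scarce"]},
    {"url": "https://c.example", "relevant_facts": ["GPUs are scarce"]},
]
print(corroborated_facts(findings))  # both facts have two supporting sources
```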