Building a Multi-Agent Research Lab: Scientist, Librarian, Analyst, and Writer Agents
Build a multi-agent research system with four specialized agents — Scientist, Librarian, Analyst, and Writer — that collaborate on source discovery, evidence analysis, and paper generation, with complete Python code.
The Research Lab Concept
Research is inherently a multi-stage process: formulating questions, finding sources, analyzing evidence, and synthesizing findings into a coherent document. A single AI agent attempting all four stages produces shallow results because it cannot specialize — it must juggle search queries, citation tracking, statistical reasoning, and academic writing simultaneously.
A multi-agent research lab assigns each stage to a specialized agent. The Scientist formulates hypotheses and directs research. The Librarian discovers and manages sources. The Analyst evaluates evidence and finds patterns. The Writer synthesizes everything into a structured document. Each agent excels at its narrow responsibility, and the handoffs between them enforce quality gates.
Shared Data Structures
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
import uuid

@dataclass
class Source:
    source_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    title: str = ""
    url: str = ""
    content_summary: str = ""
    relevance_score: float = 0.0
    source_type: str = ""  # "paper", "article", "dataset", "book"
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ResearchQuestion:
    question: str
    sub_questions: List[str] = field(default_factory=list)
    hypothesis: Optional[str] = None
    priority: int = 1

@dataclass
class AnalysisFinding:
    claim: str
    supporting_sources: List[str]  # Source IDs
    confidence: float = 0.0  # 0.0 to 1.0
    evidence_summary: str = ""
    contradicting_sources: List[str] = field(default_factory=list)

@dataclass
class ResearchProject:
    topic: str
    questions: List[ResearchQuestion] = field(default_factory=list)
    sources: List[Source] = field(default_factory=list)
    findings: List[AnalysisFinding] = field(default_factory=list)
    draft: str = ""
    status: str = "initialized"
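One detail worth noting: `source_id` uses `default_factory` rather than a plain default, so each instance gets its own fresh UUID. A quick self-contained check (the class is trimmed to two fields for brevity):

```python
from dataclasses import dataclass, field
import uuid

@dataclass
class Source:  # trimmed to the two fields relevant here
    source_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    title: str = ""

a, b = Source(title="A"), Source(title="B")
assert a.source_id != b.source_id  # the factory runs once per instance
```

A plain default like `source_id: str = str(uuid.uuid4())` would be evaluated once at class-definition time, giving every Source the same ID.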
The Scientist Agent
The Scientist drives the research process. It formulates research questions, evaluates whether enough evidence has been gathered, and decides when the research is complete.
from openai import AsyncOpenAI
import json

client = AsyncOpenAI()

async def scientist_agent(
    topic: str, existing_findings: Optional[List[AnalysisFinding]] = None
) -> List[ResearchQuestion]:
    context = f"Research topic: {topic}\n"
    if existing_findings:
        context += "\nExisting findings:\n"
        for f in existing_findings:
            context += f"- {f.claim} (confidence: {f.confidence})\n"
        context += "\nIdentify gaps and generate follow-up questions.\n"
    else:
        context += "Generate initial research questions and hypotheses.\n"
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research scientist. Generate structured research "
                    "questions with sub-questions and hypotheses. Return JSON: "
                    "questions (list of objects with question, sub_questions, "
                    "hypothesis, priority)."
                ),
            },
            {"role": "user", "content": context},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [ResearchQuestion(**q) for q in data["questions"]]
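Note that `ResearchQuestion(**q)` raises `TypeError` if the model returns any key the dataclass does not declare. A defensive option is to whitelist fields before unpacking; `coerce_question` below is a hypothetical helper, not part of the agent above:

```python
def coerce_question(raw: dict) -> dict:
    """Keep only the fields ResearchQuestion declares, with safe defaults,
    so unexpected keys in the model's JSON don't raise TypeError."""
    return {
        "question": raw.get("question", ""),
        "sub_questions": raw.get("sub_questions") or [],
        "hypothesis": raw.get("hypothesis"),
        "priority": int(raw.get("priority", 1)),
    }

# A response with an extra "rationale" key now parses cleanly:
q = coerce_question({"question": "Does X cause Y?", "rationale": "..."})
```

The result can then be unpacked safely with `ResearchQuestion(**coerce_question(q))`.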
The Librarian Agent
The Librarian handles source discovery and management. It searches for relevant materials, deduplicates sources, and maintains a citation index.
async def librarian_agent(
    questions: List[ResearchQuestion],
    existing_sources: List[Source],
) -> List[Source]:
    existing_titles = {s.title for s in existing_sources}
    search_prompt = "Find relevant sources for these research questions:\n"
    for q in questions:
        search_prompt += f"- {q.question}\n"
        for sq in q.sub_questions:
            search_prompt += f"  - {sq}\n"
    if existing_sources:
        search_prompt += (
            f"\nAlready have {len(existing_sources)} sources. "
            "Find complementary sources that fill gaps."
        )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research librarian. For each research question, "
                    "suggest relevant academic papers, articles, and datasets. "
                    "Return JSON: sources (list of objects with title, url, "
                    "content_summary, relevance_score, source_type)."
                ),
            },
            {"role": "user", "content": search_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    new_sources = []
    for s in data["sources"]:
        if s["title"] not in existing_titles:
            new_sources.append(Source(**s))
    return new_sources
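Exact-title deduplication misses near-duplicates that differ only in casing or punctuation. A stricter comparison is sketched below; `normalize_title` is an assumed helper, not part of the agent above:

```python
import re

def normalize_title(title: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that
    near-identical titles compare equal."""
    cleaned = re.sub(r"[^\w\s]", "", title.lower())
    return re.sub(r"\s+", " ", cleaned).strip()

seen = {normalize_title("Attention Is All You Need!")}
assert normalize_title("attention is  all you need") in seen
```

In the librarian, the check `s["title"] not in existing_titles` would become a comparison of `normalize_title(s["title"])` against a set of normalized existing titles.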
The Analyst Agent
The Analyst evaluates evidence across sources, identifies patterns, and produces structured findings with confidence scores.
async def analyst_agent(
    questions: List[ResearchQuestion],
    sources: List[Source],
) -> List[AnalysisFinding]:
    analysis_prompt = "Analyze these sources against the research questions.\n"
    analysis_prompt += "\nQUESTIONS:\n"
    for q in questions:
        analysis_prompt += f"- {q.question} (hypothesis: {q.hypothesis})\n"
    analysis_prompt += "\nSOURCES:\n"
    for s in sources:
        analysis_prompt += (
            f"- [{s.source_id[:8]}] {s.title}: {s.content_summary}\n"
        )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research analyst. Cross-reference sources to "
                    "produce evidence-based findings. For each finding, cite "
                    "supporting source IDs and note any contradictions. Return "
                    "JSON: findings (list of objects with claim, "
                    "supporting_sources, confidence, evidence_summary, "
                    "contradicting_sources)."
                ),
            },
            {"role": "user", "content": analysis_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [AnalysisFinding(**f) for f in data["findings"]]
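Model-reported confidence scores occasionally fall outside [0, 1], and cited source IDs are not guaranteed to exist. One hedged post-processing step, applied to the raw JSON dicts before constructing `AnalysisFinding` (the `sanitize_finding` helper is illustrative, not part of the agent above):

```python
def sanitize_finding(finding: dict, known_ids: set) -> "dict | None":
    """Clamp confidence into [0, 1] and keep only citations matching a
    known source ID prefix; drop the finding if no real citation remains."""
    cited = [
        sid for sid in finding.get("supporting_sources", [])
        # The analyst prompt shows 8-char prefixes, so match by prefix.
        if any(k.startswith(sid) for k in known_ids)
    ]
    if not cited:
        return None
    cleaned = dict(finding)
    cleaned["supporting_sources"] = cited
    cleaned["confidence"] = min(1.0, max(0.0, float(finding.get("confidence", 0.0))))
    return cleaned
```

Because the prompt truncates IDs to `s.source_id[:8]`, the helper matches cited IDs as prefixes of the full UUIDs rather than requiring exact equality.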
The Writer Agent
The Writer synthesizes findings into a structured research document with proper citations.
async def writer_agent(
    project: ResearchProject,
) -> str:
    write_prompt = f"Topic: {project.topic}\n\n"
    write_prompt += "FINDINGS:\n"
    for f in project.findings:
        write_prompt += (
            f"- {f.claim} (confidence: {f.confidence})\n"
            f"  Evidence: {f.evidence_summary}\n"
        )
    write_prompt += "\nSOURCES:\n"
    for s in project.sources:
        write_prompt += f"- [{s.source_id[:8]}] {s.title} ({s.url})\n"
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an academic writer. Synthesize the findings into "
                    "a structured research document with sections: Abstract, "
                    "Introduction, Methodology, Findings, Discussion, "
                    "Conclusion, References. Use inline citations [source_id]. "
                    "Write in a clear, evidence-based academic style."
                ),
            },
            {"role": "user", "content": write_prompt},
        ],
    )
    return response.choices[0].message.content
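Readers usually expect numbered citations rather than raw UUID prefixes. A small post-processing pass can renumber them; `number_citations` is an assumed helper, not part of the writer above:

```python
def number_citations(draft: str, source_ids: list) -> str:
    """Replace [abcd1234]-style inline citations (8-char UUID prefixes,
    matching the writer prompt) with [1], [2], ... in source order."""
    for n, sid in enumerate(source_ids, start=1):
        draft = draft.replace(f"[{sid[:8]}]", f"[{n}]")
    return draft

text = number_citations("As shown in [deadbeef].", ["deadbeef-0000-4000"])
# text == "As shown in [1]."
```

Running this over `project.draft` with `[s.source_id for s in project.sources]` keeps the reference numbering consistent with the SOURCES list in the prompt.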
The Research Orchestrator
The orchestrator runs the full research loop, allowing the Scientist to request additional rounds of source gathering and analysis.
async def run_research_lab(
    topic: str, max_rounds: int = 3
) -> ResearchProject:
    project = ResearchProject(topic=topic)
    for round_num in range(1, max_rounds + 1):
        print(f"\n--- Research Round {round_num} ---")
        # Scientist formulates questions
        questions = await scientist_agent(topic, project.findings or None)
        project.questions.extend(questions)
        # Librarian finds sources
        new_sources = await librarian_agent(questions, project.sources)
        project.sources.extend(new_sources)
        print(f"Found {len(new_sources)} new sources")
        # Analyst evaluates evidence
        findings = await analyst_agent(questions, project.sources)
        project.findings.extend(findings)
        # Check if we have sufficient high-confidence findings
        high_confidence = [
            f for f in project.findings if f.confidence >= 0.7
        ]
        if len(high_confidence) >= 5:
            print("Sufficient evidence gathered")
            break
    # Writer produces the final document
    project.draft = await writer_agent(project)
    project.status = "completed"
    return project
FAQ
How do I integrate real source retrieval instead of LLM-generated sources?
Replace the Librarian agent's LLM call with actual API calls to Google Scholar (via SerpAPI), Semantic Scholar, arXiv, or PubMed. Feed the retrieved abstracts and metadata into the Source dataclass. The Analyst then works with real evidence instead of synthesized summaries. You can also combine both: use the LLM to generate search queries, execute them against real APIs, then let the LLM rank and summarize the results.
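As a concrete starting point, the arXiv API accepts plain HTTP queries. The sketch below only builds the query URL; executing it and parsing the Atom response are left out, and the parameter names follow the public arXiv API:

```python
from urllib.parse import urlencode

def arxiv_query_url(question: str, max_results: int = 10) -> str:
    """Build an arXiv API query URL for a research question."""
    params = {
        "search_query": f"all:{question}",
        "start": 0,
        "max_results": max_results,
    }
    return "http://export.arxiv.org/api/query?" + urlencode(params)

url = arxiv_query_url("sleep and memory consolidation")
```

In the hybrid setup described above, the LLM would generate the `question` strings, this function would turn them into API calls, and the LLM would then rank and summarize the returned abstracts.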
How does the Scientist decide when research is "done"?
The Scientist evaluates two criteria: coverage (do the findings address all research questions?) and confidence (are the confidence scores above the threshold?). In the orchestrator above, we stop when we have at least 5 high-confidence findings. In production, you would also check that each research question has at least one finding addressing it.
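Both criteria can be expressed as one small pure function; findings are simplified here to (question_text, confidence) pairs for illustration:

```python
def research_complete(questions, findings, threshold=0.7, min_confident=5):
    """True when every question has at least one finding addressing it
    (coverage) and enough findings clear the confidence bar."""
    answered = {q for q, _ in findings}
    coverage = all(q in answered for q in questions)
    confident = sum(1 for _, c in findings if c >= threshold)
    return coverage and confident >= min_confident

fs = [("q1", 0.9), ("q2", 0.8), ("q1", 0.75), ("q2", 0.71), ("q1", 0.7)]
assert research_complete(["q1", "q2"], fs)        # covered and confident
assert not research_complete(["q1", "q2", "q3"], fs)  # q3 unaddressed
```

Swapping this in for the simple count in the orchestrator's loop would enforce the coverage criterion as well.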
Can I add a Peer Reviewer agent to improve quality?
Absolutely — add a Peer Reviewer between the Analyst and Writer stages. The Peer Reviewer checks findings for logical consistency, flags unsupported claims, and verifies that citations actually support the claims made. If the review fails, loop back to the Scientist with the reviewer's feedback to trigger another research round targeting the weaknesses identified.
#ResearchAgents #MultiAgentLab #KnowledgeManagement #AIPaperGeneration #ResearchAutomation #AgenticAI #PythonAI #AIResearch