Capstone: Building an AI-Powered Sales Development Representative (SDR)
Build an end-to-end AI sales development representative that ingests leads, generates personalized outreach, manages follow-up sequences, and syncs activity to your CRM using agent orchestration.
What an AI SDR Does
A sales development representative qualifies leads, writes personalized outreach emails, follows up persistently, and books meetings. An AI SDR automates this entire workflow while maintaining the personalization that makes outreach effective. This capstone builds a system that ingests leads from multiple sources, researches each prospect, generates personalized multi-step email sequences, manages follow-up timing, and syncs all activity to a CRM.
The architecture has four components: a lead ingestion service that normalizes leads from CSV uploads, webhooks, and CRM imports; a research agent that enriches leads with company and contact data; a copywriting agent that generates personalized email sequences; and a campaign engine that sends emails on schedule and handles replies.
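The article does not show the ingestion service itself, so here is a minimal sketch of the normalization step, assuming leads arrive as plain dictionaries (CSV rows, webhook payloads, CRM exports). The `FIELD_ALIASES` mapping, the `NormalizedLead` dataclass, and both function names are hypothetical and only illustrate the idea of mapping source-specific field names onto one schema.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

# Hypothetical aliases: each source may name the same field differently.
FIELD_ALIASES = {
    "email": ("email", "Email", "email_address", "work_email"),
    "name": ("name", "Name", "full_name"),
    "company": ("company", "Company", "company_name", "organization"),
    "title": ("title", "Title", "job_title"),
    "linkedin_url": ("linkedin_url", "linkedin", "LinkedIn"),
}


@dataclass
class NormalizedLead:
    email: str
    source: str  # "csv", "webhook", or "crm"
    name: Optional[str] = None
    company: Optional[str] = None
    title: Optional[str] = None
    linkedin_url: Optional[str] = None


def normalize_lead(raw: dict, source: str) -> Optional[NormalizedLead]:
    """Map a raw record onto the common schema; return None without a usable email."""
    fields = {}
    for target, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            value = str(raw.get(alias) or "").strip()
            if value:
                fields[target] = value
                break
    email = fields.pop("email", "").lower()
    if "@" not in email:
        return None
    return NormalizedLead(email=email, source=source, **fields)


def leads_from_csv(text: str) -> list[NormalizedLead]:
    """Parse CSV text, dropping invalid rows and duplicate emails."""
    seen, leads = set(), []
    for row in csv.DictReader(io.StringIO(text)):
        lead = normalize_lead(row, source="csv")
        if lead and lead.email not in seen:
            seen.add(lead.email)
            leads.append(lead)
    return leads
```

Lower-casing the email before deduplication matters here because the `leads` table declares `email` as unique; a webhook handler could call `normalize_lead(payload, "webhook")` with the same mapping.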
Data Model
# models.py
import enum
import uuid

from sqlalchemy import Column, DateTime, Enum, ForeignKey, Integer, String, Text, func
from sqlalchemy.dialects.postgresql import JSONB, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()  # shared base class for all ORM models


class LeadStatus(str, enum.Enum):
    NEW = "new"
    RESEARCHED = "researched"
    SEQUENCED = "sequenced"
    REPLIED = "replied"
    BOOKED = "booked"
    DISQUALIFIED = "disqualified"


class Lead(Base):
    __tablename__ = "leads"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    email = Column(String(255), unique=True, nullable=False)
    name = Column(String(200))
    company = Column(String(200))
    title = Column(String(200))
    linkedin_url = Column(String(500))
    status = Column(Enum(LeadStatus), default=LeadStatus.NEW)
    # Enrichment results; a callable default avoids sharing one dict across rows.
    research_data = Column(JSONB, default=dict)
    source = Column(String(100))  # "csv", "webhook", "crm"
    # func.now() renders as SQL now(); a plain string would be stored as a literal.
    created_at = Column(DateTime, server_default=func.now())


class EmailSequence(Base):
    __tablename__ = "email_sequences"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    lead_id = Column(UUID(as_uuid=True), ForeignKey("leads.id"))
    step_number = Column(Integer)
    subject = Column(String(500))
    body = Column(Text)
    send_at = Column(DateTime)  # when the step is scheduled to go out
    sent = Column(DateTime, nullable=True)  # timestamps stay NULL until the event occurs
    opened = Column(DateTime, nullable=True)
    replied = Column(DateTime, nullable=True)


class CRMActivity(Base):
    __tablename__ = "crm_activities"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    lead_id = Column(UUID(as_uuid=True), ForeignKey("leads.id"))
    activity_type = Column(String(50))  # "email_sent", "reply_received", "meeting_booked"
    details = Column(JSONB)
    synced_to_crm = Column(DateTime, nullable=True)  # NULL means not yet synced
    created_at = Column(DateTime, server_default=func.now())
Lead Research Agent
The research agent enriches a lead with publicly available information about their company and role. It uses web search and LinkedIn scraping tools to gather context.
# agents/research_agent.py
import json

from agents import Agent, function_tool

from models import Lead, LeadStatus

# Assumed to be defined elsewhere in the project: `db` (a SQLAlchemy session),
# `web_search`, `summarize_results`, and `linkedin_scraper`.


@function_tool
def search_company_info(company_name: str) -> str:
    """Search for company information including size, industry, and recent news."""
    results = web_search(f"{company_name} company overview funding news")
    return summarize_results(results[:3])  # summarize only the top three hits


@function_tool
def get_linkedin_summary(linkedin_url: str) -> str:
    """Retrieve public LinkedIn profile summary."""
    profile = linkedin_scraper.get_profile(linkedin_url)
    summary = (profile.summary or "")[:300]  # the summary may be missing
    return f"Title: {profile.title}, About: {summary}"


@function_tool
def save_research(lead_id: str, research_json: str) -> str:
    """Save research data to the lead record."""
    lead = db.get(Lead, lead_id)
    if lead is None:
        return f"No lead found with id {lead_id}."
    try:
        lead.research_data = json.loads(research_json)
    except json.JSONDecodeError as exc:
        # Return the error so the agent can correct its output and retry.
        return f"Research was not valid JSON: {exc}"
    lead.status = LeadStatus.RESEARCHED
    db.commit()
    return "Research saved."


research_agent = Agent(
    name="Lead Research Agent",
    instructions="""Research the given lead. Find their company size, industry,
    recent funding or news, and the contact's role and responsibilities.
    Save a structured JSON summary with keys: company_size, industry,
    recent_news, role_summary, pain_points.""",
    tools=[search_company_info, get_linkedin_summary, save_research],
)
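Because the agent's instructions name five JSON keys, it can help to check the payload before it is written to `research_data`. The following is a minimal sketch, assuming the keys listed in the instructions are all required; the `validate_research` function and `REQUIRED_KEYS` constant are hypothetical helpers, not part of any SDK.

```python
import json

# Keys taken from the research agent's instructions.
REQUIRED_KEYS = ("company_size", "industry", "recent_news", "role_summary", "pain_points")


def validate_research(research_json: str) -> dict:
    """Parse the agent's JSON; raise ValueError if it is malformed or incomplete."""
    try:
        data = json.loads(research_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"research is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("research must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"research is missing keys: {', '.join(missing)}")
    return data
```

A tool such as `save_research` could call this first and return the error message to the agent, giving it a chance to fill in the missing fields.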
Email Copywriting Agent
The copywriting agent uses the research data to generate a personalized multi-step email sequence.
# agents/copywriter_agent.py
import json

from agents import Agent, function_tool

from models import EmailSequence, Lead, LeadStatus

# Assumed to be defined elsewhere in the project: `db` (a SQLAlchemy session)
# and `calculate_send_time(step_index)`, which returns the send time for a step.


@function_tool
def save_email_sequence(lead_id: str, emails_json: str) -> str:
    """Save a multi-step email sequence for the lead."""
    lead = db.get(Lead, lead_id)
    if lead is None:
        return f"No lead found with id {lead_id}."
    emails = json.loads(emails_json)
    for i, email in enumerate(emails):
        seq = EmailSequence(
            lead_id=lead_id,
            step_number=i + 1,  # steps are numbered from 1
            subject=email["subject"],
            body=email["body"],
            send_at=calculate_send_time(i),
        )
        db.add(seq)
    lead.status = LeadStatus.SEQUENCED
    db.commit()
    return f"Saved a {len(emails)}-step email sequence."


copywriter_agent = Agent(
    name="Email Copywriter",
    instructions="""Write a 3-step email sequence for the lead.
    Use their research data for personalization.
    Step 1: Introduction with a relevant pain point hook (send immediately).
    Step 2: Value proposition with a case study reference (send after 3 days).
    Step 3: Soft breakup email with a clear CTA (send after 5 days).
    Keep each email under 150 words. Use a conversational tone.
    Save the sequence with save_email_sequence as a JSON array of objects
    with keys: subject, body.""",
    tools=[save_email_sequence],
)
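The `save_email_sequence` tool calls `calculate_send_time`, which the article does not define. Here is one possible sketch, assuming the delays in the copywriter's instructions (immediately, 3 days, 5 days) are all measured from the moment the sequence is created. That interpretation, the `STEP_OFFSETS_DAYS` constant, and the optional `start` parameter are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed day offsets from sequence creation for steps 1, 2, and 3.
STEP_OFFSETS_DAYS = (0, 3, 5)


def calculate_send_time(step_index: int, start: Optional[datetime] = None) -> datetime:
    """Return the naive-UTC send time for a zero-based step index."""
    if not 0 <= step_index < len(STEP_OFFSETS_DAYS):
        raise ValueError(f"no schedule defined for step index {step_index}")
    if start is None:
        # Naive UTC, to match the naive DateTime columns used by the models.
        start = datetime.now(timezone.utc).replace(tzinfo=None)
    return start + timedelta(days=STEP_OFFSETS_DAYS[step_index])
```

If the 5 days for step 3 should instead count from step 2, changing the offsets to `(0, 3, 8)` expresses that without touching the function.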
Campaign Engine
The campaign engine runs as a background task that sends emails on schedule and processes inbound replies.
# services/campaign_engine.py
from datetime import datetime

from models import EmailSequence, Lead, LeadStatus

# Assumed to be defined elsewhere in the project: `db` (a SQLAlchemy session),
# `email_client`, and `log_crm_activity`.

# Leads in these states should receive no further outreach.
STOP_STATUSES = {LeadStatus.REPLIED, LeadStatus.BOOKED, LeadStatus.DISQUALIFIED}


async def send_pending_emails():
    """Send all emails that are due."""
    pending = (
        db.query(EmailSequence)
        .filter(
            EmailSequence.send_at <= datetime.utcnow(),
            EmailSequence.sent.is_(None),
        )
        .order_by(EmailSequence.send_at)
        .all()
    )
    for seq in pending:
        lead = db.get(Lead, seq.lead_id)
        # Skip if the lead is missing or has replied, booked, or been disqualified
        if lead is None or lead.status in STOP_STATUSES:
            continue
        await email_client.send(
            to=lead.email,
            subject=seq.subject,
            body=seq.body,
            reply_to="sdr@yourdomain.com",
        )
        seq.sent = datetime.utcnow()
        log_crm_activity(lead.id, "email_sent", {
            "step": seq.step_number, "subject": seq.subject
        })
        # Commit per email so a failure mid-batch cannot cause a re-send.
        db.commit()


async def process_reply(from_email: str, body: str):
    """Handle an inbound reply to an outreach email."""
    lead = db.query(Lead).filter(Lead.email == from_email).first()
    if not lead:
        return
    lead.status = LeadStatus.REPLIED
    # Cancel remaining sequence emails
    db.query(EmailSequence).filter(
        EmailSequence.lead_id == lead.id,
        EmailSequence.sent.is_(None),
    ).delete()
    log_crm_activity(lead.id, "reply_received", {"body": body[:500]})  # truncate long replies
    db.commit()
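The article says the campaign engine runs as a background task but does not show the loop that drives it. A minimal sketch, assuming a single asyncio process and a fixed polling interval; the `run_periodically` name and its parameters are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def run_periodically(
    task: Callable[[], Awaitable[None]],
    interval_seconds: float,
    stop: asyncio.Event,
) -> None:
    """Call `task` repeatedly, pausing `interval_seconds` between runs, until `stop` is set."""
    while not stop.is_set():
        try:
            await task()
        except Exception as exc:  # keep the loop alive if one run fails
            print(f"campaign task failed: {exc!r}")
        try:
            # Sleep for the interval, but wake early if shutdown is requested.
            await asyncio.wait_for(stop.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass
```

With this in place, something like `await run_periodically(send_pending_emails, 60, stop_event)` would poll for due emails once a minute. A production deployment might prefer a job scheduler or task queue, which the article leaves open.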
CRM Sync
Sync all activities to your CRM (HubSpot, Salesforce, etc.) using a periodic batch sync.
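# services/crm_sync.py
import json
from datetime import datetime

from models import CRMActivity, Lead

# Assumed to be defined elsewhere in the project: `db` (a SQLAlchemy session)
# and `hubspot_client`, a wrapper that exposes `create_engagement`.


async def sync_to_hubspot():
    """Sync unsynced activities to HubSpot."""
    unsynced = (
        db.query(CRMActivity)
        .filter(CRMActivity.synced_to_crm.is_(None))
        .order_by(CRMActivity.created_at)  # oldest first
        .limit(100)  # process one batch per run
        .all()
    )
    for activity in unsynced:
        lead = db.get(Lead, activity.lead_id)
        if lead is None:
            continue
        await hubspot_client.create_engagement(
            contact_email=lead.email,
            engagement_type=activity.activity_type,
            body=json.dumps(activity.details),
        )
        activity.synced_to_crm = datetime.utcnow()
        # Commit per activity so a failure mid-batch does not re-sync earlier items.
        db.commit()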
FAQ
How do I prevent the AI from sending embarrassing or off-brand emails?
Implement a review queue between the copywriting agent and the campaign engine. New sequences start in a "pending_review" status. The admin dashboard shows pending sequences for human approval. Once approved, they move to "active" and the campaign engine begins sending. Over time, as confidence grows, you can auto-approve sequences that score above a quality threshold.
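The approval rule described in this answer can be sketched as a small decision function. This assumes a quality score between 0 and 1 produced by some separate scoring step; the `ReviewStatus` enum, the `rejected` state, and the 0.9 threshold are hypothetical and not part of the data model shown earlier.

```python
from enum import Enum


class ReviewStatus(str, Enum):
    PENDING_REVIEW = "pending_review"
    ACTIVE = "active"
    REJECTED = "rejected"  # hypothetical: set by a human reviewer


AUTO_APPROVE_THRESHOLD = 0.9  # hypothetical cut-off; tune against reviewer decisions


def review_decision(quality_score: float, auto_approve_enabled: bool) -> ReviewStatus:
    """Decide the initial status of a newly generated sequence."""
    if auto_approve_enabled and quality_score >= AUTO_APPROVE_THRESHOLD:
        return ReviewStatus.ACTIVE
    return ReviewStatus.PENDING_REVIEW


def is_sendable(status: ReviewStatus) -> bool:
    """The campaign engine should only send sequences that are active."""
    return status is ReviewStatus.ACTIVE
```

Keeping `auto_approve_enabled` off at first means every sequence goes through a human until the score has proven reliable.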
How do I handle email deliverability?
Use a dedicated sending domain with proper SPF, DKIM, and DMARC records. Warm up the domain by starting with low volume and increasing gradually. Track bounce rates and automatically disqualify leads with bounced emails. Use a service like SendGrid or Amazon SES that handles deliverability infrastructure.
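Domain warm-up and bounce tracking can be expressed as two small helpers. This is only a sketch: the starting volume, growth factor, and ceiling below are placeholder numbers rather than recommendations, and both function names are hypothetical.

```python
def daily_send_cap(day: int, start: int = 20, growth: float = 1.3, ceiling: int = 500) -> int:
    """Return the send cap for a warm-up day (day 0 is the first sending day)."""
    if day < 0:
        raise ValueError("day must be non-negative")
    # Grow the cap geometrically, never exceeding the ceiling.
    return min(ceiling, int(start * growth ** day))


def bounce_rate(bounced: int, sent: int) -> float:
    """Fraction of sent emails that bounced; 0.0 when nothing has been sent."""
    return bounced / sent if sent else 0.0
```

The campaign engine could compare the number of emails already sent today against `daily_send_cap(day)` before sending more, and pause sending when `bounce_rate` rises above a limit you choose.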
How do I A/B test different email approaches?
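Generate two variations per sequence step using the copywriting agent. Randomly assign leads to variant A or B. Track open rates and reply rates per variant. Once each variant has a reasonable sample (100+ sends per variant is a common starting point), check whether the difference is statistically significant; sample size alone does not guarantee significance, and small differences need far more sends to detect. If the difference is significant, prefer the winning approach for future sequences.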
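One common way to check significance for reply rates is a two-proportion z-test. The sketch below uses only the standard library; the function name is hypothetical, and the test assumes leads were assigned to variants at random and that samples are large enough for the normal approximation.

```python
import math


def two_proportion_z_test(
    success_a: int, total_a: int, success_b: int, total_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two rates."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each variant needs at least one send")
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value
```

For example, 12 replies versus 10 replies out of 100 sends each gives a p-value well above 0.05, so that gap alone would not justify declaring a winner.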
CallSphere Team