Capstone: Building a Multi-Tenant AI Agent SaaS with Usage-Based Billing
Build a production SaaS platform where multiple tenants can create and deploy AI agents with tenant isolation, a visual agent builder, usage tracking, and Stripe-based usage billing.
SaaS Architecture for AI Agents
Building a multi-tenant AI agent platform requires solving four hard problems simultaneously: tenant isolation (one customer's data and agents must never leak to another), dynamic agent configuration (tenants create agents without writing code), usage metering (track every LLM call, tool invocation, and conversation), and billing (charge based on actual consumption).
This capstone builds a platform where each tenant signs up, creates agents through a web-based builder, deploys them behind an authenticated API, and pays based on usage. The architecture uses a shared PostgreSQL database in which every tenant-owned row carries a tenant_id and every query filters on it, a FastAPI backend, and Stripe for billing.
Data Model with Tenant Isolation
Every tenant-owned table includes a tenant_id column that references the tenants table. All queries are scoped to the authenticated tenant.
# models.py
import uuid

from sqlalchemy import Boolean, Column, DateTime, Float, ForeignKey, Integer, String, Text, func
from sqlalchemy.dialects.postgresql import JSONB, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Tenant(Base):
    __tablename__ = "tenants"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String(200), nullable=False)
    slug = Column(String(100), unique=True, nullable=False)
    stripe_customer_id = Column(String(100), nullable=True)
    plan = Column(String(50), default="free")  # free, starter, pro, enterprise
    api_key = Column(String(100), unique=True)
    created_at = Column(DateTime, server_default=func.now())


class AgentConfig(Base):
    __tablename__ = "agent_configs"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    # Every tenant-owned row must name its owner, so the column is NOT NULL.
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), index=True, nullable=False)
    name = Column(String(200))
    instructions = Column(Text)
    model = Column(String(50), default="gpt-4o")
    tools = Column(JSONB, default=list)  # list of enabled tool configs
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, server_default=func.now())


class UsageRecord(Base):
    __tablename__ = "usage_records"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), index=True, nullable=False)
    agent_id = Column(UUID(as_uuid=True), ForeignKey("agent_configs.id"))
    event_type = Column(String(50))  # "llm_call", "tool_call", "conversation"
    tokens_input = Column(Integer, default=0)
    tokens_output = Column(Integer, default=0)
    cost_cents = Column(Float, default=0)  # fractional cents; one call often costs under 1 cent
    metadata_ = Column(JSONB, default=dict)
    created_at = Column(DateTime, server_default=func.now())
Tenant-Scoped Dependency Injection
Use a FastAPI dependency to resolve the tenant from the X-API-Key header, then route every database query through a helper that adds the tenant filter.
# core/auth.py
from fastapi import Depends, HTTPException, Security
from fastapi.security import APIKeyHeader

from core.db import get_db  # assumed location of the session dependency
from models import Tenant

api_key_header = APIKeyHeader(name="X-API-Key")


async def get_current_tenant(
    api_key: str = Security(api_key_header),
    db=Depends(get_db),
) -> Tenant:
    """Resolve the tenant that owns the supplied API key, or reject the request."""
    tenant = db.query(Tenant).filter(Tenant.api_key == api_key).first()
    if not tenant:
        raise HTTPException(status_code=401, detail="Invalid API key")
    return tenant


class TenantScoped:
    """Utility to scope queries to the current tenant."""

    def __init__(self, db, tenant: Tenant):
        self.db = db
        self.tenant_id = tenant.id

    def query(self, model):
        # Every query starts with the tenant filter; callers add further filters on top.
        return self.db.query(model).filter(model.tenant_id == self.tenant_id)
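The effect of scoping can be seen in a small, self-contained sketch. It uses the standard-library sqlite3 module rather than the SQLAlchemy session above, and the simplified table, the sample rows, and the scoped_agents helper are all hypothetical, chosen only to show that the tenant filter is applied before any caller-supplied condition.

```python
import sqlite3

# In-memory database with a hypothetical, simplified agent_configs table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent_configs (id TEXT, tenant_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO agent_configs VALUES (?, ?, ?)",
    [
        ("a1", "tenant-a", "Support bot"),
        ("a2", "tenant-b", "Sales bot"),
    ],
)


def scoped_agents(conn, tenant_id, agent_id=None):
    """Return agent rows for one tenant; the tenant filter is always applied."""
    sql = "SELECT id, name FROM agent_configs WHERE tenant_id = ?"
    params = [tenant_id]
    if agent_id is not None:
        sql += " AND id = ?"
        params.append(agent_id)
    return conn.execute(sql, params).fetchall()


own = scoped_agents(conn, "tenant-a")
# Tenant A guesses tenant B's agent id; the scoped query finds nothing.
foreign = scoped_agents(conn, "tenant-a", agent_id="a2")
```

Because the lookup for another tenant's agent returns no rows, the caller sees the same "not found" result as for an id that does not exist, which avoids confirming that the id belongs to someone else.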
Dynamic Agent Builder
Tenants configure agents through the admin dashboard. The backend loads agent configurations from the database and instantiates them on demand.
# services/agent_factory.py
from agents import Agent

from models import AgentConfig

# Registry of available tools that tenants can enable.
# The tool functions are assumed to be defined elsewhere, decorated with
# agents.function_tool, and imported into this module.
TOOL_REGISTRY = {
    "search_kb": search_knowledge_base,
    "send_email": send_email_tool,
    "create_ticket": create_ticket_tool,
    "lookup_order": lookup_order_tool,
    "check_calendar": check_calendar_tool,
}


def build_agent_from_config(config: AgentConfig) -> Agent:
    """Dynamically build an Agent from a database configuration."""
    enabled_tools = []
    for tool_config in config.tools or []:
        tool_name = tool_config["name"]
        # Names outside the registry are ignored, so tenants can only enable vetted tools.
        if tool_name in TOOL_REGISTRY:
            enabled_tools.append(TOOL_REGISTRY[tool_name])
    return Agent(
        name=config.name,
        instructions=config.instructions,
        model=config.model,
        tools=enabled_tools,
    )
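The factory silently skips tool names it does not recognize. A builder UI may prefer to surface those entries when the tenant saves the configuration. The following is a minimal sketch of such a check; validate_tool_configs and AVAILABLE_TOOLS are hypothetical names, with the set standing in for the keys of TOOL_REGISTRY.

```python
# Hypothetical stand-in for the keys of TOOL_REGISTRY.
AVAILABLE_TOOLS = {
    "search_kb",
    "send_email",
    "create_ticket",
    "lookup_order",
    "check_calendar",
}


def validate_tool_configs(tool_configs):
    """Split a tenant's tool list into accepted names and rejected entries.

    Unknown names, malformed entries, and duplicates are all rejected.
    """
    accepted, rejected = [], []
    for entry in tool_configs:
        name = entry.get("name") if isinstance(entry, dict) else None
        if isinstance(name, str) and name in AVAILABLE_TOOLS and name not in accepted:
            accepted.append(name)
        else:
            rejected.append(entry)
    return accepted, rejected


accepted, rejected = validate_tool_configs(
    [{"name": "search_kb"}, {"name": "delete_database"}, {"name": "search_kb"}]
)
```

Returning the rejected entries lets the save endpoint respond with a validation error that names the offending tools instead of storing a configuration that behaves differently from what the tenant selected.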
Usage Metering
Every LLM call and tool invocation is recorded for billing.
# services/metering.py
from models import UsageRecord

# USD per 100k tokens
TOKEN_COSTS = {
    "gpt-4o": {"input": 0.25, "output": 1.00},
    "gpt-4o-mini": {"input": 0.015, "output": 0.06},
}


async def record_usage(
    db, tenant_id: str, agent_id: str,
    event_type: str, tokens_in: int, tokens_out: int, model: str
):
    # Unknown models fall back to gpt-4o pricing.
    costs = TOKEN_COSTS.get(model, TOKEN_COSTS["gpt-4o"])
    # Cost in USD: token counts times the per-100k price, divided by 100k.
    cost = (tokens_in * costs["input"] + tokens_out * costs["output"]) / 100_000
    record = UsageRecord(
        tenant_id=tenant_id,
        agent_id=agent_id,
        event_type=event_type,
        tokens_input=tokens_in,
        tokens_output=tokens_out,
        cost_cents=cost * 100,  # store in cents
    )
    db.add(record)
    db.commit()
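A worked example makes the units in record_usage concrete. The sketch below copies the gpt-4o prices from TOKEN_COSTS and applies the same formula; cost_in_cents is a hypothetical helper name, and the token counts are made up for illustration.

```python
# Prices copied from TOKEN_COSTS above: USD per 100k tokens.
GPT_4O = {"input": 0.25, "output": 1.00}


def cost_in_cents(tokens_in, tokens_out, prices):
    """Apply the same formula as record_usage and return the cost in cents."""
    dollars = (tokens_in * prices["input"] + tokens_out * prices["output"]) / 100_000
    return dollars * 100


# 1,200 input tokens and 300 output tokens on gpt-4o:
# (1200 * 0.25 + 300 * 1.00) / 100_000 = 600 / 100_000 = 0.006 USD
example = cost_in_cents(1_200, 300, GPT_4O)  # about 0.6 cents
```

A single call here costs well under one cent, which is why cost_cents is stored as a fractional value. Rounding each record to a whole cent before summing would bill most calls as zero.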
Stripe Billing Integration
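Sync usage to Stripe using Stripe metered billing. The function below sums the previous 24 hours of usage and reports it as a single meter event, so it should run once per day (for example from a scheduled job) to avoid reporting the same window twice. Stripe then totals the events at the end of the billing period.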
# services/billing.py
import os
from datetime import datetime, timedelta

import stripe
from sqlalchemy import func

from models import Tenant, UsageRecord

stripe.api_key = os.environ["STRIPE_SECRET_KEY"]


async def sync_usage_to_stripe(tenant_id: str, db):
    """Report usage to Stripe for metered billing."""
    tenant = db.get(Tenant, tenant_id)
    # Skip unknown tenants and tenants that have no Stripe customer yet.
    if tenant is None or not tenant.stripe_customer_id:
        return

    # Sum usage from the last 24 hours; run once per day so windows do not overlap.
    period_start = datetime.utcnow() - timedelta(days=1)
    total_cost = db.query(func.sum(UsageRecord.cost_cents)).filter(
        UsageRecord.tenant_id == tenant_id,
        UsageRecord.created_at >= period_start,
    ).scalar() or 0

    # Report to Stripe. The meter value is a whole number of cents; int() truncates fractions.
    stripe.billing.MeterEvent.create(
        event_name="ai_agent_usage",
        payload={
            "value": str(int(total_cost)),
            "stripe_customer_id": tenant.stripe_customer_id,
        },
    )


async def get_tenant_usage_summary(tenant_id: str, days: int, db) -> dict:
    """Summarize a tenant's usage over the last `days` days."""
    since = datetime.utcnow() - timedelta(days=days)
    records = db.query(UsageRecord).filter(
        UsageRecord.tenant_id == tenant_id,
        UsageRecord.created_at >= since,
    ).all()
    return {
        "total_cost_cents": sum(r.cost_cents for r in records),
        "total_llm_calls": sum(1 for r in records if r.event_type == "llm_call"),
        "total_tokens_input": sum(r.tokens_input for r in records),
        "total_tokens_output": sum(r.tokens_output for r in records),
        "total_conversations": sum(1 for r in records if r.event_type == "conversation"),
    }
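Two details of sync_usage_to_stripe are worth a closer look: int() drops fractional cents, and a retried job could report the same day twice. The sketch below builds the meter event arguments with a rounded value and a deterministic identifier. build_meter_event is a hypothetical helper, the customer and tenant ids are placeholders, and the assumption that Stripe deduplicates meter events by an optional identifier field should be confirmed against the current Stripe documentation before relying on it.

```python
from datetime import date


def build_meter_event(stripe_customer_id, tenant_id, window_day, total_cost_cents):
    """Build keyword arguments for one daily meter event (hypothetical helper)."""
    return {
        "event_name": "ai_agent_usage",
        # Deterministic per tenant and day, so a retried job sends the same identifier.
        # Assumption: Stripe uses this field to reject duplicate events.
        "identifier": f"usage-{tenant_id}-{window_day.isoformat()}",
        "payload": {
            # Round rather than truncate, so 149.6 cents is reported as 150.
            "value": str(round(total_cost_cents)),
            "stripe_customer_id": stripe_customer_id,
        },
    }


event = build_meter_event("cus_123", "tenant-a", date(2025, 1, 31), 149.6)
```

Under that assumption, the resulting dictionary could be passed as stripe.billing.MeterEvent.create(**event). Keeping the construction in a pure function also makes the rounding and identifier logic easy to unit test without calling Stripe.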
Tenant API Endpoint
Tenants call a shared /v1/chat endpoint. The API key identifies the tenant, and the tenant-scoped query ensures a request can only reach that tenant's own agents.
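# routes/agent_api.py
from uuid import UUID

from agents import Runner
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel

from core.auth import TenantScoped, get_current_tenant
from core.db import get_db  # assumed location of the session dependency
from models import AgentConfig, Tenant
from services.agent_factory import build_agent_from_config
from services.metering import record_usage

router = APIRouter()


class ChatRequest(BaseModel):
    # Request shape inferred from the fields the handler reads.
    agent_id: UUID
    message: str


@router.post("/v1/chat")
async def chat(
    body: ChatRequest,
    tenant: Tenant = Depends(get_current_tenant),
    db=Depends(get_db),
):
    scoped = TenantScoped(db, tenant)
    config = scoped.query(AgentConfig).filter(
        AgentConfig.id == body.agent_id
    ).first()
    if not config:
        raise HTTPException(status_code=404, detail="Agent not found")
    agent = build_agent_from_config(config)
    result = await Runner.run(agent, body.message)

    # Record usage. A run that calls tools makes several model requests, so sum
    # the token counts across every response rather than reading only the last.
    tokens_in = sum(r.usage.input_tokens for r in result.raw_responses)
    tokens_out = sum(r.usage.output_tokens for r in result.raw_responses)
    await record_usage(
        db, str(tenant.id), str(config.id),
        "llm_call", tokens_in, tokens_out, config.model
    )
    return {"reply": result.final_output, "agent": config.name}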
FAQ
How do I prevent one tenant's heavy usage from affecting others?
Implement per-tenant rate limiting using a Redis-backed token bucket. Each tenant gets a requests-per-minute limit and a tokens-per-day limit based on their plan tier. When a tenant exceeds either limit, return a 429 status code with a Retry-After header.
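The token bucket logic can be sketched in memory as follows. This is not the Redis-backed version described above: the TokenBucket class, its parameters, and the example limits are hypothetical, and a production version would keep the token count and timestamp per tenant in Redis and update them atomically.

```python
import time


class TokenBucket:
    """In-memory token bucket for a single tenant."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def try_acquire(self, cost=1):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Time until enough tokens have refilled to cover this request.
        return False, (cost - self.tokens) / self.refill_per_second


# Hypothetical plan limit: bursts of 2 requests, refilled at 1 request per second.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1.0, clock=lambda: fake_now[0])
first = bucket.try_acquire()
second = bucket.try_acquire()
third = bucket.try_acquire()   # bucket is empty, so this request is rejected
fake_now[0] = 1.0              # one second later, one token has refilled
fourth = bucket.try_acquire()
```

When try_acquire rejects a request, the second element of the tuple is the number of seconds to place in the Retry-After header of the 429 response. The injectable clock exists only to make the behavior testable without sleeping.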
How do I handle tenant data deletion for compliance?
Implement a cascade delete that removes all tenant data: agent configs, usage records, conversations, and any uploaded knowledge base documents. Use a soft-delete first (mark as deleted with a timestamp) and run a hard-delete job after a 30-day grace period. Log the deletion for audit compliance.
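The two-phase deletion can be sketched with plain dictionaries standing in for tenant rows. The function names and the deleted_at field are hypothetical; in the real schema this would be a nullable timestamp column on the tenants table, and the hard-delete job would remove the dependent rows as well.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=30)


def soft_delete(tenant, now):
    """Mark the tenant as deleted; its data stays in place during the grace period."""
    tenant["deleted_at"] = now
    return tenant


def due_for_hard_delete(tenants, now):
    """Return ids of tenants whose grace period has fully elapsed."""
    return [
        t["id"]
        for t in tenants
        if t.get("deleted_at") is not None and now - t["deleted_at"] >= GRACE_PERIOD
    ]


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
tenants = [
    soft_delete({"id": "tenant-a"}, start),
    soft_delete({"id": "tenant-b"}, start + timedelta(days=20)),
    {"id": "tenant-c", "deleted_at": None},  # never deleted
]
# Thirty days after the first deletion, only tenant-a is past its grace period.
due = due_for_hard_delete(tenants, start + timedelta(days=30))
```

A scheduled job would call the second function, delete the data for each returned id, and write an audit log entry for every removal.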
How do I let tenants bring their own API keys?
Store tenant-provided API keys encrypted in the database. When building an agent for that tenant, configure the OpenAI client with their key instead of yours. This shifts LLM costs to the tenant while you charge only for platform usage. Validate the key on save by making a minimal API call.
Written by
CallSphere Team