Adding AI Chat to Your SaaS Product: Architecture and Implementation Guide
Learn how to embed an AI chat widget into your SaaS application with proper backend integration, context injection, permission scoping, and conversation management.
Why AI Chat Belongs Inside Your Product
Adding AI chat to a SaaS product is not the same as dropping a third-party chatbot on your marketing site. Product-embedded AI chat needs access to the user's data, must respect their permissions, and should understand the current application context. A customer viewing an invoice should be able to ask "Why is this total different from last month?" and get a real, data-backed answer — not a generic FAQ response.
This guide covers the architecture for building an AI chat system that lives inside your SaaS application as a first-class feature.
Architecture Overview
The system has four layers: the frontend widget, a WebSocket gateway, an AI orchestration service, and your existing product APIs.
```python
# Backend: FastAPI WebSocket endpoint for AI chat
from typing import Optional

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()


class ChatContext:
    """Captures the user's current product context."""

    def __init__(self, user_id: str, tenant_id: str, current_page: str,
                 entity_type: Optional[str] = None,
                 entity_id: Optional[str] = None):
        self.user_id = user_id
        self.tenant_id = tenant_id
        self.current_page = current_page
        self.entity_type = entity_type
        self.entity_id = entity_id

    def to_system_prompt(self) -> str:
        context = f"User is on page: {self.current_page}."
        if self.entity_type and self.entity_id:
            context += f" They are viewing {self.entity_type} with ID {self.entity_id}."
        return context


@app.websocket("/ws/chat")
async def chat_endpoint(websocket: WebSocket):
    await websocket.accept()
    # Authenticate from the token sent in the first message
    auth_msg = await websocket.receive_json()
    user = await authenticate_ws_token(auth_msg["token"])
    if not user:
        await websocket.close(code=4001)  # custom close code: auth failed
        return
    try:
        while True:
            data = await websocket.receive_json()
            context = ChatContext(
                user_id=user.id,
                tenant_id=user.tenant_id,
                current_page=data.get("page", "/"),
                entity_type=data.get("entity_type"),
                entity_id=data.get("entity_id"),
            )
            response = await generate_ai_response(
                message=data["message"],
                context=context,
                permissions=user.permissions,
            )
            await websocket.send_json({"reply": response})
    except WebSocketDisconnect:
        pass  # client closed the tab or navigated away
```
Frontend Widget Design
The chat widget mounts as a floating component that tracks the user's current route and sends page context with every message.
```tsx
// React chat widget that sends page context
import { useEffect, useRef, useState } from "react";
import { usePathname } from "next/navigation";

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

export function AIChatWidget({ authToken }: { authToken: string }) {
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [input, setInput] = useState("");
  const wsRef = useRef<WebSocket | null>(null);
  const pathname = usePathname();

  useEffect(() => {
    const ws = new WebSocket(`wss://api.example.com/ws/chat`);
    ws.onopen = () => ws.send(JSON.stringify({ token: authToken }));
    ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      setMessages((prev) => [...prev, { role: "assistant", content: data.reply }]);
    };
    wsRef.current = ws;
    return () => ws.close();
  }, [authToken]);

  const sendMessage = () => {
    if (!input.trim() || !wsRef.current) return;
    const payload = {
      message: input,
      page: pathname,
      entity_type: extractEntityType(pathname),
      entity_id: extractEntityId(pathname),
    };
    wsRef.current.send(JSON.stringify(payload));
    setMessages((prev) => [...prev, { role: "user", content: input }]);
    setInput("");
  };

  return (
    <div className="fixed bottom-4 right-4 w-96 bg-white shadow-xl rounded-lg">
      <div className="h-80 overflow-y-auto p-4">
        {messages.map((msg, i) => (
          <div key={i} className={msg.role === "user" ? "text-right" : "text-left"}>
            <p className="inline-block p-2 rounded-lg bg-gray-100">{msg.content}</p>
          </div>
        ))}
      </div>
      <div className="flex p-2 border-t">
        <input value={input} onChange={(e) => setInput(e.target.value)}
          className="flex-1 border rounded-l px-3" placeholder="Ask anything..." />
        <button onClick={sendMessage} className="bg-blue-600 text-white px-4 rounded-r">
          Send
        </button>
      </div>
    </div>
  );
}
```
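The widget above calls two helpers that are not shown. One possible implementation, assuming routes of the form `/invoices/inv_123` — the names `extractEntityType` and `extractEntityId` come from the widget code, but the route map here is purely illustrative and should be adapted to your URL scheme:

```typescript
// Hypothetical route map: plural URL segments -> singular entity types.
const ROUTE_ENTITY_TYPES: Record<string, string> = {
  invoices: "invoice",
  users: "user",
  reports: "report",
};

function extractEntityType(pathname: string): string | null {
  // Only routes shaped like /<collection>/<id> carry an entity
  const segments = pathname.split("/").filter(Boolean);
  if (segments.length < 2) return null;
  return ROUTE_ENTITY_TYPES[segments[0]] ?? null;
}

function extractEntityId(pathname: string): string | null {
  const segments = pathname.split("/").filter(Boolean);
  return segments.length >= 2 ? segments[1] : null;
}
```

Sending `null` for both fields on pages like `/dashboard` is fine — the backend's `ChatContext` treats them as optional.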
Permission-Scoped Data Access
The AI must never return data the user is not authorized to see. Inject the user's permission set into the tool layer so every data fetch is scoped.
```python
async def generate_ai_response(message: str, context: ChatContext,
                               permissions: list[str]) -> str:
    tools = build_scoped_tools(context.tenant_id, context.user_id, permissions)
    system_prompt = f"""You are a helpful assistant inside our SaaS product.
{context.to_system_prompt()}
Only use the provided tools to fetch data. Never fabricate data.
The user has these permissions: {', '.join(permissions)}.
Do not attempt to access data outside their permission scope."""
    response = await call_llm(
        system=system_prompt,
        messages=[{"role": "user", "content": message}],
        tools=tools,
    )
    return response


def build_scoped_tools(tenant_id: str, user_id: str,
                       permissions: list[str]) -> list:
    tools = []
    if "invoices:read" in permissions:
        tools.append(InvoiceLookupTool(tenant_id=tenant_id))
    if "analytics:read" in permissions:
        tools.append(AnalyticsQueryTool(tenant_id=tenant_id))
    if "users:read" in permissions:
        tools.append(UserDirectoryTool(tenant_id=tenant_id))
    return tools
```
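What might one of those tools look like inside? A minimal sketch of `InvoiceLookupTool`, using `sqlite3` purely for illustration — the `run()` signature, column names, and tool fields are assumptions, not a fixed framework API. The key point is that `tenant_id` is baked in at construction time from the authenticated session, so nothing the model sends can widen the query:

```python
import sqlite3


class InvoiceLookupTool:
    """Tenant-scoped invoice lookup; the LLM never supplies tenant_id."""

    name = "lookup_invoice"
    description = "Fetch an invoice by ID for the current tenant."

    def __init__(self, tenant_id: str, conn: sqlite3.Connection):
        self.tenant_id = tenant_id
        self.conn = conn

    def run(self, invoice_id: str) -> dict:
        # tenant_id comes from the session, so a cross-tenant
        # invoice ID simply matches no rows.
        row = self.conn.execute(
            "SELECT id, total FROM invoices WHERE id = ? AND tenant_id = ?",
            (invoice_id, self.tenant_id),
        ).fetchone()
        return {"id": row[0], "total": row[1]} if row else {"error": "not found"}
```

The same shape works for `AnalyticsQueryTool` and `UserDirectoryTool`: the mandatory tenant filter lives in the SQL, not in the prompt.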
Conversation Management
Store conversations so users can return to previous threads. Use a simple schema with tenant isolation built in.
```python
# SQLAlchemy models for chat history
import uuid
from datetime import datetime

from sqlalchemy import Column, String, Text, DateTime, ForeignKey
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class ChatConversation(Base):
    __tablename__ = "chat_conversations"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), nullable=False, index=True)
    user_id = Column(UUID(as_uuid=True), ForeignKey("users.id"), nullable=False)
    title = Column(String(255))
    created_at = Column(DateTime, default=datetime.utcnow)


class ChatMessage(Base):
    __tablename__ = "chat_messages"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    conversation_id = Column(UUID(as_uuid=True),
                             ForeignKey("chat_conversations.id"),
                             nullable=False, index=True)
    role = Column(String(20), nullable=False)  # "user" or "assistant"
    content = Column(Text, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
```
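When a user reopens a thread, the stored messages need to be replayed to the model in order. A minimal sketch, assuming rows are fetched as dicts ordered oldest-first by `created_at` — the `max_messages` cutoff is an arbitrary example; a token-based budget is more precise:

```python
def build_llm_messages(rows: list[dict], max_messages: int = 20) -> list[dict]:
    """Convert stored ChatMessage rows into the messages list sent
    to the LLM, keeping only the most recent turns so a long thread
    does not blow past the context window."""
    recent = rows[-max_messages:]
    return [{"role": r["role"], "content": r["content"]} for r in recent]
```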
FAQ
How do I prevent the AI from leaking data between tenants?
Every database query and tool invocation must be scoped by tenant_id. Pass the tenant ID from the authenticated session into every tool constructor, and add it as a mandatory WHERE clause. Never rely on the LLM to filter data — enforce it at the data access layer.
Should I use WebSockets or HTTP streaming for chat?
WebSockets are better for bidirectional, long-lived conversations where the server might push updates (typing indicators, tool progress). HTTP streaming with Server-Sent Events works well if your infrastructure does not support WebSocket scaling. For most SaaS products, WebSockets provide the best user experience.
How do I handle rate limiting for the AI chat?
Implement rate limiting at two levels: per-user message rate (e.g., 20 messages per minute) and per-tenant token budget (e.g., 100,000 tokens per day). Track usage in Redis with sliding window counters and return clear error messages when limits are hit.
#AIChat #SaaS #WidgetArchitecture #ContextInjection #Python #TypeScript #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.