Building AI Copilots for SaaS: Context-Aware Assistance Within Your Product

What Makes a Copilot Different from a Chatbot

A chatbot waits for questions. A copilot watches what you are doing and offers help before you ask. When you are writing an email in your CRM, the copilot suggests a follow-up template based on the deal stage. When you are building a report, it recommends which metrics to include based on your audience.

The key architectural difference is context capture. A copilot needs a continuous stream of user activity to generate relevant suggestions.

Copilot Architecture

The copilot system has three components: a context collector on the frontend, a suggestion engine on the backend, and a presentation layer that shows suggestions without disrupting the user's workflow.

// Frontend context collector
interface CopilotContext {
  page: string;
  action: string;
  entityType?: string;
  entityId?: string;
  formData?: Record<string, unknown>;
  selectionText?: string;
  timestamp: number;
}

class CopilotContextCollector {
  private buffer: CopilotContext[] = [];
  private ws: WebSocket;
  private flushInterval: ReturnType<typeof setInterval>;

  constructor(wsUrl: string, authToken: string) {
    this.ws = new WebSocket(wsUrl);
    this.ws.onopen = () => {
      this.ws.send(JSON.stringify({ type: "auth", token: authToken }));
    };
    // Flush context every 2 seconds to avoid spamming
    this.flushInterval = setInterval(() => this.flush(), 2000);
  }

  track(ctx: Omit<CopilotContext, "timestamp">) {
    this.buffer.push({ ...ctx, timestamp: Date.now() });
  }

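  private flush() {
    // Keep buffering until the socket is open: send() throws while it is
    // still connecting and silently drops data once it is closing or closed
    if (this.buffer.length === 0 || this.ws.readyState !== WebSocket.OPEN) return;
    this.ws.send(JSON.stringify({ type: "context", events: this.buffer }));
    this.buffer = [];
  }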

  destroy() {
    clearInterval(this.flushInterval);
    this.ws.close();
  }
}

Backend Suggestion Engine

The suggestion engine receives context events, maintains a rolling window of user activity, and generates suggestions when activity patterns match known triggers.

from dataclasses import dataclass, field
from datetime import datetime, timedelta
from collections import deque
import asyncio

@dataclass
class UserSession:
    user_id: str
    tenant_id: str
    context_window: deque = field(default_factory=lambda: deque(maxlen=50))
    last_suggestion_time: datetime = field(default_factory=datetime.utcnow)

class SuggestionEngine:
    def __init__(self, llm_client, min_suggestion_interval: int = 30):
        self.sessions: dict[str, UserSession] = {}
        self.llm_client = llm_client
        self.min_interval = timedelta(seconds=min_suggestion_interval)

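    def get_session(self, user_id: str, tenant_id: str) -> UserSession:
        # Key by tenant and user so identical user IDs in different tenants
        # never share a session
        key = f"{tenant_id}:{user_id}"
        if key not in self.sessions:
            self.sessions[key] = UserSession(
                user_id=user_id, tenant_id=tenant_id
            )
        return self.sessions[key]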

    async def process_context(self, user_id: str, tenant_id: str,
                               events: list[dict]) -> dict | None:
        session = self.get_session(user_id, tenant_id)
        for event in events:
            session.context_window.append(event)

        # Rate limit suggestions
        now = datetime.utcnow()
        if now - session.last_suggestion_time < self.min_interval:
            return None

        trigger = self.detect_trigger(session)
        if not trigger:
            return None

        suggestion = await self.generate_suggestion(session, trigger)
        session.last_suggestion_time = now
        return suggestion

    def detect_trigger(self, session: UserSession) -> str | None:
        recent = list(session.context_window)[-5:]
        if not recent:
            return None

        latest = recent[-1]

        # Trigger: at least 3 of the last 5 events are form edits
        # (a proxy for sustained editing; elapsed time is not measured here)
        if latest.get("action") == "form_edit":
            edit_events = [e for e in recent if e.get("action") == "form_edit"]
            if len(edit_events) >= 3:
                return "form_assistance"

        # Trigger: user is viewing a specific record
        # (data completeness is not checked here)
        if latest.get("action") == "view" and latest.get("entityType"):
            return "record_insight"

        return None

    async def generate_suggestion(self, session: UserSession,
                                   trigger: str) -> dict:
        context_summary = self.summarize_context(session)
        prompt = f"""Based on the user's activity, generate a helpful suggestion.
Trigger: {trigger}
Context: {context_summary}
Respond with JSON: {{"title": "...", "body": "...", "actions": [...]}}"""

        response = await self.llm_client.chat(
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        )
        return response

    def summarize_context(self, session: UserSession) -> str:
        recent = list(session.context_window)[-10:]
        lines = []
        for event in recent:
            lines.append(
                f"[{event.get('action')}] on {event.get('entityType', 'page')}"
                f" ({event.get('page', '/')})"
            )
        return "\n".join(lines)

Presenting Suggestions Without Disrupting Workflow

Suggestions should appear in a non-modal side panel. Users must always be able to dismiss, accept, or modify them.

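// React copilot suggestion component
import { useState, useEffect } from "react";

// Assumed to be provided by your app's action system (not shown in this article)
declare function executeAction(action: string): void;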

interface Suggestion {
  id: string;
  title: string;
  body: string;
  actions: { label: string; action: string; payload?: Record<string, unknown> }[];
}

export function CopilotPanel({ ws }: { ws: WebSocket }) {
  const [suggestions, setSuggestions] = useState<Suggestion[]>([]);

  useEffect(() => {
    const handler = (event: MessageEvent) => {
      const data = JSON.parse(event.data);
      if (data.type === "suggestion") {
        setSuggestions((prev) => [data.suggestion, ...prev].slice(0, 5));
      }
    };
    ws.addEventListener("message", handler);
    return () => ws.removeEventListener("message", handler);
  }, [ws]);

  const dismiss = (id: string) => {
    setSuggestions((prev) => prev.filter((s) => s.id !== id));
    ws.send(JSON.stringify({ type: "feedback", suggestion_id: id, action: "dismiss" }));
  };

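  const accept = (id: string, action: string) => {
    ws.send(JSON.stringify({ type: "feedback", suggestion_id: id, action: "accept" }));
    // Execute the action through your app's action system
    executeAction(action);
    // Remove the card locally; calling dismiss() here would also report a "dismiss"
    setSuggestions((prev) => prev.filter((s) => s.id !== id));
  };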

  return (
    <div className="w-80 border-l bg-gray-50 p-4 overflow-y-auto">
      <h3 className="font-semibold text-sm text-gray-600 mb-3">Copilot Suggestions</h3>
      {suggestions.map((s) => (
        <div key={s.id} className="bg-white rounded-lg shadow-sm p-3 mb-2">
          <h4 className="font-medium text-sm">{s.title}</h4>
          <p className="text-xs text-gray-600 mt-1">{s.body}</p>
          <div className="flex gap-2 mt-2">
            {s.actions.map((a) => (
              <button key={a.label} onClick={() => accept(s.id, a.action)}
                className="text-xs bg-blue-600 text-white px-2 py-1 rounded">
                {a.label}
              </button>
            ))}
            <button onClick={() => dismiss(s.id)}
              className="text-xs text-gray-400 ml-auto">Dismiss</button>
          </div>
        </div>
      ))}
    </div>
  );
}

User Control: The Non-Negotiable Principle

Every copilot suggestion must be an offer, never an automatic action. Users must be able to dismiss any suggestion, disable the copilot entirely, and configure what triggers suggestions. Store preferences per user and respect them on every request.

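# User preference storage for copilot behavior
import json

async def get_copilot_preferences(db, user_id: str) -> dict:
    row = await db.fetchrow(
        "SELECT preferences FROM copilot_settings WHERE user_id = $1",
        user_id
    )
    defaults = {
        "enabled": True,
        "triggers": ["form_assistance", "record_insight", "workflow_tip"],
        "frequency": "normal",  # low, normal, high
        "dismissed_categories": [],
    }
    if not row:
        return defaults
    stored = row["preferences"]
    # Some drivers (asyncpg by default) return JSON columns as text, so decode if needed
    if isinstance(stored, str):
        stored = json.loads(stored)
    # Stored values override defaults; a NULL column falls back to defaults
    return {**defaults, **(stored or {})}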
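
One way to respect these preferences on every request is to check them just before a detected trigger turns into a suggestion. The sketch below is only an illustration: `is_trigger_allowed`, `min_interval_seconds`, and the frequency-to-seconds mapping are hypothetical names and values, not part of the code above.

```python
# Hypothetical mapping from the "frequency" preference to a minimum gap in seconds.
FREQUENCY_INTERVALS = {"low": 120, "normal": 30, "high": 10}


def is_trigger_allowed(prefs: dict, trigger: str) -> bool:
    """Return True only if the user's preferences permit this trigger."""
    if not prefs.get("enabled", True):
        return False  # copilot disabled entirely
    if trigger in prefs.get("dismissed_categories", []):
        return False  # user opted out of this category
    return trigger in prefs.get("triggers", [])


def min_interval_seconds(prefs: dict) -> int:
    """Translate the frequency preference into a rate limit, defaulting to normal."""
    return FREQUENCY_INTERVALS.get(
        prefs.get("frequency", "normal"), FREQUENCY_INTERVALS["normal"]
    )
```

In the suggestion engine, a gate like this would sit between trigger detection and the LLM call, so a disabled copilot never costs a model request.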

FAQ

How do I avoid annoying users with too many suggestions?

Implement three controls: a minimum interval between suggestions (30-60 seconds), a daily suggestion cap per user, and a feedback loop that tracks dismissal rates. If a user dismisses more than 70% of the suggestions of a given type, automatically stop showing that type to that user.
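
The three controls can be combined into a single per-user gate. The following is a minimal sketch under stated assumptions: the `SuggestionThrottle` class, its method names, the daily cap of 20, and the `min_samples` guard are all hypothetical choices for illustration, and timestamps are passed in as plain seconds so the logic stays easy to test.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SuggestionThrottle:
    """Per-user throttle; every threshold here is an illustrative default."""
    min_interval_s: float = 30.0
    daily_cap: int = 20
    max_dismiss_rate: float = 0.7
    min_samples: int = 10  # avoid judging a type on very little feedback
    last_shown: Optional[float] = None
    shown_today: int = 0
    shown: dict = field(default_factory=lambda: defaultdict(int))
    dismissed: dict = field(default_factory=lambda: defaultdict(int))

    def allow(self, kind: str, now: float) -> bool:
        # Control 1: minimum interval between suggestions
        if self.last_shown is not None and now - self.last_shown < self.min_interval_s:
            return False
        # Control 2: daily cap per user
        if self.shown_today >= self.daily_cap:
            return False
        # Control 3: suppress types the user dismisses most of the time
        seen = self.shown[kind]
        if seen >= self.min_samples and self.dismissed[kind] / seen > self.max_dismiss_rate:
            return False
        return True

    def record_shown(self, kind: str, now: float) -> None:
        self.last_shown = now
        self.shown_today += 1
        self.shown[kind] += 1

    def record_dismissed(self, kind: str) -> None:
        self.dismissed[kind] += 1

    def reset_day(self) -> None:
        # Call from a daily job or when the user's local date changes
        self.shown_today = 0
```

The `min_samples` guard is a design choice rather than a requirement: without it, a single early dismissal would register as a 100% dismissal rate and silence a suggestion type after one data point.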

Should the copilot have access to all user data?

The copilot should only access data the user can already see, so enforce the same permission system as your main application. Even when the user has access, avoid sending sensitive fields (SSNs, passwords, API keys) to the LLM; redact them before context injection.
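
Redaction before context injection might look like the sketch below. It is an assumption-laden example, not a complete data-loss-prevention solution: the key names in `SENSITIVE_KEYS`, the US-style SSN pattern, and the `redact` helper are hypothetical and would need to be adapted to your own data model.

```python
import re

# Illustrative field names and pattern; extend these for your own schema.
SENSITIVE_KEYS = {"ssn", "password", "api_key", "apikey", "token", "secret"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of a context value with sensitive keys and SSN-like strings masked."""
    if isinstance(value, dict):
        return {
            k: REDACTED if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return SSN_PATTERN.sub(REDACTED, value)
    return value  # numbers, booleans, None pass through unchanged


event = {
    "action": "form_edit",
    "formData": {"name": "Ada", "password": "hunter2", "notes": "SSN 123-45-6789 on file"},
}
clean = redact(event)
```

Because `redact` builds new containers instead of mutating its input, the original event remains available for application logic while only the cleaned copy reaches the prompt.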

How do I measure copilot effectiveness?

Track three metrics: suggestion acceptance rate (target above 30%), time saved per accepted suggestion (measure task completion time with and without the copilot), and user satisfaction via periodic micro-surveys embedded in the copilot panel.
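
The acceptance rate can be computed directly from the feedback messages the panel already sends. This sketch assumes those messages are stored as dictionaries with the same `type` and `action` fields used by the React component; the `acceptance_rate` function name is hypothetical. Note that it only counts suggestions that received explicit feedback, so suggestions the user ignored would need to be tracked separately.

```python
from collections import Counter
from typing import Optional


def acceptance_rate(feedback_events: list) -> Optional[float]:
    """Share of accept actions among accept and dismiss feedback, or None if there is none."""
    counts = Counter(
        e.get("action") for e in feedback_events if e.get("type") == "feedback"
    )
    total = counts["accept"] + counts["dismiss"]
    if total == 0:
        return None  # no feedback yet; avoid dividing by zero
    return counts["accept"] / total


events = [
    {"type": "feedback", "suggestion_id": "s1", "action": "accept"},
    {"type": "feedback", "suggestion_id": "s2", "action": "dismiss"},
    {"type": "feedback", "suggestion_id": "s3", "action": "dismiss"},
    {"type": "feedback", "suggestion_id": "s4", "action": "accept"},
    {"type": "context", "events": []},  # non-feedback messages are ignored
]
rate = acceptance_rate(events)
```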

