AI Agent for Customer Onboarding: Guided Setup and Feature Discovery

Why Onboarding Determines Retention

The first 48 hours after signup are often the most critical window in a SaaS customer's lifecycle. Users who complete key activation steps within this window tend to retain at much higher rates than those who do not. An AI onboarding agent personalizes this experience: it adapts the setup flow to the user's role and goals, proactively surfaces relevant features, and intervenes when it detects that the user is stuck, all without requiring a human customer success manager for every new account.

Defining the Onboarding Flow

An effective onboarding system starts with a structured flow definition. Each step declares whether it is required, which steps it depends on, its contextual help content, and a weight that reflects its importance for activation.

from dataclasses import dataclass, field
from typing import Optional
from enum import Enum


class StepStatus(Enum):
    LOCKED = "locked"
    AVAILABLE = "available"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    SKIPPED = "skipped"


@dataclass
class OnboardingStep:
    id: str
    title: str
    description: str
    help_content: str
    required: bool = True
    depends_on: list[str] = field(default_factory=list)
    estimated_minutes: int = 5
    activation_weight: float = 1.0  # importance for activation score


@dataclass
class UserOnboardingState:
    user_id: str
    user_role: str  # e.g., "admin", "developer", "marketer"
    company_type: str
    steps: dict[str, StepStatus] = field(default_factory=dict)
    started_at: Optional[str] = None
    completed_at: Optional[str] = None

    @property
    def progress_pct(self) -> float:
        if not self.steps:
            return 0.0
        completed = sum(
            1 for s in self.steps.values()
            if s == StepStatus.COMPLETED
        )
        total = len(self.steps)
        return round(completed / total * 100, 1)

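    @property
    def is_activated(self) -> bool:
        """User is activated when every required step is completed."""
        # ONBOARDING_FLOWS is defined below and is resolved at call time.
        flow = ONBOARDING_FLOWS.get(self.user_role, ONBOARDING_FLOWS["admin"])
        required = [s.id for s in flow if s.required]
        return bool(required) and all(
            self.steps.get(step_id) == StepStatus.COMPLETED
            for step_id in required
        )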


# Define flows per user role
ONBOARDING_FLOWS: dict[str, list[OnboardingStep]] = {
    "admin": [
        OnboardingStep(
            id="create_workspace",
            title="Create Your Workspace",
            description="Set up your company workspace with name and settings",
            help_content="Your workspace is the container for all your team's data.",
            estimated_minutes=2,
            activation_weight=2.0,
        ),
        OnboardingStep(
            id="invite_team",
            title="Invite Team Members",
            description="Add at least one team member to your workspace",
            help_content="Inviting a teammate lets you collaborate in the workspace from day one.",
            depends_on=["create_workspace"],
            estimated_minutes=3,
            activation_weight=1.5,
        ),
        OnboardingStep(
            id="connect_integration",
            title="Connect Your First Integration",
            description="Link your CRM, helpdesk, or communication tool",
            help_content="Integrations unlock automated workflows.",
            depends_on=["create_workspace"],
            estimated_minutes=5,
            activation_weight=2.0,
        ),
        OnboardingStep(
            id="create_first_workflow",
            title="Create Your First Workflow",
            description="Build an automated workflow using a template",
            help_content="Templates help you get started in under 5 minutes.",
            depends_on=["connect_integration"],
            estimated_minutes=10,
            activation_weight=3.0,
        ),
    ],
}
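
Because steps reference each other through depends_on, it can help to check a flow before shipping it. The following is a minimal sketch, not part of the original service: `validate_flow` is a hypothetical helper that takes a plain mapping of step id to dependency ids (which could be built with `{step.id: step.depends_on for step in flow}`) and uses the standard library's `graphlib` to reject unknown dependencies and cycles.

```python
from graphlib import CycleError, TopologicalSorter


def validate_flow(dependencies: dict[str, list[str]]) -> list[str]:
    """Return step ids in an order that respects every dependency.

    Raises ValueError if a step depends on an unknown id or if the
    dependencies form a cycle (which would leave steps locked forever).
    """
    for step_id, deps in dependencies.items():
        missing = [d for d in deps if d not in dependencies]
        if missing:
            raise ValueError(f"{step_id} depends on unknown steps: {missing}")
    try:
        # TopologicalSorter expects node -> predecessors, which matches
        # the step -> depends_on shape used by OnboardingStep.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


# Same dependency shape as the admin flow above.
admin_deps = {
    "create_workspace": [],
    "invite_team": ["create_workspace"],
    "connect_integration": ["create_workspace"],
    "create_first_workflow": ["connect_integration"],
}
order = validate_flow(admin_deps)
```

Running a check like this at import time or in a unit test catches a mistyped step id before any user reaches a step that can never unlock.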

Progress Tracking Service

The onboarding service tracks each user's progress, determines which steps are available next, and records when the user becomes activated.

from datetime import datetime
import json


class OnboardingService:
    def __init__(self, db_pool, redis_client):
        self.pool = db_pool
        self.redis = redis_client

    async def initialize_user(
        self, user_id: str, role: str, company_type: str
    ) -> UserOnboardingState:
        flow = ONBOARDING_FLOWS.get(role, ONBOARDING_FLOWS["admin"])
        steps = {}
        for step in flow:
            if not step.depends_on:
                steps[step.id] = StepStatus.AVAILABLE
            else:
                steps[step.id] = StepStatus.LOCKED

        state = UserOnboardingState(
            user_id=user_id,
            user_role=role,
            company_type=company_type,
            steps=steps,
            started_at=datetime.utcnow().isoformat(),
        )
        await self._save_state(state)
        return state

    async def complete_step(
        self, user_id: str, step_id: str
    ) -> UserOnboardingState:
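        state = await self.get_state(user_id)
        # Reject unknown or still-locked steps instead of silently adding them.
        if step_id not in state.steps:
            raise ValueError(f"Unknown onboarding step: {step_id}")
        if state.steps[step_id] == StepStatus.LOCKED:
            raise ValueError(f"Step {step_id} is still locked")
        state.steps[step_id] = StepStatus.COMPLETED

        # Unlock dependent steps, using the same fallback flow as initialize_user
        flow = ONBOARDING_FLOWS.get(state.user_role, ONBOARDING_FLOWS["admin"])
        for step in flow:
            if step_id in step.depends_on:
                all_deps_met = all(
                    state.steps.get(dep) == StepStatus.COMPLETED
                    for dep in step.depends_on
                )
                # Only unlock locked steps so finished ones are not reset.
                if all_deps_met and state.steps.get(step.id) == StepStatus.LOCKED:
                    state.steps[step.id] = StepStatus.AVAILABLE

        # Record the activation time once, the first time it is reached.
        if state.is_activated and state.completed_at is None:
            state.completed_at = datetime.utcnow().isoformat()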

        await self._save_state(state)
        return state

    async def get_state(self, user_id: str) -> UserOnboardingState:
        cached = await self.redis.get(f"onboarding:{user_id}")
        if cached:
            data = json.loads(cached)
            data["steps"] = {
                k: StepStatus(v) for k, v in data["steps"].items()
            }
            return UserOnboardingState(**data)
        # Fall back to database
        row = await self.pool.fetchrow(
            "SELECT state_json FROM onboarding_states WHERE user_id = $1",
            user_id,
        )
        if not row:
            raise ValueError(f"No onboarding state for {user_id}")
        data = json.loads(row["state_json"])
        data["steps"] = {k: StepStatus(v) for k, v in data["steps"].items()}
        return UserOnboardingState(**data)

    async def _save_state(self, state: UserOnboardingState):
        data = {
            "user_id": state.user_id,
            "user_role": state.user_role,
            "company_type": state.company_type,
            "steps": {k: v.value for k, v in state.steps.items()},
            "started_at": state.started_at,
            "completed_at": state.completed_at,
        }
        serialized = json.dumps(data)
        await self.redis.set(
            f"onboarding:{state.user_id}", serialized, ex=86400
        )
        await self.pool.execute(
            """INSERT INTO onboarding_states (user_id, state_json, updated_at)
               VALUES ($1, $2, NOW())
               ON CONFLICT (user_id) DO UPDATE SET state_json = $2, updated_at = NOW()""",
            state.user_id, serialized,
        )
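
The activation_weight field on each step is not used by the service above. One way to turn it into an activation score is a weighted share of completed steps. This is a sketch under that assumption: `activation_score` is a hypothetical helper, and it takes plain dicts (step id to status value, step id to weight) so it runs on its own.

```python
def activation_score(
    statuses: dict[str, str], weights: dict[str, float]
) -> float:
    """Weighted share of completed steps, as a percentage from 0 to 100."""
    total = sum(weights.values())
    if total <= 0:
        return 0.0
    earned = sum(
        weight
        for step_id, weight in weights.items()
        if statuses.get(step_id) == "completed"
    )
    return round(earned / total * 100, 1)


# Weights copied from the admin flow defined earlier.
admin_weights = {
    "create_workspace": 2.0,
    "invite_team": 1.5,
    "connect_integration": 2.0,
    "create_first_workflow": 3.0,
}
statuses = {
    "create_workspace": "completed",
    "invite_team": "available",
    "connect_integration": "completed",
    "create_first_workflow": "available",
}
score = activation_score(statuses, admin_weights)  # 47.1
```

With two of four steps done, the unweighted progress_pct would report 50.0, while the weighted score is 47.1 because the heaviest step (the first workflow) is still open.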

Contextual Help with LLM

When a user asks for help on a specific step, the agent provides targeted guidance based on the step's help content, the user's role, and their company type.

from openai import AsyncOpenAI

client = AsyncOpenAI()

HELP_PROMPT = """You are an onboarding assistant for a SaaS product.

The user is a {role} at a {company_type} company.
They are currently on this onboarding step:

Step: {step_title}
Description: {step_description}
Help content: {help_content}

Their question: {question}

Provide a clear, concise answer. If the question is about how to
complete this step, give specific step-by-step instructions.
Keep your response under 150 words.
"""


async def get_contextual_help(
    state: UserOnboardingState,
    current_step_id: str,
    question: str,
) -> str:
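    # Use the same fallback as initialize_user so unknown roles resolve to a flow.
    flow = ONBOARDING_FLOWS.get(state.user_role, ONBOARDING_FLOWS["admin"])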
    step = next((s for s in flow if s.id == current_step_id), None)
    if not step:
        return "I could not find that step. Please try again."

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": HELP_PROMPT.format(
                role=state.user_role,
                company_type=state.company_type,
                step_title=step.title,
                step_description=step.description,
                help_content=step.help_content,
                question=question,
            ),
        }],
    )
    return response.choices[0].message.content
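
The function above places the instructions and the user's question in one message. A common alternative is to keep the instructions in a system message and pass the question, which is untrusted input, as a separate user message. The sketch below shows only the message construction, with no API call; `build_help_messages` and `MAX_QUESTION_CHARS` are hypothetical names introduced for illustration.

```python
MAX_QUESTION_CHARS = 1000  # hypothetical limit, tune for your product


def build_help_messages(
    role: str,
    company_type: str,
    step_title: str,
    step_description: str,
    help_content: str,
    question: str,
) -> list[dict[str, str]]:
    """Build chat messages with instructions and user input kept apart."""
    system = (
        "You are an onboarding assistant for a SaaS product.\n"
        f"The user is a {role} at a {company_type} company.\n"
        f"Current step: {step_title}\n"
        f"Description: {step_description}\n"
        f"Help content: {help_content}\n"
        "Provide a clear, concise answer in under 150 words."
    )
    # The question goes in its own user message and is truncated,
    # rather than being interpolated into the instructions.
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question.strip()[:MAX_QUESTION_CHARS]},
    ]


messages = build_help_messages(
    role="admin",
    company_type="fintech",
    step_title="Invite Team Members",
    step_description="Add at least one team member to your workspace",
    help_content="Invite teammates from the workspace settings page.",
    question="  How do I add a teammate?  ",
)
```

The resulting list can be passed as the `messages` argument of the `client.chat.completions.create` call shown above.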

Activation Metrics and Intervention

Track activation rates and identify users who are stalling. The agent can proactively reach out with targeted nudges.

import json
from datetime import datetime, timedelta


async def find_stalled_users(
    pool, hours_threshold: int = 24
) -> list[dict]:
    cutoff = datetime.utcnow() - timedelta(hours=hours_threshold)
    rows = await pool.fetch(
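        # started_at and completed_at are stored inside state_json by
        # _save_state, so read them from the JSON, not from table columns.
        """SELECT user_id, state_json, updated_at
           FROM onboarding_states
           WHERE (state_json::jsonb ->> 'completed_at') IS NULL
             AND updated_at < $1
             AND (state_json::jsonb ->> 'started_at')::timestamp > $2""",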
        cutoff,
        datetime.utcnow() - timedelta(days=7),  # only recent signups
    )
    stalled = []
    for row in rows:
        data = json.loads(row["state_json"])
        stalled.append({
            "user_id": row["user_id"],
            "progress": data,
            "stalled_hours": (
                datetime.utcnow() - row["updated_at"]
            ).total_seconds() / 3600,
        })
    return stalled


async def generate_nudge(user_data: dict) -> str:
    """Generate a personalized nudge message for a stalled user."""
    steps = user_data["progress"]["steps"]
    # The first unlocked but unfinished step is where the user stopped.
    next_step = next(
        (k for k, v in steps.items() if v in ("available", "in_progress")),
        None,
    )
    # Fall back to a generic label so the prompt never contains 'None'.
    if next_step is None:
        next_step = "next"
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                f"Write a short, friendly nudge email for a user stuck on "
                f"the '{next_step}' onboarding step. They have been inactive "
                f"for about {user_data['stalled_hours']:.0f} hours. "
                f"Keep it under 80 words. Include a direct link placeholder."
            ),
        }],
    )
    return response.choices[0].message.content

FAQ

How do I decide which onboarding steps are required vs optional?

Analyze your activation data. Identify the steps that correlate most strongly with 30-day retention and mark those as required. Steps that improve the experience but do not significantly impact retention should be optional. Assign higher activation weights to the steps that are strongest retention predictors.
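
As a rough illustration of that analysis, the sketch below compares 30-day retention between users who completed a step and users who did not. The `retention_lift` helper, the record fields `completed_steps` and `retained_30d`, and the four sample users are all hypothetical toy inputs, not real data. A large difference suggests a candidate for a required step, though it shows correlation only.

```python
def retention_lift(users: list[dict], step_id: str) -> float:
    """Retention rate of completers minus retention rate of non-completers."""
    done = [u for u in users if step_id in u["completed_steps"]]
    not_done = [u for u in users if step_id not in u["completed_steps"]]
    if not done or not not_done:
        return 0.0  # no comparison possible without both groups

    def rate(group: list[dict]) -> float:
        return sum(1 for u in group if u["retained_30d"]) / len(group)

    return round(rate(done) - rate(not_done), 3)


# Toy records for illustration only.
sample_users = [
    {"completed_steps": {"connect_integration", "invite_team"}, "retained_30d": True},
    {"completed_steps": {"connect_integration"}, "retained_30d": True},
    {"completed_steps": {"invite_team"}, "retained_30d": False},
    {"completed_steps": set(), "retained_30d": False},
]
integration_lift = retention_lift(sample_users, "connect_integration")  # 1.0
invite_lift = retention_lift(sample_users, "invite_team")  # 0.0
```

In practice you would run this over a large enough cohort per step, then use the ranking to set `required` and `activation_weight`.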

Should the onboarding agent replace human customer success managers?

No. The agent handles the standard onboarding path and self-serve users, which frees CSMs to focus on high-value accounts and complex setups. Set triggers that escalate to a human CSM when the agent detects repeated failures on the same step or when the account's ARR exceeds a threshold.
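
The two escalation triggers mentioned here can be expressed as a small rule. This is a sketch only: `EscalationPolicy`, `escalation_reason`, and both default thresholds are hypothetical, and it assumes you already count failed attempts per step and know the account's ARR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EscalationPolicy:
    max_failures_per_step: int = 3   # hypothetical threshold
    arr_threshold: float = 50_000.0  # hypothetical threshold


def escalation_reason(
    failures_by_step: dict[str, int],
    account_arr: float,
    policy: Optional[EscalationPolicy] = None,
) -> Optional[str]:
    """Return why a human CSM should take over, or None to stay automated."""
    policy = policy or EscalationPolicy()
    # High-value accounts go to a human regardless of progress.
    if account_arr > policy.arr_threshold:
        return "high_value_account"
    # Repeated failures on one step suggest the agent's help is not working.
    for step_id, failures in failures_by_step.items():
        if failures >= policy.max_failures_per_step:
            return f"repeated_failures:{step_id}"
    return None


reason = escalation_reason({"connect_integration": 3}, account_arr=12_000.0)
```

Returning a reason string rather than a boolean lets the handoff message tell the CSM why the account was routed to them.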

How do I personalize onboarding for different user roles?

Define a separate onboarding flow per role in the ONBOARDING_FLOWS dictionary (the example above defines only the admin flow). During signup, capture the user's role and route them to the matching flow. Each flow should prioritize the features most relevant to that role: admins need workspace setup, developers need API keys, and marketers need campaign tools.

