Analytics Dashboard for Agent Platform Users: Usage, Performance, and ROI Metrics

Dashboards That Drive Retention

Analytics dashboards are not just features; they are retention tools. When a customer can see that their AI agent handled 2,400 conversations last month with a 94% resolution rate and saved an estimated $18,000 in support costs, cancelling becomes hard to justify. Conversely, a customer who cannot measure the value of their agent is likely to churn at the first budget review.

The key is to surface metrics that answer the question every stakeholder asks: "Is this working?" For a support team lead, "working" means fewer tickets reaching humans. For a CFO, "working" means cost savings. Your dashboard must serve both audiences.

Metric Taxonomy

Organize metrics into four categories that map to different stakeholder concerns:

# metrics.py: core metric definitions for agent analytics
from dataclasses import dataclass
from enum import Enum

class MetricCategory(str, Enum):
    USAGE = "usage"
    PERFORMANCE = "performance"
    QUALITY = "quality"
    BUSINESS = "business"

@dataclass
class MetricDefinition:
    key: str
    label: str
    category: MetricCategory
    description: str
    unit: str
    aggregation: str  # "sum", "avg", "p95", "count", "rate"
    higher_is_better: bool

METRICS = [
    MetricDefinition("total_conversations", "Total Conversations", MetricCategory.USAGE,
                     "Number of conversations started", "count", "sum", True),
    MetricDefinition("active_agents", "Active Agents", MetricCategory.USAGE,
                     "Agents that had at least one conversation", "count", "count", True),
    MetricDefinition("avg_response_time", "Avg Response Time", MetricCategory.PERFORMANCE,
                     "Average time from user message to agent response", "ms", "avg", False),
    MetricDefinition("p95_response_time", "P95 Response Time", MetricCategory.PERFORMANCE,
                     "95th percentile response latency", "ms", "p95", False),
    MetricDefinition("resolution_rate", "Resolution Rate", MetricCategory.QUALITY,
                     "Percentage of conversations resolved without human escalation", "%", "rate", True),
    MetricDefinition("avg_satisfaction", "Avg Satisfaction", MetricCategory.QUALITY,
                     "Average user satisfaction score (1-5)", "score", "avg", True),
    MetricDefinition("estimated_savings", "Estimated Cost Savings", MetricCategory.BUSINESS,
                     "Money saved vs manual handling at configured cost per interaction", "$", "sum", True),
    MetricDefinition("cost_per_resolution", "Cost per Resolution", MetricCategory.BUSINESS,
                     "Average LLM + infrastructure cost per resolved conversation", "$", "avg", False),
]
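
The higher_is_better flag is what lets a dashboard label a period-over-period change as good or bad without special-casing each metric. A minimal sketch of that idea follows; the `classify_change` helper and the simplified `HIGHER_IS_BETTER` lookup are hypothetical and not part of the original module.

```python
from typing import Optional

# Hypothetical subset of the METRICS table: metric key -> higher_is_better
HIGHER_IS_BETTER = {
    "total_conversations": True,
    "avg_response_time": False,
    "resolution_rate": True,
    "cost_per_resolution": False,
}

def classify_change(key: str, current: float, previous: Optional[float]) -> dict:
    """Compare two periods and label the change as good, bad, or flat."""
    if previous is None or previous == 0:
        # No usable baseline, so no percentage change can be computed
        return {"delta_pct": None, "trend": "n/a"}
    delta_pct = round((current - previous) / previous * 100, 1)
    if delta_pct == 0:
        trend = "flat"
    else:
        # An increase is good only when higher values are better, and vice versa
        improved = (delta_pct > 0) == HIGHER_IS_BETTER[key]
        trend = "good" if improved else "bad"
    return {"delta_pct": delta_pct, "trend": trend}
```

With this shape, a drop in avg_response_time is reported as "good" while the same drop in resolution_rate is reported as "bad", so the front end only needs to map the trend to a color.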

Metric Calculation Engine

The calculation engine queries raw event data and produces aggregated metrics for any time range:

# metric_engine.py — Analytics computation engine
from datetime import datetime
from typing import Optional
import uuid

class MetricEngine:
    def __init__(self, db, usage_store):
        self.db = db
        self.usage_store = usage_store

    async def compute_dashboard(
        self,
        tenant_id: uuid.UUID,
        start: datetime,
        end: datetime,
        agent_id: Optional[uuid.UUID] = None,
    ) -> dict:
        filters = {"tenant_id": tenant_id, "start": start, "end": end}
        agent_clause = ""
        if agent_id:
            filters["agent_id"] = agent_id
            agent_clause = "AND agent_id = :agent_id"

        # Usage metrics
        usage = await self.db.fetch_one(f"""
            SELECT
                COUNT(*) as total_conversations,
                COUNT(DISTINCT agent_id) as active_agents,
                COUNT(DISTINCT DATE(created_at)) as active_days
            FROM conversations
            WHERE tenant_id = :tenant_id
              AND created_at BETWEEN :start AND :end
              {agent_clause}
        """, filters)

        # Performance metrics
        perf = await self.db.fetch_one(f"""
            SELECT
                AVG(response_time_ms) as avg_response_time,
                PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY response_time_ms) as p95_response_time
            FROM conversation_messages
            WHERE tenant_id = :tenant_id
              AND role = 'assistant'
              AND created_at BETWEEN :start AND :end
              {agent_clause}
        """, filters)

        # Quality metrics
        quality = await self.db.fetch_one(f"""
            SELECT
                AVG(CASE WHEN escalated = false THEN 1.0 ELSE 0.0 END) * 100 as resolution_rate,
                AVG(satisfaction_score) as avg_satisfaction
            FROM conversations
            WHERE tenant_id = :tenant_id
              AND created_at BETWEEN :start AND :end
              AND status = 'completed'
              {agent_clause}
        """, filters)

        # Business metrics
        # get_total_cost appears to return micro-dollars (millionths of a USD),
        # hence the division by 1_000_000. Treat a missing value as zero.
        total_cost = await self.usage_store.get_total_cost(
            tenant_id, start, end, agent_id
        )
        total_cost_usd = (total_cost or 0) / 1_000_000
        resolved_count = await self.db.fetch_val(f"""
            SELECT COUNT(*) FROM conversations
            WHERE tenant_id = :tenant_id
              AND created_at BETWEEN :start AND :end
              AND escalated = false AND status = 'completed'
              {agent_clause}
        """, filters) or 0

        cost_per_human = 8.50  # USD per manually handled conversation; configurable per tenant
        estimated_savings = resolved_count * cost_per_human - total_cost_usd

        return {
            "period": {"start": start.isoformat(), "end": end.isoformat()},
            "usage": dict(usage) if usage else {},
            "performance": {
                "avg_response_time_ms": round(perf["avg_response_time"] or 0, 1),
                "p95_response_time_ms": round(perf["p95_response_time"] or 0, 1),
            },
            "quality": {
                "resolution_rate": round(quality["resolution_rate"] or 0, 1),
                "avg_satisfaction": round(quality["avg_satisfaction"] or 0, 2),
            },
            "business": {
                "total_cost_usd": round(total_cost_usd, 2),
                # max(..., 1) avoids division by zero when nothing was resolved
                "cost_per_resolution_usd": round(
                    total_cost_usd / max(resolved_count, 1), 2
                ),
                "estimated_savings_usd": round(estimated_savings, 2),
            },
        }

Time-Series Data for Charts

Dashboards need charts, and charts need time-series data. The engine provides bucketed data for any metric:

# time_series.py — Time-series metric aggregation
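class TimeSeriesEngine:
    # Whitelisted SQL expressions; bucket names never reach the query directly
    BUCKET_SIZES = {
        "hour": "date_trunc('hour', created_at)",
        "day": "date_trunc('day', created_at)",
        "week": "date_trunc('week', created_at)",
        "month": "date_trunc('month', created_at)",
    }

    def __init__(self, db):
        self.db = db  # async database handle, as used by MetricEngine
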
    async def get_series(
        self, tenant_id, metric_key, start, end, bucket="day", agent_id=None
    ):
        bucket_expr = self.BUCKET_SIZES.get(bucket, self.BUCKET_SIZES["day"])
        agent_clause = "AND agent_id = :agent_id" if agent_id else ""
        params = {"tenant_id": tenant_id, "start": start, "end": end}
        if agent_id:
            params["agent_id"] = agent_id

        if metric_key == "total_conversations":
            query = f"""
                SELECT {bucket_expr} as bucket, COUNT(*) as value
                FROM conversations
                WHERE tenant_id = :tenant_id
                  AND created_at BETWEEN :start AND :end {agent_clause}
                GROUP BY bucket ORDER BY bucket
            """
        elif metric_key == "resolution_rate":
            query = f"""
                SELECT {bucket_expr} as bucket,
                       AVG(CASE WHEN escalated = false THEN 100.0 ELSE 0.0 END) as value
                FROM conversations
                WHERE tenant_id = :tenant_id
                  AND created_at BETWEEN :start AND :end
                  AND status = 'completed' {agent_clause}
                GROUP BY bucket ORDER BY bucket
            """
        else:
            raise ValueError(f"Unsupported metric for time series: {metric_key}")

        rows = await self.db.fetch_all(query, params)
        return [{"timestamp": row["bucket"].isoformat(), "value": round(row["value"], 2)} for row in rows]

Dashboard API Endpoint

Expose a single endpoint that returns the complete dashboard payload:

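# dashboard_routes.py
# resolve_tenant, metric_engine, and time_series_engine are assumed to be
# defined elsewhere in the application.
from datetime import datetime, timedelta
from typing import Optional
import uuid

from fastapi import APIRouter, Depends, HTTPException, Query

router = APIRouter(prefix="/v1/analytics")

@router.get("/dashboard")
async def get_dashboard(
    agent_id: Optional[uuid.UUID] = Query(None),
    period: str = Query("30d"),  # "7d", "30d", "90d", "custom"
    start: Optional[datetime] = Query(None),
    end: Optional[datetime] = Query(None),
    tenant=Depends(resolve_tenant),
):
    if period == "custom":
        # Custom ranges must supply both bounds explicitly
        if start is None or end is None:
            raise HTTPException(status_code=400, detail="custom period requires start and end")
    elif period in ("7d", "30d", "90d"):
        # Naive UTC, as in the original; utcnow() is deprecated since Python 3.12,
        # so switch to datetime.now(timezone.utc) if created_at is timezone-aware
        end = datetime.utcnow()
        start = end - timedelta(days=int(period[:-1]))
    else:
        raise HTTPException(status_code=400, detail=f"Unsupported period: {period}")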

    dashboard = await metric_engine.compute_dashboard(
        tenant_id=tenant["id"], start=start, end=end, agent_id=agent_id,
    )

    # Add time series for key metrics
    dashboard["series"] = {}
    bucket = "hour" if (end - start).days <= 2 else "day"
    for key in ["total_conversations", "resolution_rate"]:
        dashboard["series"][key] = await time_series_engine.get_series(
            tenant["id"], key, start, end, bucket=bucket, agent_id=agent_id,
        )

    return dashboard

FAQ

How do I calculate ROI when every customer values agent output differently?

Let customers configure their own "cost per manual interaction" value in their account settings. Default to industry benchmarks — $8-12 for support, $25-50 for sales qualification, $15-20 for IT helpdesk. The ROI formula becomes: (resolved_conversations * cost_per_manual) minus (total_platform_cost). Customers who set their own values trust the numbers more.
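
As a worked example of that formula, here is a minimal sketch. The `estimate_roi` function and its parameter names are hypothetical, the defaults are simply the low end of the benchmark ranges quoted above, and the `roi_pct` field is an extra convenience that the formula itself does not include.

```python
from typing import Optional

# Conservative defaults: the low end of the benchmark ranges quoted above (USD)
DEFAULT_COST_PER_MANUAL_USD = {
    "support": 8.00,
    "sales_qualification": 25.00,
    "it_helpdesk": 15.00,
}

def estimate_roi(
    resolved_conversations: int,
    total_platform_cost_usd: float,
    use_case: str = "support",
    tenant_cost_per_manual: Optional[float] = None,
) -> dict:
    """Apply (resolved_conversations * cost_per_manual) - total_platform_cost."""
    # A tenant-configured value always wins over the benchmark default
    if tenant_cost_per_manual is not None:
        cost_per_manual = tenant_cost_per_manual
    else:
        cost_per_manual = DEFAULT_COST_PER_MANUAL_USD[use_case]

    manual_cost_avoided = resolved_conversations * cost_per_manual
    net_savings = manual_cost_avoided - total_platform_cost_usd
    roi_pct = None
    if total_platform_cost_usd > 0:
        roi_pct = round(net_savings / total_platform_cost_usd * 100, 1)

    return {
        "cost_per_manual_usd": cost_per_manual,
        "manual_cost_avoided_usd": round(manual_cost_avoided, 2),
        "net_savings_usd": round(net_savings, 2),
        "roi_pct": roi_pct,
    }
```

For instance, 2,256 resolved conversations at a tenant-configured $8.50 each, against a platform cost of $1,500, gives $19,176 of manual cost avoided and $17,676 in net savings. Returning the `cost_per_manual_usd` that was actually used makes it easy to show the assumption next to the number in the UI.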

Should I pre-compute metrics or calculate them on demand?

Use a hybrid approach. Pre-compute daily aggregates in a nightly batch job and store them in a metrics table. For the current day and for custom time ranges, compute on demand. This gives you fast dashboard loads for standard views while supporting arbitrary ad-hoc queries. Cache the on-demand results for 5 minutes.
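
The 5-minute cache for on-demand results can be sketched as a small in-process TTL cache. This is an illustrative sketch only: the `TTLCache` class is hypothetical, it is synchronous for brevity, and a multi-instance deployment would more likely use a shared cache such as Redis.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple

class TTLCache:
    """Caches computed values for ttl_seconds, then recomputes on next access."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = compute()
        self._entries[key] = (now, value)
        return value
```

A cache key would typically combine everything that changes the result, for example `(tenant_id, agent_id, start, end)`, so that one tenant's dashboard is never served to another.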

How do I measure conversation quality beyond resolution rate?

Implement an automated quality scoring pipeline. After each completed conversation, run the transcript through a separate LLM call that scores it on accuracy, helpfulness, tone, and completeness on a 1-5 scale. Store these scores and surface them as quality metrics. This covers every conversation, unlike user-submitted satisfaction ratings, which have low response rates.
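
Because the scorer is itself an LLM, its output should be validated before it is stored. The sketch below covers only that validation step and assumes the scoring call has been prompted to reply with a JSON object keyed by the four dimensions. The `parse_quality_scores` function and the JSON shape are hypothetical, and the LLM call itself is omitted.

```python
import json

DIMENSIONS = ("accuracy", "helpfulness", "tone", "completeness")

def parse_quality_scores(raw: str) -> dict:
    """Validate a scorer's JSON reply and add an overall mean score."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("scorer reply must be a JSON object")

    scores = {}
    for dim in DIMENSIONS:
        value = data.get(dim)
        # Reject missing, non-numeric, boolean, or out-of-range values
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 1 <= value <= 5:
            raise ValueError(f"invalid score for {dim}: {value!r}")
        scores[dim] = float(value)

    # Unweighted mean across the four dimensions
    scores["overall"] = round(sum(scores.values()) / len(DIMENSIONS), 2)
    return scores
```

Rejecting a malformed reply, and retrying or skipping that conversation, keeps a single bad generation from skewing the averages shown on the dashboard.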
