Rich Chat Responses: Cards, Buttons, Carousels, and Interactive Elements

Beyond Plain Text

A chat agent that only returns plain text is like a web application that renders HTML without CSS or JavaScript. Users expect visual structure: product cards with images, clickable buttons that trigger actions, quick reply chips that guide the conversation, and carousels they can swipe through. Rich responses can reduce cognitive load and input errors, and they tend to improve conversion in sales and support scenarios.

The key architectural insight is that your agent's output is not just a string. It is a structured message object that the frontend interprets and renders differently based on its type.

Defining Message Types

Start with a clear type system for your messages. This is the contract between your backend and frontend:

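// "button_group" and "media" are listed here, but this article does not
// define interfaces for them, so they are not part of ChatMessage below.
type MessageType =
  | "text"
  | "card"
  | "button_group"
  | "carousel"
  | "quick_replies"
  | "form"
  | "media";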

interface TextMessage {
  type: "text";
  content: string;
}

interface CardMessage {
  type: "card";
  title: string;
  subtitle?: string;
  imageUrl?: string;
  body: string;
  actions: ActionButton[];
}

interface ActionButton {
  label: string;
  action: "link" | "postback" | "call";
  value: string;
}

interface CarouselMessage {
  type: "carousel";
  cards: CardMessage[];
}

interface QuickRepliesMessage {
  type: "quick_replies";
  text: string;
  replies: Array<{ label: string; value: string }>;
}

interface FormMessage {
  type: "form";
  title: string;
  fields: FormField[];
  submitLabel: string;
  submitAction: string;
}

interface FormField {
  name: string;
  label: string;
  type: "text" | "email" | "phone" | "select" | "date";
  required: boolean;
  options?: string[];
}

type ChatMessage =
  | TextMessage
  | CardMessage
  | CarouselMessage
  | QuickRepliesMessage
  | FormMessage;
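
Because this type system is a contract, it can help to check outgoing messages against it on the backend before they reach the frontend. The following is a minimal sketch in plain Python, not part of the original design: REQUIRED_FIELDS and validate_message are hypothetical names, and the field lists simply mirror the required fields of the TypeScript interfaces above.

```python
from typing import Any

# Required keys per message type, mirroring the TypeScript interfaces.
# Optional fields (subtitle, imageUrl) are intentionally left out.
REQUIRED_FIELDS: dict[str, set[str]] = {
    "text": {"content"},
    "card": {"title", "body", "actions"},
    "carousel": {"cards"},
    "quick_replies": {"text", "replies"},
    "form": {"title", "fields", "submitLabel", "submitAction"},
}


def validate_message(message: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the message looks valid."""
    msg_type = message.get("type")
    if msg_type not in REQUIRED_FIELDS:
        return [f"unknown message type: {msg_type!r}"]

    # Iterating a dict yields its keys, so this is "required minus present".
    missing = REQUIRED_FIELDS[msg_type].difference(message)
    problems = [f"{msg_type}: missing field '{name}'" for name in sorted(missing)]

    # A carousel is only valid if every nested card is valid too.
    if msg_type == "carousel" and "cards" in message:
        for index, card in enumerate(message["cards"]):
            if card.get("type") != "card":
                problems.append(f"carousel: cards[{index}] is not a card")
            else:
                problems.extend(validate_message(card))
    return problems
```

A check like this only covers field presence, not field types. It is meant as a cheap guard in tests or logging rather than a replacement for a schema library.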

Backend: Generating Rich Responses

Your AI agent outputs structured JSON instead of raw text. Use tool calls to let the LLM decide when to show rich elements:

from typing import Literal

from pydantic import BaseModel, Field

class ActionButton(BaseModel):
    label: str
    action: Literal["link", "postback", "call"]
    value: str

class CardResponse(BaseModel):
    type: Literal["card"] = "card"
    title: str
    subtitle: str | None = None
    # Serialized as "imageUrl" so the payload matches the frontend CardMessage type.
    image_url: str | None = Field(default=None, serialization_alias="imageUrl")
    body: str
    actions: list[ActionButton]

class QuickRepliesResponse(BaseModel):
    type: Literal["quick_replies"] = "quick_replies"
    text: str
    replies: list[dict[str, str]]  # each item: {"label": ..., "value": ...}

def build_product_card(product: dict) -> dict:
    """Build a card dict for one product; by_alias emits the camelCase key."""
    return CardResponse(
        title=product["name"],
        subtitle=f"${product['price']}/mo",
        image_url=product.get("image_url"),
        body=product["description"],
        actions=[
            ActionButton(label="Learn More", action="link", value=product["url"]),
            ActionButton(label="Start Trial", action="postback", value=f"start_trial:{product['id']}"),
            ActionButton(label="Talk to Sales", action="postback", value="request_demo"),
        ],
    ).model_dump(by_alias=True)

def build_quick_replies(text: str, options: list[str]) -> dict:
    """Turn option labels into quick replies with snake_case postback values."""
    return QuickRepliesResponse(
        text=text,
        replies=[{"label": opt, "value": opt.lower().replace(" ", "_")} for opt in options],
    ).model_dump()

Integrate this into your agent's tool definitions so the LLM can choose to display a card when discussing a product or show quick replies when asking a clarifying question:

from agents import Agent, function_tool

@function_tool
def show_pricing_plans() -> dict:
    """Display available pricing plans as cards."""
    plans = [
        {"name": "Starter", "price": 29, "description": "Up to 500 conversations/mo",
         "url": "/pricing#starter", "id": "starter", "image_url": "/img/starter.png"},
        {"name": "Pro", "price": 99, "description": "Unlimited conversations, analytics",
         "url": "/pricing#pro", "id": "pro", "image_url": "/img/pro.png"},
    ]
    return {
        "type": "carousel",
        "cards": [build_product_card(p) for p in plans],
    }

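@function_tool
def ask_department() -> dict:
    """Ask the user which department they need help with."""
    return build_quick_replies(
        "Which department can I connect you with?",
        ["Sales", "Technical Support", "Billing", "General Inquiry"],
    )

# Register both tools; the name and instructions here are illustrative.
agent = Agent(
    name="Support Assistant",
    instructions="Use the tools to show pricing cards or ask which department is needed.",
    tools=[show_pricing_plans, ask_department],
)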

Frontend: Rendering Rich Messages

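The React component switches on the message's type field. Because ChatMessage is a discriminated union, TypeScript narrows the message to the matching interface inside each case. The handleAction and sendPostback helpers are assumed to be defined elsewhere in the app, and the form type is not rendered in this example, so it falls through to the default case: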

function MessageRenderer({ message }: { message: ChatMessage }) {
  switch (message.type) {
    case "text":
      return <p className="msg-text">{message.content}</p>;

    case "card":
      return (
        <div className="msg-card">
          {message.imageUrl && <img src={message.imageUrl} alt={message.title} />}
          <h3>{message.title}</h3>
          {message.subtitle && <p className="subtitle">{message.subtitle}</p>}
          <p>{message.body}</p>
          <div className="card-actions">
            {message.actions.map((btn, i) => (
              <button key={i} onClick={() => handleAction(btn)}>
                {btn.label}
              </button>
            ))}
          </div>
        </div>
      );

    case "carousel":
      return (
        <div className="msg-carousel">
          {message.cards.map((card, i) => (
            <MessageRenderer key={i} message={card} />
          ))}
        </div>
      );

    case "quick_replies":
      return (
        <div className="msg-quick-replies">
          <p>{message.text}</p>
          <div className="reply-chips">
            {message.replies.map((r, i) => (
              <button key={i} className="chip"
                onClick={() => sendPostback(r.value)}>
                {r.label}
              </button>
            ))}
          </div>
        </div>
      );

    default:
      return null;
  }
}

Handling Postback Actions

When a user clicks a button or quick reply chip, the frontend sends a postback event instead of a text message. The backend routes these to specific handlers:

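async def handle_postback(session_id: str, value: str):
    """Route a postback value to its handler and return the next message."""
    # initiate_trial and escalate_to_department are application-specific
    # coroutines that are not shown in this article.
    if value.startswith("start_trial:"):
        # maxsplit=1 keeps any extra ":" inside the plan id intact
        plan_id = value.split(":", 1)[1]
        return await initiate_trial(session_id, plan_id)
    elif value == "request_demo":
        return build_quick_replies(
            "When would you like to schedule the demo?",
            ["Today", "Tomorrow", "This Week", "Next Week"],
        )
    elif value in ("sales", "technical_support", "billing"):
        return await escalate_to_department(session_id, value)
    else:
        # Unrecognized values fall back to a plain text acknowledgement
        return {"type": "text", "content": f"Processing your request: {value}"}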
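
As the number of postback values grows, the if/elif chain above can be swapped for a small registry. The sketch below is one possible arrangement, assuming postback values follow the name or name:argument shape used in this article. PostbackRouter and the two demo handlers are hypothetical, and they are synchronous only to keep the example short.

```python
from typing import Callable

Handler = Callable[[str, str], dict]  # (session_id, argument) -> message dict


class PostbackRouter:
    """Maps the part of a postback value before ':' to a handler."""

    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def register(self, name: str) -> Callable[[Handler], Handler]:
        def decorator(func: Handler) -> Handler:
            self._handlers[name] = func
            return func
        return decorator

    def dispatch(self, session_id: str, value: str) -> dict:
        # partition keeps any further ':' characters inside the argument.
        name, _, argument = value.partition(":")
        handler = self._handlers.get(name)
        if handler is None:
            # Unknown values get a safe text fallback instead of raising.
            return {"type": "text", "content": "Sorry, that option is no longer available."}
        return handler(session_id, argument)


router = PostbackRouter()


@router.register("start_trial")
def start_trial(session_id: str, plan_id: str) -> dict:
    # Placeholder for the real trial logic (initiate_trial in the article).
    return {"type": "text", "content": f"Starting your {plan_id} trial."}


@router.register("request_demo")
def request_demo(session_id: str, _argument: str) -> dict:
    return {
        "type": "quick_replies",
        "text": "When would you like to schedule the demo?",
        "replies": [{"label": "Today", "value": "today"},
                    {"label": "Tomorrow", "value": "tomorrow"}],
    }
```

Calling router.dispatch(session_id, "start_trial:pro") would then reach start_trial with "pro" as the plan id, and adding a new button only requires registering one more handler.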

FAQ

How do I handle rich messages in channels that only support plain text like SMS?

Build a message serializer per channel. For SMS, flatten a card into text: "Starter Plan - $29/mo - Up to 500 conversations. Reply 1 for Learn More, 2 for Start Trial." Store the mapping between reply numbers and actions server-side so you can interpret "1" as the correct postback. This channel abstraction layer is critical for multi-channel agents.

Should the LLM decide when to show rich elements, or should I use rules?

Use a hybrid approach. Define tools that return rich messages and let the LLM call them based on conversation context. But also add rule-based triggers: if the user asks about pricing, always show the pricing carousel regardless of what the LLM decides. Rules guarantee consistency for critical flows; LLM flexibility handles the long tail.
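
One way to sketch the hybrid approach is a thin layer that checks deterministic rules before the LLM is consulted. Everything named here is an assumption for illustration: the RULES table, the rule decorator, the keyword pattern, and the llm_fallback callable that stands in for the agent run.

```python
import re
from typing import Callable

# Hypothetical rule table: compiled pattern -> function that builds the rich message.
RULES: list[tuple[re.Pattern[str], Callable[[], dict]]] = []


def rule(pattern: str) -> Callable[[Callable[[], dict]], Callable[[], dict]]:
    """Register a message builder that fires when the pattern matches user text."""
    def decorator(builder: Callable[[], dict]) -> Callable[[], dict]:
        RULES.append((re.compile(pattern, re.IGNORECASE), builder))
        return builder
    return decorator


@rule(r"\b(pricing|price|plans?|cost)\b")
def pricing_carousel() -> dict:
    # Stand-in for show_pricing_plans() from the backend section.
    return {"type": "carousel", "cards": []}


def respond(user_text: str, llm_fallback: Callable[[str], dict]) -> dict:
    """Apply deterministic rules first; defer to the LLM for everything else."""
    for pattern, builder in RULES:
        if pattern.search(user_text):
            return builder()
    return llm_fallback(user_text)
```

Keyword matching is deliberately crude. It trades some false positives for a guarantee that critical flows always render the same way, while everything that matches no rule still goes through the LLM.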

How do I make carousels accessible?

Ensure keyboard navigation works: users should be able to tab through cards and activate buttons with Enter. Give the carousel container role="region" and an aria-label such as "Product options". Each card should have role="group" with a descriptive label. Screen readers should announce the card position ("Card 1 of 3") as the user navigates.
