
Streaming Text Display in React: Typewriter Effect for AI Agent Responses

Implement token-by-token streaming display for AI agent responses using Server-Sent Events, React state, and cursor animation. Includes markdown rendering during streaming.

Why Streaming Matters for Agent UX

When an AI agent takes 3-8 seconds to generate a full response, showing a blank loading spinner creates anxiety. Streaming tokens as they arrive gives users immediate feedback and makes the agent feel responsive. This pattern — used by ChatGPT, Claude, and every major AI interface — is achieved through Server-Sent Events (SSE) on the backend and incremental state updates on the frontend.

Setting Up the SSE Consumer

The browser EventSource API is simple but limited. It only supports GET requests and cannot send custom headers. For agent APIs that require POST bodies and authentication headers, use the Fetch API with a readable stream instead.
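For context, each event is assumed to arrive in standard SSE framing, with a JSON payload carrying a `token` field (this article's convention; real backends vary in payload shape):

```
data: {"token": "Hel"}

data: {"token": "lo"}

data: [DONE]
```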

async function* streamAgentResponse(
  message: string,
  signal: AbortSignal
): AsyncGenerator<string> {
  const response = await fetch("/api/agent/chat", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${getToken()}`,
    },
    body: JSON.stringify({ message }),
    signal,
  });

  if (!response.ok) {
    throw new Error(`Agent error: ${response.status}`);
  }
  if (!response.body) {
    throw new Error("Agent response has no body");
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  // SSE events can be split across network reads, so carry any
  // partial line over to the next chunk instead of parsing it early.
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    // The last element may be an incomplete line; keep it for the next read.
    buffer = lines.pop() ?? "";

    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = line.slice(6);
        if (data === "[DONE]") return;
        const parsed = JSON.parse(data);
        if (parsed.token) {
          yield parsed.token;
        }
      }
    }
  }
}

The async generator pattern is ideal here. It produces tokens lazily, handles back-pressure naturally, and composes cleanly with React hooks.
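Because a `data:` line can be split across two network reads, the line handling is worth unit-testing in isolation. A minimal sketch as a pure, stateful parser (`createSseParser` is an illustrative name, not part of the article's API):

```typescript
// Stateful SSE line parser: feed it raw text chunks, get back complete
// tokens. Any trailing partial line stays buffered until the next chunk.
function createSseParser() {
  let buffer = "";
  return function feed(chunk: string): string[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // last element may be an incomplete line
    const tokens: string[] = [];
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const data = line.slice(6);
      if (data === "[DONE]") break;
      tokens.push(JSON.parse(data).token);
    }
    return tokens;
  };
}
```

Feeding a chunk that ends mid-event yields nothing; the token appears only once its closing newline arrives in a later chunk.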

The Streaming Hook

Wrap the generator in a custom hook that manages accumulated text, streaming state, and cancellation.

import { useState, useRef, useCallback } from "react";

interface StreamState {
  text: string;
  isStreaming: boolean;
  error: string | null;
}

function useStreamingResponse() {
  const [state, setState] = useState<StreamState>({
    text: "",
    isStreaming: false,
    error: null,
  });
  const abortRef = useRef<AbortController | null>(null);

  const startStream = useCallback(async (message: string) => {
    abortRef.current?.abort();
    const controller = new AbortController();
    abortRef.current = controller;

    setState({ text: "", isStreaming: true, error: null });

    try {
      for await (const token of streamAgentResponse(
        message,
        controller.signal
      )) {
        setState((prev) => ({
          ...prev,
          text: prev.text + token,
        }));
      }
      setState((prev) => ({ ...prev, isStreaming: false }));
    } catch (err) {
      if ((err as Error).name !== "AbortError") {
        setState((prev) => ({
          ...prev,
          isStreaming: false,
          error: (err as Error).message,
        }));
      }
    }
  }, []);

  const cancel = useCallback(() => {
    abortRef.current?.abort();
    setState((prev) => ({ ...prev, isStreaming: false }));
  }, []);

  return { ...state, startStream, cancel };
}

Each token appends to the existing text through a state updater function. This avoids stale closure issues that would occur if you read state.text directly inside the loop.
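The same pitfall can be shown framework-free with a tiny stand-in for React state (hypothetical `createTextStore`, for demonstration only):

```typescript
// A minimal stand-in for React state to show why the updater form matters.
function createTextStore() {
  let text = "";
  return {
    get: () => text,
    set: (update: string | ((prev: string) => string)) => {
      text = typeof update === "function" ? update(text) : update;
    },
  };
}

const store = createTextStore();

// Stale-closure style: capture the value once, then append to the copy.
const stale = store.get(); // ""
store.set(stale + "Hel");  // text is now "Hel"
store.set(stale + "lo");   // overwrites: stale is still "", text is "lo"

// Updater style: each call reads the latest value.
store.set("");
store.set((prev) => prev + "Hel");
store.set((prev) => prev + "lo"); // text is now "Hello"
```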

Rendering Streaming Markdown

During streaming, partial markdown tokens arrive that may not form complete syntax. A naive markdown renderer would flicker between valid and invalid states. The solution: render markdown on every update but debounce expensive operations like syntax highlighting.
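A generic debounce helper along those lines (a sketch; plug the expensive highlight call into `fn`):

```typescript
// Hypothetical debounce helper: defer an expensive callback (e.g. syntax
// highlighting) until calls pause for `delayMs` milliseconds.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | null = null;
  return (...args: A) => {
    if (timer !== null) clearTimeout(timer); // reset on every new call
    timer = setTimeout(() => fn(...args), delayMs);
  };
}
```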


import ReactMarkdown from "react-markdown";

interface StreamingMessageProps {
  text: string;
  isStreaming: boolean;
}

function StreamingMessage({ text, isStreaming }: StreamingMessageProps) {
  return (
    <div className="prose prose-sm max-w-none">
      <ReactMarkdown>{text}</ReactMarkdown>
      {isStreaming && <BlinkingCursor />}
    </div>
  );
}

function BlinkingCursor() {
  return (
    <span
      className="inline-block w-2 h-5 bg-gray-800 ml-0.5 animate-pulse"
      aria-hidden="true"
    />
  );
}

The BlinkingCursor component creates the familiar typing indicator. The aria-hidden attribute prevents screen readers from announcing the cursor element.

Batching Token Updates for Performance

Setting state on every single token can cause excessive re-renders. If the backend streams tokens at high speed, batch them using requestAnimationFrame.

import { useCallback, useEffect, useRef } from "react";

function useTokenBatcher(
  onBatch: (tokens: string) => void
) {
  const bufferRef = useRef("");
  const rafRef = useRef<number | null>(null);

  // Cancel any pending frame on unmount so onBatch never fires
  // after the component is gone.
  useEffect(() => {
    return () => {
      if (rafRef.current !== null) cancelAnimationFrame(rafRef.current);
    };
  }, []);

  const addToken = useCallback((token: string) => {
    bufferRef.current += token;

    if (rafRef.current === null) {
      rafRef.current = requestAnimationFrame(() => {
        onBatch(bufferRef.current);
        bufferRef.current = "";
        rafRef.current = null;
      });
    }
  }, [onBatch]);

  return addToken;
}

This coalesces every token that arrives within a single animation frame into one state update. Even if the backend delivers hundreds of tokens per second, the component re-renders at most once per frame (roughly 60 times per second on a typical display), and each render applies several tokens at once.
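The coalescing logic can be sketched framework-free by injecting the scheduler, which also makes it testable outside a browser (names are illustrative):

```typescript
// Batcher with an injectable scheduler: pass requestAnimationFrame in the
// browser, or a manual trigger in tests. Tokens accumulate until the
// scheduled callback fires, then flush as a single batch.
function createBatcher(
  onBatch: (tokens: string) => void,
  schedule: (cb: () => void) => void
) {
  let buffer = "";
  let scheduled = false;
  return (token: string) => {
    buffer += token;
    if (!scheduled) {
      scheduled = true;
      schedule(() => {
        onBatch(buffer);
        buffer = "";
        scheduled = false;
      });
    }
  };
}
```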

Cancellation and Cleanup

Users must be able to stop a running stream. The AbortController pattern handles this cleanly. Wire a stop button to the cancel function from the hook.

function ChatControls({
  isStreaming,
  onCancel,
}: {
  isStreaming: boolean;
  onCancel: () => void;
}) {
  if (!isStreaming) return null;

  return (
    <button
      onClick={onCancel}
      className="flex items-center gap-1.5 rounded-lg border
                 px-3 py-1.5 text-sm hover:bg-gray-50"
    >
      <span className="w-3 h-3 rounded-sm bg-gray-700" />
      Stop generating
    </button>
  );
}

FAQ

How do I handle code blocks that arrive partially during streaming?

Most markdown renderers handle partial code blocks gracefully by treating unclosed fences as plain text until the closing fence arrives. If you see flicker, wrap your markdown component in React.memo and avoid re-parsing the entire string on every token. Libraries like react-markdown handle incremental content well out of the box.
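Another common trick, if flicker persists: temporarily close a dangling fence before handing the text to the renderer. A hypothetical helper (not something react-markdown provides):

```typescript
// If the streamed text contains an odd number of ``` fence markers, a code
// block is still open; append a closing fence so the renderer shows a
// stable code block instead of flickering to plain text.
function closeDanglingFence(text: string): string {
  const fenceCount = (text.match(/^```/gm) ?? []).length;
  return fenceCount % 2 === 1 ? text + "\n```" : text;
}
```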

What is the difference between SSE and WebSockets for streaming?

SSE is unidirectional (server to client), uses plain HTTP, and reconnects automatically. WebSockets are bidirectional and require a persistent connection. For AI agent streaming where the server sends tokens and the client only listens, SSE is simpler and sufficient. Use WebSockets when you need bidirectional communication, such as real-time collaborative editing or push notifications from the agent.

How do I add a copy button for completed responses?

After streaming finishes (isStreaming is false), render a copy button that calls navigator.clipboard.writeText(text). During streaming, hide the copy button to prevent users from copying incomplete content.


#React #Streaming #ServerSentEvents #TypeScript #AIAgentInterface #AgenticAI #LearnAI #AIEngineering


CallSphere Team

Expert insights on AI voice agents and customer communication automation.
