Streaming Text Display in React: Typewriter Effect for AI Agent Responses
Implement token-by-token streaming display for AI agent responses using Server-Sent Events, React state, and cursor animation. Includes markdown rendering during streaming.
Why Streaming Matters for Agent UX
When an AI agent takes 3-8 seconds to generate a full response, showing a blank loading spinner creates anxiety. Streaming tokens as they arrive gives users immediate feedback and makes the agent feel responsive. This pattern — used by ChatGPT, Claude, and every major AI interface — is achieved through Server-Sent Events (SSE) on the backend and incremental state updates on the frontend.
Setting Up the SSE Consumer
The browser EventSource API is simple but limited. It only supports GET requests and cannot send custom headers. For agent APIs that require POST bodies and authentication headers, use the Fetch API with a readable stream instead.
async function* streamAgentResponse(
  message: string,
  signal: AbortSignal
): AsyncGenerator<string> {
  const response = await fetch("/api/agent/chat", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${getToken()}`,
    },
    body: JSON.stringify({ message }),
    signal,
  });

  if (!response.ok) {
    throw new Error(`Agent error: ${response.status}`);
  }

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // A network chunk can end mid-line, so buffer the trailing partial
    // line and only parse complete lines.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";

    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = line.slice(6);
        if (data === "[DONE]") return;
        const parsed = JSON.parse(data);
        if (parsed.token) {
          yield parsed.token;
        }
      }
    }
  }
}
The async generator pattern is ideal here. It produces tokens lazily, handles back-pressure naturally, and composes cleanly with React hooks.
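To see the consumption side in isolation, here is a minimal sketch with a hypothetical mock generator standing in for the network-backed one; collect accumulates tokens the same way the React hook will:

```typescript
// Hypothetical stand-in for streamAgentResponse: yields a few tokens
// in sequence, just as the SSE-backed generator does.
async function* mockAgentTokens(): AsyncGenerator<string> {
  for (const token of ["Hello", ", ", "world"]) {
    yield token;
  }
}

// Append each token as it arrives, exactly like the streaming hook.
async function collect(stream: AsyncGenerator<string>): Promise<string> {
  let text = "";
  for await (const token of stream) {
    text += token;
  }
  return text;
}
```

Swapping the mock for the real generator changes nothing about the consuming code, which is why this pattern composes so cleanly.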
The Streaming Hook
Wrap the generator in a custom hook that manages accumulated text, streaming state, and cancellation.
import { useState, useRef, useCallback } from "react";

interface StreamState {
  text: string;
  isStreaming: boolean;
  error: string | null;
}

function useStreamingResponse() {
  const [state, setState] = useState<StreamState>({
    text: "",
    isStreaming: false,
    error: null,
  });
  const abortRef = useRef<AbortController | null>(null);

  const startStream = useCallback(async (message: string) => {
    abortRef.current?.abort();
    const controller = new AbortController();
    abortRef.current = controller;
    setState({ text: "", isStreaming: true, error: null });

    try {
      for await (const token of streamAgentResponse(
        message,
        controller.signal
      )) {
        setState((prev) => ({ ...prev, text: prev.text + token }));
      }
      setState((prev) => ({ ...prev, isStreaming: false }));
    } catch (err) {
      if ((err as Error).name !== "AbortError") {
        setState((prev) => ({
          ...prev,
          isStreaming: false,
          error: (err as Error).message,
        }));
      }
    }
  }, []);

  const cancel = useCallback(() => {
    abortRef.current?.abort();
    setState((prev) => ({ ...prev, isStreaming: false }));
  }, []);

  return { ...state, startStream, cancel };
}
Each token appends to the existing text through a state updater function. This avoids stale closure issues that would occur if you read state.text directly inside the loop.
Rendering Streaming Markdown
During streaming, partial markdown tokens arrive that may not form complete syntax. A naive markdown renderer would flicker between valid and invalid states. The solution: render markdown on every update but debounce expensive operations like syntax highlighting.
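A minimal trailing-edge debounce is enough for this. The sketch below delays an expensive callback until tokens pause; the 150 ms delay is an arbitrary choice, not something react-markdown requires:

```typescript
// Delay an expensive callback (e.g. re-running a syntax highlighter)
// until no new calls have arrived for `ms` milliseconds.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  ms: number
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | null = null;
  return (...args: Args) => {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

// Usage sketch: highlight at most once per quiet period.
const scheduleHighlight = debounce(() => {
  // run the highlighter over the rendered code blocks here
}, 150);
```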
import ReactMarkdown from "react-markdown";

interface StreamingMessageProps {
  text: string;
  isStreaming: boolean;
}

function StreamingMessage({ text, isStreaming }: StreamingMessageProps) {
  return (
    <div className="prose prose-sm max-w-none">
      <ReactMarkdown>{text}</ReactMarkdown>
      {isStreaming && <BlinkingCursor />}
    </div>
  );
}

function BlinkingCursor() {
  return (
    <span
      className="inline-block w-2 h-5 bg-gray-800 ml-0.5 animate-pulse"
      aria-hidden="true"
    />
  );
}
The BlinkingCursor component creates the familiar typing indicator. The aria-hidden attribute prevents screen readers from announcing the cursor element.
Batching Token Updates for Performance
Setting state on every single token can cause excessive re-renders. If the backend streams tokens at high speed, batch them using requestAnimationFrame.
function useTokenBatcher(onBatch: (tokens: string) => void) {
  const bufferRef = useRef("");
  const rafRef = useRef<number | null>(null);

  const addToken = useCallback(
    (token: string) => {
      bufferRef.current += token;
      // Schedule at most one flush per animation frame; later tokens
      // in the same frame just grow the buffer.
      if (rafRef.current === null) {
        rafRef.current = requestAnimationFrame(() => {
          onBatch(bufferRef.current);
          bufferRef.current = "";
          rafRef.current = null;
        });
      }
    },
    [onBatch]
  );

  return addToken;
}
This batches all tokens that arrive within a single animation frame into one state update. Instead of one re-render per token, which can mean hundreds per second on a fast stream, you get at most 60 per second, and each render processes several tokens at once.
Cancellation and Cleanup
Users must be able to stop a running stream. The AbortController pattern handles this cleanly. Wire a stop button to the cancel function from the hook.
function ChatControls({
  isStreaming,
  onCancel,
}: {
  isStreaming: boolean;
  onCancel: () => void;
}) {
  if (!isStreaming) return null;
  return (
    <button
      onClick={onCancel}
      className="flex items-center gap-1.5 rounded-lg border
        px-3 py-1.5 text-sm hover:bg-gray-50"
    >
      <span className="w-3 h-3 rounded-sm bg-gray-700" />
      Stop generating
    </button>
  );
}
FAQ
How do I handle code blocks that arrive partially during streaming?
Most markdown renderers handle partial code blocks gracefully by treating unclosed fences as plain text until the closing fence arrives. If you see flicker, wrap your markdown component in React.memo and avoid re-parsing the entire string on every token. Libraries like react-markdown handle incremental content well out of the box.
What is the difference between SSE and WebSockets for streaming?
SSE is unidirectional (server to client), uses plain HTTP, and reconnects automatically. WebSockets are bidirectional and require a persistent connection. For AI agent streaming where the server sends tokens and the client only listens, SSE is simpler and sufficient. Use WebSockets when the client must also send data over the same connection, such as real-time collaborative editing or streaming audio to a voice agent.
How do I add a copy button for completed responses?
After streaming finishes (isStreaming is false), render a copy button that calls navigator.clipboard.writeText(text). During streaming, hide the copy button to prevent users from copying incomplete content.
CallSphere Team
Expert insights on AI voice agents and customer communication automation.