
Building AI Agents with Next.js API Routes: Full-Stack Agent Applications

Learn how to build full-stack AI agent applications using Next.js API routes. Covers streaming responses, middleware for authentication, edge runtime considerations, conversation persistence, and production patterns for server-side agent logic.

Why Next.js for AI Agent Applications

Next.js provides the rare combination of a React frontend, a server-side API layer, and deployment infrastructure in a single framework. For AI agent applications, this means you can define your agent logic in API routes, stream responses to React components, and deploy everything as one unit — no separate backend service required.

The App Router's route handlers, combined with the Vercel AI SDK or raw streaming APIs, make Next.js one of the fastest paths from idea to deployed agent application.

Basic Agent API Route

Create a route handler that processes messages and returns agent responses:

// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages, threadId } = await req.json();

  if (!messages || !Array.isArray(messages)) {
    return NextResponse.json(
      { error: "messages array is required" },
      { status: 400 }
    );
  }

  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...messages,
    ],
  });

  return NextResponse.json({
    message: completion.choices[0].message,
    usage: completion.usage,
  });
}
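The inline validation above can be factored into a reusable type guard shared across routes. A minimal sketch; the `ChatMessage` shape and the `isValidMessages` name are illustrative, not part of the route above:

```typescript
// Shape of a single chat message as the route expects it (illustrative)
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Returns true only if the value is a well-formed array of chat messages
function isValidMessages(value: unknown): value is ChatMessage[] {
  return (
    Array.isArray(value) &&
    value.every(
      (m) =>
        typeof m === "object" &&
        m !== null &&
        ["system", "user", "assistant"].includes((m as ChatMessage).role) &&
        typeof (m as ChatMessage).content === "string"
    )
  );
}
```

With a guard like this, the 400 branch in the handler reduces to a single `if (!isValidMessages(messages))` check, and TypeScript narrows the type for the rest of the function.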

Streaming Responses from API Routes

For real-time UIs, stream tokens instead of waiting for the full response:

// app/api/agent/stream/route.ts
import { NextRequest } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const text = chunk.choices[0]?.delta?.content;
        if (text) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ text })}\n\n`)
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

This implements Server-Sent Events (SSE) manually. The client connects to this endpoint and receives tokens as they arrive from the LLM.
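On the client, you can consume this stream with `fetch` and a `ReadableStream` reader. A sketch under the assumption that the endpoint emits the `data: {"text": ...}` events shown above; the `streamAgent` and `parseSSEData` names are invented for this example:

```typescript
// Extracts the JSON payload from one SSE event, or null for non-data lines and [DONE]
function parseSSEData(event: string): { text: string } | null {
  if (!event.startsWith("data: ")) return null;
  const payload = event.slice(6);
  if (payload === "[DONE]") return null;
  return JSON.parse(payload);
}

// Reads the streaming endpoint and invokes onToken for each text chunk
async function streamAgent(
  messages: { role: string; content: string }[],
  onToken: (text: string) => void
): Promise<void> {
  const res = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by blank lines
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? ""; // keep any partial event for the next chunk
    for (const event of events) {
      const data = parseSSEData(event);
      if (data) onToken(data.text);
    }
  }
}
```

In a React component, `onToken` would typically append to a state variable so the UI renders tokens as they arrive.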

Authentication Middleware

Protect your agent endpoints with middleware that validates session tokens:

// middleware.ts
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname.startsWith("/api/agent")) {
    const authHeader = request.headers.get("authorization");

    if (!authHeader?.startsWith("Bearer ")) {
      return NextResponse.json(
        { error: "Authentication required" },
        { status: 401 }
      );
    }

    // Validate the token (JWT verification, database lookup, etc.)
    const token = authHeader.slice(7);
    // Add your token validation logic here
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/agent/:path*",
};
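What goes in the validation stub depends on how you issue tokens. As one illustration only, here is a minimal HMAC-signed token scheme using Node's built-in crypto; the `signToken`/`verifyToken` names and payload shape are invented for this sketch, and in production you would more likely verify a standard JWT with a library such as jose:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const SECRET = process.env.TOKEN_SECRET ?? "dev-secret";

// Signs a payload as base64url(payload).base64url(hmac)
export function signToken(payload: { userId: string; exp: number }): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const sig = createHmac("sha256", SECRET).update(body).digest("base64url");
  return `${body}.${sig}`;
}

// Returns the payload if the signature checks out and the token has not expired
export function verifyToken(
  token: string
): { userId: string; exp: number } | null {
  const [body, sig] = token.split(".");
  if (!body || !sig) return null;

  const expected = createHmac("sha256", SECRET).update(body).digest("base64url");
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  // Constant-time comparison to avoid leaking signature prefixes
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;

  const payload = JSON.parse(Buffer.from(body, "base64url").toString());
  if (payload.exp < Date.now() / 1000) return null;
  return payload;
}
```

One caveat: Next.js middleware runs on the Edge Runtime, where `node:crypto` is unavailable, so a helper like this belongs in a Node.js route handler; inside middleware itself you would use the Web Crypto API or an edge-compatible JWT library instead.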

Conversation Persistence

Store conversation history so users can resume sessions:


// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";
import { prisma } from "@/lib/prisma";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { message, conversationId } = await req.json();
  // In production, derive the user id from a verified session rather than a raw header
  const userId = req.headers.get("x-user-id")!;

  // Load or create conversation
  let conversation = conversationId
    ? await prisma.conversation.findUnique({
        where: { id: conversationId, userId },
        include: { messages: { orderBy: { createdAt: "asc" } } },
      })
    : await prisma.conversation.create({
        data: { userId },
        include: { messages: true },
      });

  if (!conversation) {
    return NextResponse.json({ error: "Not found" }, { status: 404 });
  }

  // Build messages array from history
  const chatMessages = conversation.messages.map((m) => ({
    role: m.role as "user" | "assistant",
    content: m.content,
  }));
  chatMessages.push({ role: "user", content: message });

  // Call LLM
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...chatMessages,
    ],
  });

  const reply = completion.choices[0].message.content ?? "";

  // Persist both messages
  await prisma.message.createMany({
    data: [
      { conversationId: conversation.id, role: "user", content: message },
      { conversationId: conversation.id, role: "assistant", content: reply },
    ],
  });

  return NextResponse.json({
    conversationId: conversation.id,
    reply,
  });
}
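The route above assumes `Conversation` and `Message` models roughly like the following. This is a sketch of the Prisma schema with field names inferred from the queries; your actual schema may differ:

```prisma
model Conversation {
  id        String    @id @default(cuid())
  userId    String
  createdAt DateTime  @default(now())
  messages  Message[]
}

model Message {
  id             String       @id @default(cuid())
  conversationId String
  role           String
  content        String
  createdAt      DateTime     @default(now())
  conversation   Conversation @relation(fields: [conversationId], references: [id])
}
```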

Rate Limiting

Protect your agent endpoint from abuse. An in-memory counter works for a single long-lived server process; on serverless platforms, where each instance has its own memory, back the counter with a shared store such as Redis instead:

// lib/rate-limit.ts
const rateLimitMap = new Map<string, { count: number; resetTime: number }>();

export function checkRateLimit(
  userId: string,
  maxRequests: number = 20,
  windowMs: number = 60_000
): boolean {
  const now = Date.now();
  const entry = rateLimitMap.get(userId);

  if (!entry || now > entry.resetTime) {
    rateLimitMap.set(userId, { count: 1, resetTime: now + windowMs });
    return true;
  }

  if (entry.count >= maxRequests) {
    return false;
  }

  entry.count++;
  return true;
}

Use it in your route handler:

if (!checkRateLimit(userId)) {
  return NextResponse.json(
    { error: "Rate limit exceeded. Try again in a minute." },
    { status: 429 }
  );
}

Edge Runtime Considerations

Next.js route handlers can run on the Edge Runtime for lower latency. However, agents often need Node.js APIs (database drivers, file system access). Use edge selectively:

// This route can run on edge: it only calls external APIs
import OpenAI from "openai";

export const runtime = "edge";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();

  // The OpenAI SDK works on edge (it uses fetch internally)
  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  // ...stream the response back as in the SSE example above
}

For routes that need Prisma, Redis, or other Node.js-dependent libraries, keep the default Node.js runtime.

FAQ

Should I use API routes or Server Actions for AI agents?

Use API routes for agent interactions. Server Actions are designed for form mutations and do not support streaming responses. API route handlers give you full control over the response format, headers, and streaming behavior that AI agents require.

How do I handle long-running agent tasks that exceed the serverless timeout?

For tasks longer than the default timeout (60 seconds on Vercel Hobby, 300 seconds on Pro), use the maxDuration export in your route handler: export const maxDuration = 300;. For even longer tasks, offload to a background job queue (Inngest, Trigger.dev) and poll for results from the client.
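The poll-for-results pattern from that answer can be sketched as a small client helper. The `pollForResult` name and status shape are illustrative, not an Inngest or Trigger.dev API:

```typescript
type JobStatus<T> =
  | { state: "pending" }
  | { state: "done"; result: T };

// Repeatedly invokes checkStatus until the job completes or attempts run out
async function pollForResult<T>(
  checkStatus: () => Promise<JobStatus<T>>,
  { intervalMs = 2000, maxAttempts = 30 } = {}
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.result;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Job did not complete in time");
}
```

In a real app, `checkStatus` would `fetch` a status endpoint that reads the job state written by the background worker.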

Can I deploy a Next.js agent app to platforms other than Vercel?

Yes. Next.js deploys to any platform that supports Node.js: Railway, Fly.io, AWS (via SST or standalone mode), Docker containers, or a traditional VPS. The only features that are Vercel-specific are edge middleware optimizations and some caching behaviors.


#Nextjs #APIRoutes #FullStack #AIAgents #Streaming #EdgeRuntime #AgenticAI #LearnAI #AIEngineering


CallSphere Team

Expert insights on AI voice agents and customer communication automation.
