API Security Headers for AI Agent Services: CORS, CSP, and Rate Limit Headers
Configure essential security headers for AI agent APIs including CORS policies, Content Security Policy, rate limit communication headers, and other protective headers with FastAPI middleware examples.
Security Headers: Your API's First Line of Defense
HTTP security headers protect your AI agent API from common attack vectors: cross-origin abuse, content injection, information leakage, and protocol downgrade attacks. Unlike authentication and authorization (which verify who is making the request and what they are allowed to do), security headers define how the request and response should be handled by browsers, proxies, and clients.
For AI agent APIs, security headers serve a dual purpose. They protect browser-based agent interfaces from XSS and clickjacking, and they communicate rate limiting information so agents can self-throttle rather than hitting walls.
CORS Configuration
Cross-Origin Resource Sharing controls which origins (scheme, host, and port) can call your API from a browser. For AI agent APIs, you need to balance accessibility (agents running on various domains) with security (preventing unauthorized cross-origin requests).
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Production CORS: restrict to known origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "https://app.example.com",
        "https://dashboard.example.com",
        "https://playground.example.com",
    ],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=[
        "Authorization",
        "Content-Type",
        "X-API-Key",
        "X-Request-ID",
        "Idempotency-Key",
    ],
    expose_headers=[
        "X-Request-ID",
        "X-RateLimit-Limit",
        "X-RateLimit-Remaining",
        "X-RateLimit-Reset",
        "Retry-After",
    ],
    max_age=3600,
)
The expose_headers configuration is often overlooked. On cross-origin responses, browsers expose only a small set of CORS-safelisted response headers (such as Content-Type and Cache-Control) to JavaScript by default. Unless you list your rate limit headers here, browser-based agents cannot read them, even though server-to-server agents can.
Rate Limit Headers
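Rate limiting is essential for AI agent APIs where a single agent can generate hundreds of requests per minute. Communicate limits clearly using the widely adopted X-RateLimit-* headers (a de facto convention rather than a formal standard) so agents can self-regulate. In the example below, X-RateLimit-Reset is a Unix timestamp; some APIs send the number of seconds remaining instead, so document which form you use.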
import math
import time

from fastapi import Request
from fastapi.responses import JSONResponse
from starlette.middleware.base import BaseHTTPMiddleware


class RateLimitMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, requests_per_minute: int = 60):
        super().__init__(app)
        self.rpm = requests_per_minute
        # In production, use Redis with sliding window
        self.buckets: dict[str, dict] = {}

    async def dispatch(self, request: Request, call_next):
        client_id = self._get_client_id(request)
        now = time.time()

        # Fixed one-minute window per client; start a new one when it expires
        bucket = self.buckets.get(client_id)
        if bucket is None or now > bucket["reset_at"]:
            bucket = {"count": 0, "reset_at": now + 60}
        bucket["count"] += 1
        self.buckets[client_id] = bucket

        remaining = max(0, self.rpm - bucket["count"])
        reset_at = int(bucket["reset_at"])
        rate_headers = {
            "X-RateLimit-Limit": str(self.rpm),
            "X-RateLimit-Remaining": str(remaining),
            # Unix timestamp (seconds) at which the current window resets
            "X-RateLimit-Reset": str(reset_at),
        }

        if bucket["count"] > self.rpm:
            # Round up and never return 0, so clients do not retry too early
            retry_after = max(1, math.ceil(bucket["reset_at"] - now))
            return JSONResponse(
                status_code=429,
                content={
                    "type": "https://api.example.com/errors/rate-limit",
                    "title": "Rate Limit Exceeded",
                    "detail": f"Limit: {self.rpm} requests/minute",
                    "retryable": True,
                    "retry_after_seconds": retry_after,
                },
                headers={
                    **rate_headers,
                    "Retry-After": str(retry_after),
                },
            )

        response = await call_next(request)
        for key, value in rate_headers.items():
            response.headers[key] = value
        return response

    def _get_client_id(self, request: Request) -> str:
        api_key = request.headers.get("X-API-Key", "")
        if api_key:
            return f"key:{api_key}"
        # X-Forwarded-For may list several hops ("client, proxy1, proxy2").
        # It is client-controlled unless a proxy you trust sets it.
        forwarded = request.headers.get("X-Forwarded-For", "")
        client_ip = forwarded.split(",")[0].strip()
        if not client_ip and request.client:
            client_ip = request.client.host
        return f"ip:{client_ip or 'unknown'}"


app.add_middleware(RateLimitMiddleware, requests_per_minute=100)
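The comment in the middleware recommends a sliding window for production, whereas the code itself uses a fixed one-minute window. As a rough illustration of the difference, here is a minimal in-memory sliding-window sketch. The class name SlidingWindowLimiter and its allow method are hypothetical, and a multi-process deployment would still need a shared store such as Redis.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Hypothetical in-memory sliding-window limiter (single process only)."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.hits: dict = {}  # client_id -> deque of request timestamps

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record a request and return True if it fits within the window."""
        now = time.monotonic() if now is None else now
        hits = self.hits.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Unlike the fixed window, this approach does not let a client send a full quota just before a reset and another full quota just after it.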
Comprehensive Security Headers Middleware
Beyond CORS and rate limiting, add headers that prevent common web attacks and information leakage.
class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)

        # Prevent MIME type sniffing
        response.headers["X-Content-Type-Options"] = "nosniff"

        # Prevent clickjacking
        response.headers["X-Frame-Options"] = "DENY"

        # Control referrer information
        response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"

        # Force HTTPS
        response.headers["Strict-Transport-Security"] = (
            "max-age=31536000; includeSubDomains; preload"
        )

        # Remove server identification. Starlette's MutableHeaders has no
        # pop(), so delete the key explicitly. If you serve with uvicorn, it
        # adds its own Server header outside the app; disable that with
        # --no-server-header.
        if "server" in response.headers:
            del response.headers["server"]

        # Permissions Policy - disable unused browser features
        response.headers["Permissions-Policy"] = (
            "camera=(), microphone=(), geolocation=(), "
            "payment=(), usb=(), magnetometer=()"
        )

        # Content Security Policy for HTML responses (e.g. docs pages)
        if "text/html" in response.headers.get("content-type", ""):
            response.headers["Content-Security-Policy"] = (
                "default-src 'none'; "
                "script-src 'self'; "
                "style-src 'self' 'unsafe-inline'; "
                "img-src 'self' data:; "
                "font-src 'self'; "
                "connect-src 'self'"
            )

        return response


app.add_middleware(SecurityHeadersMiddleware)
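The Content-Security-Policy value in the middleware is a semicolon-separated list of directives. As the policy grows, it may be easier to maintain as a mapping that is joined into a header value. The sketch below assumes a hypothetical helper named build_csp; it is not part of FastAPI or Starlette.

```python
def build_csp(directives: dict) -> str:
    """Join a {directive: [sources]} mapping into one CSP header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Same policy as in the middleware, expressed as data
API_CSP = build_csp({
    "default-src": ["'none'"],
    "script-src": ["'self'"],
    "style-src": ["'self'", "'unsafe-inline'"],
    "img-src": ["'self'", "data:"],
    "font-src": ["'self'"],
    "connect-src": ["'self'"],
})
```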
Request ID Tracking
Assign a unique ID to every request for distributed tracing. If the client sends an X-Request-ID header, propagate it; otherwise, generate one. This is invaluable for debugging agent interactions across multiple services.
import uuid


class RequestIDMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        request_id = request.headers.get(
            "X-Request-ID", str(uuid.uuid4())
        )
        request.state.request_id = request_id
        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response


app.add_middleware(RequestIDMiddleware)
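Because the middleware echoes a client-supplied X-Request-ID back in the response, and the value typically ends up in logs, it can be worth constraining its shape before propagating it. The following is a sketch only: the helper name safe_request_id, the 64-character cap, and the allowed character set are assumptions, not requirements of any specification.

```python
import re
import uuid
from typing import Optional

# Assumed policy: 1-64 characters from a conservative, log-safe alphabet
_REQUEST_ID_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")


def safe_request_id(header_value: Optional[str]) -> str:
    """Return the client's ID if it looks safe, otherwise a fresh UUID4."""
    if header_value and _REQUEST_ID_RE.fullmatch(header_value):
        return header_value
    return str(uuid.uuid4())
```

Inside dispatch, this could replace the direct header lookup: request_id = safe_request_id(request.headers.get("X-Request-ID")).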
FAQ
Should I use wildcard CORS (*) for my AI agent API?
Never use wildcard CORS in production for APIs that use cookies or bearer tokens. A wildcard origin with allow_credentials=True is actually rejected by browsers for security reasons. For public APIs that use API keys in headers rather than cookies, a wildcard origin is acceptable but still not recommended. List specific allowed origins and use environment variables to configure them per deployment environment.
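One possible way to configure origins per environment is to read a comma-separated list from an environment variable. In this sketch the variable name CORS_ALLOWED_ORIGINS and the helper load_allowed_origins are hypothetical, and rejecting the wildcard outright is a design choice rather than a rule.

```python
import os


def load_allowed_origins(env_var: str = "CORS_ALLOWED_ORIGINS") -> list:
    """Parse a comma-separated origin list from the environment."""
    raw = os.environ.get(env_var, "")
    # Origins carry no trailing slash, so normalize it away
    origins = [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]
    if "*" in origins:
        raise ValueError("wildcard origin is not allowed in this configuration")
    return origins
```

The result can then be passed as allow_origins=load_allowed_origins() when adding CORSMiddleware.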
What is the difference between X-RateLimit headers and the standard Retry-After header?
They serve complementary purposes. The X-RateLimit-* headers are informational and sent on every response, telling the client their current quota status (limit, remaining, reset time). The Retry-After header is directive: it is sent mainly with 429 or 503 responses and tells the client how long to wait before retrying, either as a number of seconds or as an HTTP date. Always include both: the rate limit headers for proactive throttling and Retry-After for reactive recovery.
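On the client side, an agent might combine the two signals roughly as follows. This is a sketch under assumptions: the function name seconds_to_wait is hypothetical, X-RateLimit-Reset is taken to be a Unix timestamp, and headers is a plain mapping with exactly these key spellings (real HTTP clients usually provide case-insensitive lookups).

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(
    status_code: int,
    headers: Mapping[str, str],
    now: Optional[float] = None,
) -> float:
    """Return how long a client should pause before its next request."""
    now = time.time() if now is None else now
    # Reactive: the server said how long to back off
    if status_code in (429, 503) and "Retry-After" in headers:
        try:
            return max(0.0, float(headers["Retry-After"]))
        except ValueError:
            pass  # the HTTP-date form is not handled in this sketch
    # Proactive: quota exhausted, so wait until the window resets
    if headers.get("X-RateLimit-Remaining") == "0":
        try:
            reset_at = float(headers.get("X-RateLimit-Reset", ""))
        except ValueError:
            return 0.0
        return max(0.0, reset_at - now)
    return 0.0
```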
Should I apply rate limiting per API key or per IP address?
Apply rate limiting per API key for authenticated requests and per IP for unauthenticated requests. API key-based limiting is more accurate since multiple users may share an IP (corporate NATs, VPNs). Consider tiered rate limits based on the subscription plan — a free tier might get 10 requests per minute while an enterprise tier gets 1000. Always communicate the current tier's limits in the rate limit response headers.
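Tiered limits can be as simple as a lookup table keyed by plan. In this sketch the plan names and both helper functions are hypothetical; the figures of 10 and 1000 requests per minute are the examples from the answer above.

```python
from typing import Optional

# Hypothetical plan table; unknown or missing plans fall back to the free tier
PLAN_LIMITS = {"free": 10, "enterprise": 1000}
DEFAULT_PLAN = "free"


def limit_for_plan(plan: Optional[str]) -> int:
    """Return the requests-per-minute limit for a subscription plan."""
    return PLAN_LIMITS.get(plan or DEFAULT_PLAN, PLAN_LIMITS[DEFAULT_PLAN])


def rate_limit_headers(plan: Optional[str], used: int, reset_at: int) -> dict:
    """Build X-RateLimit-* headers that reflect the caller's tier."""
    limit = limit_for_plan(plan)
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - used)),
        "X-RateLimit-Reset": str(reset_at),
    }
```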
CallSphere Team