
File Upload Handling in FastAPI for AI Agents: Processing Documents and Images

Handle file uploads in FastAPI for AI agent document processing and image analysis. Learn type validation, size limits, chunked uploads for large files, and async processing pipelines for uploaded content.

File Uploads for AI Agent Workloads

AI agents frequently need to process user-uploaded files: PDFs for research agents, images for vision analysis, CSV files for data agents, or code files for coding assistants. FastAPI handles file uploads through Starlette's UploadFile class, which provides async file reading, automatic temp file management, and streaming for large files.

The key challenge is not just receiving the file but validating it, storing it safely, and feeding it into your AI processing pipeline efficiently.

Basic File Upload Endpoint

Start with a simple upload endpoint that accepts a file alongside agent parameters:

from fastapi import APIRouter, UploadFile, File, Form, HTTPException

router = APIRouter()

@router.post("/agent/upload")
async def upload_and_process(
    file: UploadFile = File(...),
    agent_type: str = Form(default="document"),
    instructions: str = Form(default="Summarize this document"),
):
    content = await file.read()

    if not content:
        raise HTTPException(400, "Empty file")

    result = await document_agent.process(
        content=content,
        filename=file.filename,
        instructions=instructions,
    )

    return {
        "filename": file.filename,
        "size_bytes": len(content),
        "result": result,
    }

File Type and Size Validation

Never trust client-provided file types. Validate both the extension and the actual file content:

from pathlib import Path

import magic  # python-magic library
from fastapi import HTTPException, UploadFile

ALLOWED_TYPES = {
    "application/pdf": [".pdf"],
    "text/plain": [".txt", ".md", ".csv"],
    "text/csv": [".csv"],
    "image/png": [".png"],
    "image/jpeg": [".jpg", ".jpeg"],
}

MAX_FILE_SIZE = 20 * 1024 * 1024  # 20 MB

async def validate_upload(file: UploadFile) -> bytes:
    # Read content
    content = await file.read()

    # Check size
    if len(content) > MAX_FILE_SIZE:
        raise HTTPException(
            413,
            f"File too large. Maximum size: "
            f"{MAX_FILE_SIZE // (1024*1024)}MB",
        )

    # Check actual MIME type using file content
    detected_type = magic.from_buffer(content, mime=True)

    if detected_type not in ALLOWED_TYPES:
        raise HTTPException(
            415,
            f"Unsupported file type: {detected_type}. "
            f"Allowed: {', '.join(ALLOWED_TYPES.keys())}",
        )

    # Verify extension matches content
    ext = Path(file.filename or "").suffix.lower()
    allowed_exts = ALLOWED_TYPES[detected_type]
    if ext not in allowed_exts:
        raise HTTPException(
            400,
            f"Extension {ext} does not match "
            f"detected type {detected_type}",
        )

    # Reset file position for downstream processing
    await file.seek(0)
    return content

The python-magic library reads file headers to determine the actual type, preventing renamed malicious files from bypassing extension checks.

Multiple File Upload

AI agents that compare documents or process batches need multi-file upload:


from typing import List

@router.post("/agent/batch-upload")
async def batch_upload(
    files: List[UploadFile] = File(...),
    instructions: str = Form(default="Compare these documents"),
):
    if len(files) > 10:
        raise HTTPException(400, "Maximum 10 files per batch")

    processed_files = []
    total_size = 0

    for file in files:
        content = await validate_upload(file)
        total_size += len(content)

        if total_size > 50 * 1024 * 1024:  # 50MB total limit
            raise HTTPException(
                413, "Total upload size exceeds 50MB limit"
            )

        processed_files.append({
            "filename": file.filename,
            "content": content,
            "size": len(content),
        })

    result = await document_agent.process_batch(
        files=processed_files,
        instructions=instructions,
    )
    return result

Storing Uploaded Files

For files that need to persist beyond the request, save them to disk or object storage:

import uuid
from pathlib import Path

import aiofiles

UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

async def save_upload(
    file: UploadFile, subdirectory: str = ""
) -> Path:
    # Generate safe filename
    safe_name = f"{uuid.uuid4()}{Path(file.filename).suffix}"
    save_dir = UPLOAD_DIR / subdirectory
    save_dir.mkdir(parents=True, exist_ok=True)
    file_path = save_dir / safe_name

    async with aiofiles.open(file_path, "wb") as f:
        while chunk := await file.read(8192):
            await f.write(chunk)

    return file_path

@router.post("/agent/upload-and-store")
async def upload_store_process(
    file: UploadFile = File(...),
    db: AsyncSession = Depends(get_db),
):
    await validate_upload(file)  # validates content and resets file position

    file_path = await save_upload(file, subdirectory="documents")

    # Record in database
    doc = Document(
        filename=file.filename,
        stored_path=str(file_path),
        size_bytes=file_path.stat().st_size,
        uploaded_at=datetime.now(timezone.utc),
    )
    db.add(doc)
    await db.flush()

    return {"document_id": str(doc.id), "filename": file.filename}

Reading the file in 8KB chunks with aiofiles prevents loading the entire file into memory at once, which matters for large uploads.

Async Document Processing Pipeline

Combine file upload with background processing for a complete document agent workflow:

@router.post("/agent/analyze-document", status_code=202)
async def analyze_document(
    background_tasks: BackgroundTasks,
    file: UploadFile = File(...),
    analysis_type: str = Form(default="summary"),
    db: AsyncSession = Depends(get_db),
):
    await validate_upload(file)  # validates content and resets file position

    # Save file
    file_path = await save_upload(file, "analysis")

    # Create task record
    task = AnalysisTask(
        filename=file.filename,
        stored_path=str(file_path),
        analysis_type=analysis_type,
        status="pending",
    )
    db.add(task)
    await db.flush()
    task_id = str(task.id)

    # Process in background
    background_tasks.add_task(
        run_document_analysis,
        task_id=task_id,
        file_path=str(file_path),
        analysis_type=analysis_type,
    )

    return {"task_id": task_id, "status": "pending"}
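The article does not show the `run_document_analysis` worker itself. Below is a minimal, hypothetical sketch using an in-memory dict as a stand-in for the `AnalysisTask` table; a real implementation would open its own database session inside the task, because the request-scoped session is closed by the time a BackgroundTask runs, and would call your extraction and LLM pipeline where the placeholder is:

```python
import asyncio

TASKS: dict[str, dict] = {}  # in-memory stand-in for the AnalysisTask table

async def run_document_analysis(
    task_id: str, file_path: str, analysis_type: str
) -> None:
    TASKS[task_id]["status"] = "running"
    try:
        # Placeholder for the real pipeline: extract text from
        # file_path, then call the LLM with the requested analysis.
        await asyncio.sleep(0)  # yield the event loop, as real I/O would
        result = f"{analysis_type} of {file_path}"
        TASKS[task_id].update(status="completed", result=result)
    except Exception as exc:
        TASKS[task_id].update(status="failed", error=str(exc))
```

The status transitions (pending, running, completed/failed) let the client poll a `GET /tasks/{task_id}` endpoint for the result.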

FAQ

How do I handle very large file uploads without running out of memory?

Use chunked reading with await file.read(chunk_size) in a loop instead of await file.read(), which loads the entire file into memory. For files over 100MB, consider a chunked upload protocol where the client uploads in parts, or use presigned URLs to upload directly to object storage like S3, then pass the object key to your API for processing.

Can I accept both a file and a JSON body in the same request?

FastAPI does not allow combining UploadFile with a JSON request body in the same endpoint because multipart form data and JSON bodies use different content types. Use Form() parameters alongside File(), or accept the JSON as a string Form field and parse it with Pydantic manually. Another approach is a two-step flow: upload the file first and get back a file ID, then send a JSON request referencing that file ID.
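A minimal sketch of the JSON-as-string-field approach, using a stdlib dataclass in place of a Pydantic model (Pydantic would add type coercion and richer validation errors). In a real endpoint you would accept `options: str = Form(...)` alongside the File parameter and pass it to a parser like this; the `AgentOptions` fields are illustrative:

```python
import json
from dataclasses import dataclass

@dataclass
class AgentOptions:
    agent_type: str = "document"
    instructions: str = "Summarize this document"

def parse_options(raw: str) -> AgentOptions:
    # json.loads raises on malformed JSON; unexpected keys raise
    # TypeError from the dataclass constructor.
    return AgentOptions(**json.loads(raw))
```

Catch the parsing errors in the endpoint and convert them to an HTTPException(422) so clients get a clear validation message.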

How do I extract text from uploaded PDFs for the AI agent?

Use libraries like PyMuPDF (fitz) or pdfplumber for text extraction. Read the uploaded bytes, open the PDF, iterate through pages, and extract text. For scanned PDFs without embedded text, you need OCR with a library like pytesseract. Process PDF extraction in a background task because it can be CPU-intensive for large documents with many pages.
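A sketch of this flow, assuming PyMuPDF is installed (`pip install pymupdf`); the import is kept inside the extraction function so the OCR-routing helper works without the dependency. The `min_chars` threshold is an illustrative heuristic, not a library default:

```python
def extract_pdf_text(content: bytes) -> list[str]:
    # Lazy import: requires PyMuPDF (pip install pymupdf).
    import fitz  # PyMuPDF
    with fitz.open(stream=content, filetype="pdf") as doc:
        return [page.get_text() for page in doc]

def needs_ocr(page_texts: list[str], min_chars: int = 20) -> bool:
    # A scanned PDF has no embedded text layer, so extraction yields
    # (nearly) empty strings; route those documents to OCR instead.
    return sum(len(t.strip()) for t in page_texts) < min_chars
```

When needs_ocr returns True, render each page to an image and run it through pytesseract in the background task.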


#FastAPI #FileUpload #DocumentProcessing #AIAgents #Python #AgenticAI #LearnAI #AIEngineering

CallSphere Team
