Learn Agentic AI

FastAPI Dependency Injection for AI Agents: Managing LLM Clients and Sessions

Master FastAPI's Depends system to inject LLM clients, database sessions, and agent configurations into your AI agent endpoints. Covers scoped dependencies, sub-dependencies, and testing with overrides.

Why Dependency Injection Matters for AI Agents

AI agent backends have several dependencies that need careful lifecycle management: LLM clients that should be shared across requests, database sessions that must be scoped to a single request and properly closed, and agent configurations that vary by environment. FastAPI's Depends system solves all of these by letting you declare what each endpoint needs, while the framework handles instantiation, sharing, and cleanup.

Without dependency injection, you end up with global variables, manual resource cleanup in try/finally blocks, and test suites that cannot swap real LLM calls for mocks. With Depends, your endpoints declare their dependencies explicitly, making the code readable, testable, and maintainable.

Database Session Dependencies

The most common dependency pattern is a database session that is created per request and closed afterward:

from sqlalchemy.ext.asyncio import (
    create_async_engine,
    async_sessionmaker,
    AsyncSession,
)
from typing import AsyncGenerator
from fastapi import APIRouter, Depends

# Request/response models (CreateConversationRequest, Conversation)
# are assumed to be defined elsewhere in the app.
router = APIRouter()

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost/agents_db",
    pool_size=20,
    max_overflow=10,
)
async_session_factory = async_sessionmaker(
    engine, expire_on_commit=False
)

async def get_db() -> AsyncGenerator[AsyncSession, None]:
    async with async_session_factory() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise

# Usage in an endpoint
@router.post("/conversations")
async def create_conversation(
    request: CreateConversationRequest,
    db: AsyncSession = Depends(get_db),
):
    conversation = Conversation(
        user_id=request.user_id,
        agent_type=request.agent_type,
    )
    db.add(conversation)
    await db.flush()  # populate server-generated fields such as the primary key
    # commit happens automatically when the dependency closes
    return {"id": str(conversation.id)}

The yield in get_db separates setup from cleanup. Everything before yield runs before the endpoint. Everything after runs when the response is complete, even if the endpoint raises an exception.

LLM Client Injection

LLM clients should be created once and shared across all requests. Combine lifespan events with a dependency that retrieves the shared client:

from fastapi import Depends, Request
from openai import AsyncOpenAI

# Created once in lifespan
async def get_llm_client(request: Request) -> AsyncOpenAI:
    return request.app.state.llm_client

# Higher-level service dependency
class LLMService:
    def __init__(self, client: AsyncOpenAI, model: str):
        self.client = client
        self.model = model

    async def generate(self, messages: list[dict]) -> str:
        response = await self.client.chat.completions.create(
            model=self.model,
            messages=messages,
        )
        return response.choices[0].message.content

# Settings and get_settings are assumed to come from a
# pydantic-settings style configuration module.
async def get_llm_service(
    client: AsyncOpenAI = Depends(get_llm_client),
    settings: Settings = Depends(get_settings),
) -> LLMService:
    return LLMService(
        client=client,
        model=settings.openai_model,
    )

Notice how get_llm_service depends on both get_llm_client and get_settings. FastAPI resolves this dependency chain automatically, building the LLMService with all the pieces it needs.


Agent Factory Pattern

When you have multiple specialized agents, use a factory dependency that returns the right agent based on the request:

from enum import Enum

class AgentType(str, Enum):
    RESEARCH = "research"
    SUPPORT = "support"
    CODING = "coding"

class AgentFactory:
    def __init__(self, llm_service: LLMService, db: AsyncSession):
        self.llm_service = llm_service
        self.db = db

    def create(self, agent_type: AgentType) -> BaseAgent:
        agents = {
            AgentType.RESEARCH: ResearchAgent,
            AgentType.SUPPORT: SupportAgent,
            AgentType.CODING: CodingAgent,
        }
        agent_class = agents.get(agent_type)
        if not agent_class:
            raise ValueError(f"Unknown agent type: {agent_type}")

        return agent_class(
            llm_service=self.llm_service,
            db=self.db,
        )

async def get_agent_factory(
    llm_service: LLMService = Depends(get_llm_service),
    db: AsyncSession = Depends(get_db),
) -> AgentFactory:
    return AgentFactory(llm_service=llm_service, db=db)

@router.post("/agents/{agent_type}/chat")
async def chat(
    agent_type: AgentType,
    request: ChatRequest,
    factory: AgentFactory = Depends(get_agent_factory),
):
    agent = factory.create(agent_type)
    response = await agent.process(request.message)
    return {"response": response}
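The factory assumes BaseAgent and the concrete agent classes are defined elsewhere. A minimal sketch of what they might look like (the class interfaces and prompts here are illustrative, not from a specific framework):

```python
class BaseAgent:
    system_prompt: str = "You are a helpful assistant."

    def __init__(self, llm_service, db):
        self.llm_service = llm_service
        self.db = db

    async def process(self, message: str) -> str:
        # Prepend the agent-specific system prompt, then delegate to the LLM.
        return await self.llm_service.generate([
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": message},
        ])

class ResearchAgent(BaseAgent):
    system_prompt = "You are a research assistant. Cite your sources."

class SupportAgent(BaseAgent):
    system_prompt = "You are a friendly customer support agent."

class CodingAgent(BaseAgent):
    system_prompt = "You are an expert programmer. Answer with working code."
```

Each subclass only overrides what makes it special; the shared LLM and database plumbing lives in the base class and arrives via the factory.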

Testing with Dependency Overrides

The real power of dependency injection shines in testing. Override any dependency to swap real services for mocks:

import pytest
from httpx import AsyncClient, ASGITransport

class MockLLMService:
    async def generate(self, messages):
        return "This is a mock response"

    async def stream_generate(self, message):
        for word in ["Hello", " from", " mock"]:
            yield word

@pytest.fixture
def app_with_mocks():
    app.dependency_overrides[get_llm_service] = (
        lambda: MockLLMService()
    )
    app.dependency_overrides[get_db] = get_test_db
    yield app
    app.dependency_overrides.clear()

@pytest.mark.anyio
async def test_chat_endpoint(app_with_mocks):
    transport = ASGITransport(app=app_with_mocks)
    async with AsyncClient(
        transport=transport, base_url="http://test"
    ) as client:
        response = await client.post(
            "/agents/research/chat",
            json={"message": "test query"},
        )
        assert response.status_code == 200
        assert "mock response" in response.json()["response"]

No real LLM calls, no real database connections. The test runs in milliseconds and is completely deterministic.

FAQ

Can I use class-based dependencies in FastAPI?

Yes. Define a class with a __call__ method, instantiate it with whatever configuration it needs, and pass the instance to Depends. FastAPI calls the instance's __call__ on each request, resolving its parameters just like a function-based dependency. For async cleanup, implement __aenter__ and __aexit__ and use the object as an async context manager inside a generator dependency. Function-based dependencies with yield remain more common and are usually simpler to understand.

How do I share a single dependency instance across multiple endpoints in the same request?

FastAPI automatically caches dependency results within a single request. If two dependencies in the same request both depend on get_db, they receive the same session instance. This is the default behavior and requires no configuration. If you explicitly want a fresh instance each time, use Depends(get_db, use_cache=False).

What happens if a dependency raises an exception?

If a dependency raises an exception before the yield, the endpoint never runs, and FastAPI returns an appropriate error response. If an exception occurs after yield during cleanup, it is logged but does not affect the response already sent to the client. Always put critical cleanup logic in a finally block inside your generator dependency to ensure it runs regardless of exceptions.
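The finally pattern can be sketched in isolation, without FastAPI in the loop. FakeResource is a stand-in for a connection or session:

```python
from typing import AsyncGenerator

class FakeResource:
    """Stand-in for a connection or session; illustrative only."""
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

async def get_resource() -> AsyncGenerator[FakeResource, None]:
    resource = FakeResource()
    try:
        yield resource
    finally:
        # Runs whether the endpoint succeeded or raised: FastAPI throws
        # endpoint exceptions back into the generator at the yield point.
        await resource.close()
```

Because the cleanup sits in finally rather than after a bare yield, it runs even when an exception is thrown into the generator, which is exactly how FastAPI signals endpoint failures to a yield dependency.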


#FastAPI #DependencyInjection #AIAgents #Python #Testing #AgenticAI #LearnAI #AIEngineering


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
