
CrewAI Memory: Short-Term, Long-Term, and Entity Memory for Persistent Crews

Configure CrewAI's three memory systems — short-term for session context, long-term for cross-session learning, and entity memory for tracking people and concepts — with storage backends and embedding options.

The Problem Memory Solves

By default, each CrewAI kickoff is stateless. Agents have no recollection of previous runs, previous tasks within the same run (beyond explicit context), or any entities they have encountered before. This is fine for one-shot tasks, but many real applications need agents that accumulate knowledge over time.

CrewAI's memory system addresses this by providing three distinct memory types, each serving a different purpose. When combined, they give agents a layered recall system that mimics how humans use working memory, long-term memory, and entity recognition.

Enabling Memory

Memory is disabled by default. Enable it at the crew level:

from crewai import Crew, Process

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    process=Process.sequential,
    memory=True,
    verbose=True,
)

Setting memory=True activates all three memory types with default settings. Note that the default configuration embeds memories with OpenAI's text-embedding-3-small model, so an OPENAI_API_KEY must be available unless you configure a different embedder (covered below); storage itself is file-based and local.

Short-Term Memory

Short-term memory stores context from the current crew execution. It allows agents to reference information generated by other agents during the same run without explicit context chaining:

from crewai.memory import ShortTermMemory
from crewai.memory.storage.rag_storage import RAGStorage

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    memory=True,
    short_term_memory=ShortTermMemory(
        storage=RAGStorage(type="short_term"),
    ),
)

During execution, each agent's output is automatically embedded and stored. When a downstream agent starts working, the memory system retrieves relevant snippets from earlier tasks. This is especially valuable in hierarchical processes where task order is not predetermined.

Short-term memory resets between kickoff() calls. It exists only for the duration of a single crew execution.
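The mechanics are the standard RAG pattern: embed each output, then retrieve by similarity when a later task needs context. The following is an illustrative, self-contained sketch of that flow — a toy bag-of-words "embedder" and cosine similarity stand in for CrewAI's real embedding model and vector store; the class and method names are hypothetical, not CrewAI's internals:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts. A real embedder
    # produces dense vectors; the store/retrieve flow is the same.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyShortTermMemory:
    def __init__(self):
        self.items = []  # (embedding, original text) pairs

    def save(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, top_k: int = 1) -> list:
        # Rank stored snippets by similarity to the query.
        scored = sorted(self.items, key=lambda it: cosine(embed(query), it[0]), reverse=True)
        return [text for _, text in scored[:top_k]]

memory = ToyShortTermMemory()
memory.save("The researcher found three vendors offering quantum SDKs.")
memory.save("The analyst flagged pricing risks in vendor contracts.")
print(memory.search("which vendors have quantum SDKs?"))
```

In CrewAI the save and search calls happen automatically around each task; you never invoke them yourself.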

Long-Term Memory

Long-term memory persists across multiple crew runs. It stores task results and agent decisions in a database that survives process restarts:

from crewai.memory import LongTermMemory
from crewai.memory.storage.ltm_sqlite_storage import LTMSQLiteStorage

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    memory=True,
    long_term_memory=LongTermMemory(
        storage=LTMSQLiteStorage(
            db_path="./crew_memory/long_term.db",
        ),
    ),
)

# First run — crew learns
result1 = crew.kickoff(inputs={"topic": "quantum computing"})

# Second run — crew recalls patterns from the first run
result2 = crew.kickoff(inputs={"topic": "quantum networking"})

On the second run, when agents encounter concepts related to quantum computing, long-term memory surfaces relevant findings from the first run, so the crew builds on earlier work instead of starting from scratch each time.


The default storage backend is SQLite, written to a platform-specific data directory (you can override the location with the CREWAI_STORAGE_DIR environment variable). For production, you can configure external storage.
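To see why SQLite-backed storage survives process restarts, here is a minimal standalone sketch of the pattern — a toy one-table schema, not CrewAI's actual schema: a result written during one "run" remains queryable from a fresh connection in the next.

```python
import os
import sqlite3
import tempfile

def open_store(db_path: str) -> sqlite3.Connection:
    # Illustrative long-term store: one table of task results.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS task_results (topic TEXT, result TEXT)")
    return conn

db_path = os.path.join(tempfile.mkdtemp(), "crew_memory.db")

# "First run": record a finding, then close the connection (process exit).
conn = open_store(db_path)
conn.execute(
    "INSERT INTO task_results VALUES (?, ?)",
    ("quantum computing", "Error correction is the main scaling bottleneck."),
)
conn.commit()
conn.close()

# "Second run": a fresh connection still sees the earlier finding.
conn = open_store(db_path)
row = conn.execute(
    "SELECT result FROM task_results WHERE topic LIKE ?", ("quantum%",)
).fetchone()
print(row[0])  # the first-run finding
conn.close()
```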

Entity Memory

Entity memory tracks specific people, organizations, concepts, and relationships that agents encounter. It builds a searchable store of entities and their attributes:

from crewai.memory import EntityMemory
from crewai.memory.storage.rag_storage import RAGStorage

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    memory=True,
    entity_memory=EntityMemory(
        storage=RAGStorage(
            type="entities",
            path="./crew_memory/entities",
        ),
    ),
)

When the researcher discovers that "Anthropic released Claude 3.5 Sonnet in 2024," the entity memory stores "Anthropic" as an organization, "Claude 3.5 Sonnet" as a product, and their relationship. On subsequent runs, agents can retrieve this entity knowledge when relevant topics arise.
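Conceptually, entity memory behaves like a keyed store of typed facts per entity. The sketch below illustrates that shape with hand-written entries — in CrewAI the LLM extracts the entities automatically, and the class here is hypothetical:

```python
from collections import defaultdict

class ToyEntityMemory:
    def __init__(self):
        # entity name -> {"type": ..., "facts": [...]}
        self.entities = defaultdict(lambda: {"type": None, "facts": []})

    def save(self, name: str, etype: str, fact: str) -> None:
        entry = self.entities[name]
        entry["type"] = etype
        entry["facts"].append(fact)

    def search(self, name: str) -> dict:
        return dict(self.entities.get(name, {"type": None, "facts": []}))

entities = ToyEntityMemory()
entities.save("Anthropic", "organization", "Released Claude 3.5 Sonnet in 2024")
entities.save("Claude 3.5 Sonnet", "product", "Developed by Anthropic")

print(entities.search("Anthropic"))
```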

Configuring Embeddings

Memory relies on embeddings to store and retrieve information. By default, CrewAI embeds memories with OpenAI's text-embedding-3-small model; you can set the provider and model explicitly:

from crewai import Crew

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
        },
    },
)

For fully offline operation, use a local model:

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task],
    memory=True,
    embedder={
        "provider": "huggingface",
        "config": {
            "model": "sentence-transformers/all-MiniLM-L6-v2",
        },
    },
)

The embedding provider affects memory retrieval quality. OpenAI embeddings generally produce better recall but add API costs and latency. Local models are faster and free but may miss subtle semantic connections.

Memory Retrieval in Practice

When an agent starts working on a task, the memory system automatically queries all active memory types with the task description and returns relevant context. You do not write retrieval code — it is handled by the framework.

You can see memory in action by enabling verbose mode:

crew = Crew(
    agents=[researcher],
    tasks=[task],
    memory=True,
    verbose=True,
)

The verbose output shows when memory is queried, what results are returned, and how the agent incorporates recalled information into its reasoning.

FAQ

Does memory increase token usage?

Yes. Retrieved memories are injected into the agent's prompt, which adds tokens to every LLM call. The increase is typically 200 to 500 tokens per memory retrieval. For most applications, this cost is justified by the improved output quality and consistency.
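A quick way to budget for this overhead is the rough rule of thumb of one token per four characters of English prose. This is a crude approximation, not a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    return max(1, len(text) // 4)

# Example memory snippets that would be injected into the prompt.
retrieved = [
    "The researcher found three vendors offering quantum SDKs.",
    "The analyst flagged pricing risks in vendor contracts.",
]
overhead = sum(estimate_tokens(snippet) for snippet in retrieved)
print(f"~{overhead} extra prompt tokens per LLM call")
```

For an exact count, run the injected text through your model provider's tokenizer instead.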

Can I inspect or clear stored memories?

Yes. By default, memory files live in a platform-specific data directory rather than the project folder; set the CREWAI_STORAGE_DIR environment variable to choose the location. You can inspect the SQLite databases directly, or clear memory by deleting the storage directory. The crewai reset-memories CLI command also clears stored memories, and for programmatic access you can use the memory storage objects directly to query or delete specific entries.
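Clearing by deletion is easy to script. A minimal sketch, assuming you know the storage directory (here taken from the CREWAI_STORAGE_DIR environment variable or passed in explicitly); the function name is ours, not CrewAI's:

```python
import os
import shutil
import tempfile
from pathlib import Path

def clear_crew_memory(storage_dir=None):
    # Resolve the storage directory: explicit argument first, then the
    # CREWAI_STORAGE_DIR environment variable (assumed to be set by you).
    target = storage_dir or os.environ.get("CREWAI_STORAGE_DIR")
    if not target:
        return False
    path = Path(target)
    if path.is_dir():
        shutil.rmtree(path)  # removes all memory databases under it
        return True
    return False

# Demo against a throwaway directory with a fake memory database.
demo_dir = tempfile.mkdtemp()
(Path(demo_dir) / "long_term_memory.db").touch()
print(clear_crew_memory(demo_dir))  # prints True
```

Run this only between crew executions; deleting the directory while a crew is mid-kickoff would pull storage out from under it.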

Should I enable all three memory types or pick selectively?

Start with just memory=True and see if the default combination works. If your agents only run once, short-term memory alone is sufficient. Enable long-term memory when you run the same crew repeatedly and want it to improve. Enable entity memory when your domain involves tracking specific people, products, or organizations across runs.

