# Building a Memory Layer for AI Agents: From Simple Lists to Vector Stores
Explore four approaches to building agent memory — in-memory lists, file-based storage, relational databases, and vector stores — with practical Python implementations and guidance on when to use each.
## Why Agents Need a Memory Layer
Without memory, every agent interaction starts from scratch. The agent cannot recall what it did five minutes ago, what the user prefers, or what tools returned previously. A memory layer gives agents the ability to store, retrieve, and reason over information across turns and sessions.
The right memory architecture depends on your requirements: how much data you store, how you query it, whether memory persists across restarts, and whether you need semantic search. Let us walk through four approaches in increasing order of sophistication.
## Approach 1: In-Memory Lists
The simplest memory is a Python list. It is fast, requires no infrastructure, and works well for prototypes and single-session agents.
```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class MemoryEntry:
    content: str
    category: str  # "fact", "preference", "task_result"
    timestamp: datetime = field(default_factory=datetime.utcnow)
    metadata: dict = field(default_factory=dict)

class InMemoryStore:
    def __init__(self):
        self._entries: List[MemoryEntry] = []

    def store(self, content: str, category: str, **metadata):
        entry = MemoryEntry(content=content, category=category, metadata=metadata)
        self._entries.append(entry)

    def search(self, keyword: str, category: Optional[str] = None) -> List[MemoryEntry]:
        results = []
        for entry in self._entries:
            if keyword.lower() in entry.content.lower():
                if category is None or entry.category == category:
                    results.append(entry)
        return results

    def get_recent(self, n: int = 10) -> List[MemoryEntry]:
        return self._entries[-n:]

# Usage
memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")
memory.store("API returned 42 results for query X", "task_result")
results = memory.search("dark mode")
```
**Limitations:** All data is lost when the process ends. Keyword search is brittle — it misses semantic matches. It does not scale beyond a few thousand entries.
## Approach 2: File-Based Persistence
Adding file persistence ensures memory survives restarts. JSON files work well for small datasets.
```python
import json
from datetime import datetime
from pathlib import Path
from typing import List

class FileMemoryStore:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self._entries: List[dict] = []
        self._load()

    def _load(self):
        if self.path.exists():
            with open(self.path, "r") as f:
                self._entries = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._entries, f, indent=2, default=str)

    def store(self, content: str, category: str, **metadata):
        entry = {
            "content": content,
            "category": category,
            "timestamp": datetime.utcnow().isoformat(),
            "metadata": metadata,
        }
        self._entries.append(entry)
        self._save()

    def search(self, keyword: str) -> List[dict]:
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]
```
File-based storage is ideal for single-user desktop agents or CLI tools. It falls apart with concurrent access or when you need complex queries.
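One caveat with the `_save` method above: if the process dies mid-write, the JSON file can be left truncated. A common hardening is to write to a temporary file and atomically swap it into place. A minimal sketch using only the standard library (`save_json_atomic` is an illustrative helper, not part of the store above):

```python
import json
import os
import tempfile
from pathlib import Path

def save_json_atomic(path: Path, data) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place.

    os.replace is atomic on both POSIX and Windows, so readers never
    observe a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2, default=str)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap
        os.unlink(tmp_name)
        raise
```

Dropping this in for the plain `open(..., "w")` in `_save` costs almost nothing and removes the corruption window.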
## Approach 3: Database-Backed Memory
A relational database like SQLite or PostgreSQL adds query flexibility, concurrency support, and scalability.
```python
import json
import sqlite3
from contextlib import contextmanager
from typing import Optional

class SQLiteMemoryStore:
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        with self._connect() as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    content TEXT NOT NULL,
                    category TEXT NOT NULL,
                    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
                    metadata TEXT DEFAULT '{}'
                )
            """)
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_category ON memories(category)"
            )

    @contextmanager
    def _connect(self):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def store(self, content: str, category: str, **metadata):
        with self._connect() as conn:
            conn.execute(
                "INSERT INTO memories (content, category, metadata) VALUES (?, ?, ?)",
                (content, category, json.dumps(metadata)),
            )

    def search(self, keyword: str, category: Optional[str] = None, limit: int = 20):
        query = "SELECT * FROM memories WHERE content LIKE ?"
        params = [f"%{keyword}%"]
        if category:
            query += " AND category = ?"
            params.append(category)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(limit)
        with self._connect() as conn:
            return conn.execute(query, params).fetchall()
```
## Approach 4: Vector Store Memory
When you need semantic search — finding memories by meaning rather than exact keywords — a vector store is essential. This approach embeds each memory as a high-dimensional vector and retrieves the closest matches.
```python
from typing import Optional

import chromadb

class VectorMemoryStore:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./chroma_data")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"},
        )
        self._counter = self.collection.count()

    def store(self, content: str, category: str, **metadata):
        self._counter += 1
        self.collection.add(
            documents=[content],
            metadatas=[{"category": category, **metadata}],
            ids=[f"mem_{self._counter}"],
        )

    def search(self, query: str, n_results: int = 5, category: Optional[str] = None):
        where_filter = {"category": category} if category else None
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results,
            where=where_filter,
        )
        return results["documents"][0] if results["documents"] else []
```
With a vector store, searching for "user interface theme preference" correctly retrieves a memory stored as "User prefers dark mode" even though none of the words match.
## Comparison Table
| Approach | Persistence | Semantic Search | Concurrency | Setup Cost |
|---|---|---|---|---|
| In-Memory List | None | No | No | Zero |
| File-Based | Restart-safe | No | No | Minimal |
| SQLite/Postgres | Full | No (FTS partial) | Yes | Low-Medium |
| Vector Store | Full | Yes | Yes | Medium |
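The "FTS partial" note for SQLite refers to its built-in full-text search extension. FTS5 gives tokenized keyword matching with relevance ranking — not semantic search, but a meaningful step up from `LIKE`. A sketch (FTS5 ships in most stock SQLite builds, though not all):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: content is tokenized for word-level matching
conn.execute("CREATE VIRTUAL TABLE mem_fts USING fts5(content, category)")
conn.executemany(
    "INSERT INTO mem_fts (content, category) VALUES (?, ?)",
    [
        ("User prefers dark mode", "preference"),
        ("Dark chocolate mentioned in chat", "fact"),
        ("API returned 42 results", "task_result"),
    ],
)
# MATCH supports boolean operators; ORDER BY rank sorts by bm25 relevance
rows = conn.execute(
    "SELECT content FROM mem_fts WHERE mem_fts MATCH ? ORDER BY rank",
    ("dark AND mode",),
).fetchall()
print([r[0] for r in rows])  # ['User prefers dark mode']
```

If your queries are still keyword-shaped, FTS5 may be all you need before reaching for embeddings.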
## FAQ
**When should I use a vector store instead of a database?**
Use a vector store when your agent needs to retrieve memories by semantic similarity — for example, finding relevant past decisions when the user describes a situation in different words. If you only need exact-match or keyword lookups, a relational database is simpler and faster.
**Can I combine a relational database with a vector store?**
Yes, this is a common production pattern. Store structured data (timestamps, categories, metadata) in PostgreSQL and store the embedding vectors in a dedicated vector store like Chroma, Pinecone, or pgvector. Query both and merge results.
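A minimal sketch of that hybrid pattern, with SQLite holding the structured fields and a plain dict of vectors standing in for Chroma or pgvector (the three-dimensional `toy_embed` function is a toy stand-in for a real embedding model):

```python
import math
import sqlite3

def toy_embed(text: str) -> list:
    """Toy stand-in for an embedding model: counts a few theme words."""
    words = text.lower().split()
    return [
        sum(w in {"dark", "theme", "mode", "ui"} for w in words),
        sum(w in {"api", "results", "query"} for w in words),
        float(len(words)),
    ]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, category TEXT)")
vectors = {}  # id -> embedding, standing in for the vector store

for content, category in [
    ("User prefers dark mode", "preference"),
    ("API returned 42 results for query X", "task_result"),
]:
    cur = conn.execute(
        "INSERT INTO memories (content, category) VALUES (?, ?)", (content, category)
    )
    # Shared primary key links the SQL row to its embedding
    vectors[cur.lastrowid] = toy_embed(content)

# Query both sides: vector search finds the closest meaning,
# then SQL fetches the structured record for that id
query_vec = toy_embed("ui theme preference")
best_id = max(vectors, key=lambda i: cosine(vectors[i], query_vec))
row = conn.execute(
    "SELECT content, category FROM memories WHERE id = ?", (best_id,)
).fetchone()
print(row)  # ('User prefers dark mode', 'preference')
```

The key design point is the shared id: whichever side answers first, the other can be joined on it.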
**How much memory should an agent retain?**
It depends on the use case. Customer support agents might keep the last 30 days. Research agents might keep everything. Implement a retention policy that expires old, low-relevance memories to keep storage costs manageable and retrieval quality high.
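A retention policy can be as simple as a scheduled `DELETE`. A sketch against the SQLite schema from Approach 3, expiring a low-relevance category past a cutoff (the 30-day window and `fact` category are illustrative choices):

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, "
    "category TEXT, timestamp TEXT DEFAULT CURRENT_TIMESTAMP)"
)
# One stale entry (60 days old) and one fresh entry
old = (datetime.utcnow() - timedelta(days=60)).isoformat(sep=" ")
conn.execute(
    "INSERT INTO memories (content, category, timestamp) VALUES (?, ?, ?)",
    ("Stale observation", "fact", old),
)
conn.execute(
    "INSERT INTO memories (content, category) VALUES (?, ?)",
    ("User prefers dark mode", "preference"),
)

def expire(conn, category: str, max_age_days: int) -> int:
    """Delete memories in `category` older than the cutoff; returns rows removed."""
    cutoff = (datetime.utcnow() - timedelta(days=max_age_days)).isoformat(sep=" ")
    cur = conn.execute(
        "DELETE FROM memories WHERE category = ? AND timestamp < ?",
        (category, cutoff),
    )
    return cur.rowcount

removed = expire(conn, "fact", max_age_days=30)
print(removed)  # 1
```

The string comparison works because both `CURRENT_TIMESTAMP` and the ISO cutoff sort lexicographically in `YYYY-MM-DD HH:MM:SS` order; a production version would likely score relevance rather than expire by category alone.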
CallSphere Team
Expert insights on AI voice agents and customer communication automation.