# Building a Memory Layer for AI Agents: From Simple Lists to Vector Stores
Explore four approaches to building agent memory — in-memory lists, file-based storage, relational databases, and vector stores — with practical Python implementations and guidance on when to use each.
## Why Agents Need a Memory Layer
Without memory, every agent interaction starts from scratch. The agent cannot recall what it did five minutes ago, what the user prefers, or what tools returned previously. A memory layer gives agents the ability to store, retrieve, and reason over information across turns and sessions.
The right memory architecture depends on your requirements: how much data you store, how you query it, whether memory persists across restarts, and whether you need semantic search. Let us walk through four approaches in increasing order of sophistication.
## Approach 1: In-Memory Lists
The simplest memory is a Python list. It is fast, requires no infrastructure, and works well for prototypes and single-session agents.
```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class MemoryEntry:
    content: str
    category: str  # "fact", "preference", "task_result"
    timestamp: datetime = field(default_factory=datetime.utcnow)
    metadata: dict = field(default_factory=dict)

class InMemoryStore:
    def __init__(self):
        self._entries: List[MemoryEntry] = []

    def store(self, content: str, category: str, **metadata):
        entry = MemoryEntry(content=content, category=category, metadata=metadata)
        self._entries.append(entry)

    def search(self, keyword: str, category: Optional[str] = None) -> List[MemoryEntry]:
        results = []
        for entry in self._entries:
            if keyword.lower() in entry.content.lower():
                if category is None or entry.category == category:
                    results.append(entry)
        return results

    def get_recent(self, n: int = 10) -> List[MemoryEntry]:
        return self._entries[-n:]

# Usage
memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")
memory.store("API returned 42 results for query X", "task_result")
results = memory.search("dark mode")
```
**Limitations:** All data is lost when the process ends. Keyword search is brittle — it misses semantic matches. It does not scale beyond a few thousand entries.
## Approach 2: File-Based Persistence
Adding file persistence ensures memory survives restarts. JSON files work well for small datasets.
```python
import json
from datetime import datetime
from pathlib import Path
from typing import List

class FileMemoryStore:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self._entries: List[dict] = []
        self._load()

    def _load(self):
        if self.path.exists():
            with open(self.path, "r") as f:
                self._entries = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._entries, f, indent=2, default=str)

    def store(self, content: str, category: str, **metadata):
        entry = {
            "content": content,
            "category": category,
            "timestamp": datetime.utcnow().isoformat(),
            "metadata": metadata,
        }
        self._entries.append(entry)
        self._save()

    def search(self, keyword: str) -> List[dict]:
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]
```
File-based storage is ideal for single-user desktop agents or CLI tools. It falls apart with concurrent access or when you need complex queries.
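One caveat with the `_save` method above: if the process dies mid-write, the JSON file can be left truncated. A common hardening is to write to a temporary file and atomically swap it into place. A minimal sketch using only the standard library (`save_json_atomic` is an illustrative helper, not part of the store above):

```python
import json
import os
import tempfile
from pathlib import Path

def save_json_atomic(path: Path, data) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place.

    os.replace is atomic on both POSIX and Windows, so readers never
    observe a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2, default=str)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap
        os.unlink(tmp_name)
        raise
```

Dropping this in for the plain `open(..., "w")` in `_save` costs almost nothing and removes the corruption window.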
## Approach 3: Database-Backed Memory
A relational database like SQLite or PostgreSQL adds query flexibility, concurrency support, and scalability.
```python
import json
import sqlite3
from contextlib import contextmanager
from typing import Optional

class SQLiteMemoryStore:
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        with self._connect() as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    content TEXT NOT NULL,
                    category TEXT NOT NULL,
                    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
                    metadata TEXT DEFAULT '{}'
                )
            """)
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_category ON memories(category)"
            )

    @contextmanager
    def _connect(self):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def store(self, content: str, category: str, **metadata):
        with self._connect() as conn:
            conn.execute(
                "INSERT INTO memories (content, category, metadata) VALUES (?, ?, ?)",
                (content, category, json.dumps(metadata)),
            )

    def search(self, keyword: str, category: Optional[str] = None, limit: int = 20):
        query = "SELECT * FROM memories WHERE content LIKE ?"
        params = [f"%{keyword}%"]
        if category:
            query += " AND category = ?"
            params.append(category)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(limit)
        with self._connect() as conn:
            return conn.execute(query, params).fetchall()
```
## Approach 4: Vector Store Memory
When you need semantic search — finding memories by meaning rather than exact keywords — a vector store is essential. This approach embeds each memory as a high-dimensional vector and retrieves the closest matches.
```python
from typing import Optional

import chromadb

class VectorMemoryStore:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./chroma_data")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"},
        )
        self._counter = self.collection.count()

    def store(self, content: str, category: str, **metadata):
        self._counter += 1
        self.collection.add(
            documents=[content],
            metadatas=[{"category": category, **metadata}],
            ids=[f"mem_{self._counter}"],
        )

    def search(self, query: str, n_results: int = 5, category: Optional[str] = None):
        where_filter = {"category": category} if category else None
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results,
            where=where_filter,
        )
        return results["documents"][0] if results["documents"] else []
```
With a vector store, searching for "user interface theme preference" correctly retrieves a memory stored as "User prefers dark mode" even though none of the words match.
## Comparison Table
| Approach | Persistence | Semantic Search | Concurrency | Setup Cost |
|---|---|---|---|---|
| In-Memory List | None | No | No | Zero |
| File-Based | Restart-safe | No | No | Minimal |
| SQLite/Postgres | Full | No (FTS partial) | Yes | Low-Medium |
| Vector Store | Full | Yes | Yes | Medium |
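The "FTS partial" note for SQLite refers to its built-in full-text search extension. FTS5 gives tokenized keyword matching with relevance ranking — not semantic search, but a meaningful step up from `LIKE`. A sketch (FTS5 ships in most stock SQLite builds, though not all):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: content is tokenized for word-level matching
conn.execute("CREATE VIRTUAL TABLE mem_fts USING fts5(content, category)")
conn.executemany(
    "INSERT INTO mem_fts (content, category) VALUES (?, ?)",
    [
        ("User prefers dark mode", "preference"),
        ("Dark chocolate mentioned in chat", "fact"),
        ("API returned 42 results", "task_result"),
    ],
)
# MATCH supports boolean operators; ORDER BY rank sorts by bm25 relevance
rows = conn.execute(
    "SELECT content FROM mem_fts WHERE mem_fts MATCH ? ORDER BY rank",
    ("dark AND mode",),
).fetchall()
print([r[0] for r in rows])  # ['User prefers dark mode']
```

If your queries are still keyword-shaped, FTS5 may be all you need before reaching for embeddings.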
## FAQ
**When should I use a vector store instead of a database?**
Use a vector store when your agent needs to retrieve memories by semantic similarity — for example, finding relevant past decisions when the user describes a situation in different words. If you only need exact-match or keyword lookups, a relational database is simpler and faster.
**Can I combine a relational database with a vector store?**
Yes, this is a common production pattern. Store structured data (timestamps, categories, metadata) in PostgreSQL and store the embedding vectors in a dedicated vector store like Chroma, Pinecone, or pgvector. Query both and merge results.
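A minimal sketch of that hybrid pattern, with SQLite holding the structured fields and a plain dict of vectors standing in for Chroma or pgvector (the three-dimensional `toy_embed` function is a toy stand-in for a real embedding model):

```python
import math
import sqlite3

def toy_embed(text: str) -> list:
    """Toy stand-in for an embedding model: counts a few theme words."""
    words = text.lower().split()
    return [
        sum(w in {"dark", "theme", "mode", "ui"} for w in words),
        sum(w in {"api", "results", "query"} for w in words),
        float(len(words)),
    ]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, category TEXT)")
vectors = {}  # id -> embedding, standing in for the vector store

for content, category in [
    ("User prefers dark mode", "preference"),
    ("API returned 42 results for query X", "task_result"),
]:
    cur = conn.execute(
        "INSERT INTO memories (content, category) VALUES (?, ?)", (content, category)
    )
    # Shared primary key links the SQL row to its embedding
    vectors[cur.lastrowid] = toy_embed(content)

# Query both sides: vector search finds the closest meaning,
# then SQL fetches the structured record for that id
query_vec = toy_embed("ui theme preference")
best_id = max(vectors, key=lambda i: cosine(vectors[i], query_vec))
row = conn.execute(
    "SELECT content, category FROM memories WHERE id = ?", (best_id,)
).fetchone()
print(row)  # ('User prefers dark mode', 'preference')
```

The key design point is the shared id: whichever side answers first, the other can be joined on it.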
**How much memory should an agent retain?**
It depends on the use case. Customer support agents might keep the last 30 days. Research agents might keep everything. Implement a retention policy that expires old, low-relevance memories to keep storage costs manageable and retrieval quality high.
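A retention policy can be as simple as a scheduled `DELETE`. A sketch against the SQLite schema from Approach 3, expiring a low-relevance category past a cutoff (the 30-day window and `fact` category are illustrative choices):

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, "
    "category TEXT, timestamp TEXT DEFAULT CURRENT_TIMESTAMP)"
)
# One stale entry (60 days old) and one fresh entry
old = (datetime.utcnow() - timedelta(days=60)).isoformat(sep=" ")
conn.execute(
    "INSERT INTO memories (content, category, timestamp) VALUES (?, ?, ?)",
    ("Stale observation", "fact", old),
)
conn.execute(
    "INSERT INTO memories (content, category) VALUES (?, ?)",
    ("User prefers dark mode", "preference"),
)

def expire(conn, category: str, max_age_days: int) -> int:
    """Delete memories in `category` older than the cutoff; returns rows removed."""
    cutoff = (datetime.utcnow() - timedelta(days=max_age_days)).isoformat(sep=" ")
    cur = conn.execute(
        "DELETE FROM memories WHERE category = ? AND timestamp < ?",
        (category, cutoff),
    )
    return cur.rowcount

removed = expire(conn, "fact", max_age_days=30)
print(removed)  # 1
```

The string comparison works because both `CURRENT_TIMESTAMP` and the ISO cutoff sort lexicographically in `YYYY-MM-DD HH:MM:SS` order; a production version would likely score relevance rather than expire by category alone.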
CallSphere Team
Expert insights on AI voice agents and customer communication automation.