Episodic Memory in AI Agents: Learning from Past Interactions and Outcomes
Discover how to implement episodic memory for AI agents — storing complete interaction episodes, retrieving similar past experiences, and creating feedback loops that improve agent performance over time.
What Is Episodic Memory?
While semantic memory stores facts and knowledge, episodic memory records complete experiences — the full story of what happened, what actions were taken, and what the outcome was. In human cognition, episodic memory is what lets you recall not just that Paris is the capital of France, but the specific trip you took there and what you learned along the way.
For AI agents, episodic memory means storing entire interaction episodes — including the task, the sequence of actions, tool calls, intermediate results, and the final outcome. When the agent encounters a similar situation later, it can retrieve relevant past episodes and use them to make better decisions.
Defining an Episode
An episode captures the full arc of a task execution: the initial request, every step the agent took, and the result.
```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    PARTIAL = "partial"


@dataclass
class ActionStep:
    action: str  # "tool_call", "llm_response", "user_input"
    detail: str  # what specifically happened
    timestamp: datetime = field(default_factory=datetime.utcnow)
    result: Optional[str] = None
    error: Optional[str] = None


@dataclass
class Episode:
    task_description: str
    steps: List[ActionStep] = field(default_factory=list)
    outcome: Outcome = Outcome.PARTIAL
    outcome_detail: str = ""
    lessons_learned: str = ""
    embedding: Optional[List[float]] = None
    created_at: datetime = field(default_factory=datetime.utcnow)
    tags: List[str] = field(default_factory=list)

    def add_step(
        self,
        action: str,
        detail: str,
        result: Optional[str] = None,
        error: Optional[str] = None,
    ):
        self.steps.append(ActionStep(
            action=action, detail=detail, result=result, error=error
        ))

    def complete(self, outcome: Outcome, detail: str = "", lessons: str = ""):
        self.outcome = outcome
        self.outcome_detail = detail
        self.lessons_learned = lessons
```
Building an Episodic Memory Store
The store manages episodes with both structured queries (by outcome, tags) and semantic search (by task similarity).
```python
import json
from pathlib import Path

# Assumes embed_text(text) -> List[float] and
# cosine_similarity(a, b) -> float are available.


class EpisodicMemoryStore:
    def __init__(self, storage_path: str = "episodes.json"):
        self.storage_path = Path(storage_path)
        self.episodes: List[Episode] = []
        self._load()

    def record(self, episode: Episode):
        """Store a completed episode with its embedding."""
        # Embed task description + outcome + lessons for retrieval
        summary = (
            f"Task: {episode.task_description}. "
            f"Outcome: {episode.outcome.value}. "
            f"Lessons: {episode.lessons_learned}"
        )
        episode.embedding = embed_text(summary)
        self.episodes.append(episode)
        self._save()

    def find_similar(
        self,
        task_description: str,
        top_k: int = 3,
        outcome_filter: Optional[Outcome] = None,
    ) -> List[Episode]:
        """Find past episodes similar to a given task."""
        query_embedding = embed_text(task_description)
        scored = []
        for ep in self.episodes:
            if outcome_filter and ep.outcome != outcome_filter:
                continue
            if ep.embedding is None:
                continue
            sim = cosine_similarity(query_embedding, ep.embedding)
            scored.append((ep, sim))
        scored.sort(key=lambda x: x[1], reverse=True)
        return [ep for ep, _ in scored[:top_k]]

    def get_success_patterns(self, task_description: str) -> List[Episode]:
        """Retrieve only successful episodes similar to the current task."""
        return self.find_similar(
            task_description, top_k=3, outcome_filter=Outcome.SUCCESS
        )

    def get_failure_warnings(self, task_description: str) -> List[Episode]:
        """Retrieve failed episodes to learn what to avoid."""
        return self.find_similar(
            task_description, top_k=2, outcome_filter=Outcome.FAILURE
        )

    def _save(self):
        data = []
        for ep in self.episodes:
            data.append({
                "task_description": ep.task_description,
                "steps": [
                    {"action": s.action, "detail": s.detail,
                     "result": s.result, "error": s.error}
                    for s in ep.steps
                ],
                "outcome": ep.outcome.value,
                "outcome_detail": ep.outcome_detail,
                "lessons_learned": ep.lessons_learned,
                "tags": ep.tags,
                "created_at": ep.created_at.isoformat(),
            })
        self.storage_path.write_text(json.dumps(data, indent=2))

    def _load(self):
        if not self.storage_path.exists():
            return
        data = json.loads(self.storage_path.read_text())
        for item in data:
            ep = Episode(
                task_description=item["task_description"],
                outcome=Outcome(item["outcome"]),
                outcome_detail=item.get("outcome_detail", ""),
                lessons_learned=item.get("lessons_learned", ""),
                tags=item.get("tags", []),
            )
            if "created_at" in item:
                ep.created_at = datetime.fromisoformat(item["created_at"])
            for step_data in item.get("steps", []):
                ep.add_step(**step_data)
            # Re-embed on load, using the same summary format as record()
            summary = (
                f"Task: {ep.task_description}. "
                f"Outcome: {ep.outcome.value}. "
                f"Lessons: {ep.lessons_learned}"
            )
            ep.embedding = embed_text(summary)
            self.episodes.append(ep)
```
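The store above calls two helpers that are not defined in this article: `embed_text` and `cosine_similarity`. In production, `embed_text` would call a real embedding model (a hosted API or a local sentence-transformer). As a minimal, self-contained stand-in, here is a sketch using a hashed bag-of-words vector — the `dim` parameter and the hashing scheme are illustrative assumptions, not part of the original design:

```python
import hashlib
import math
from typing import List


def embed_text(text: str, dim: int = 256) -> List[float]:
    # Toy hashed bag-of-words embedding: each token increments one
    # bucket; the vector is L2-normalized. A real system would call
    # an embedding model here instead.
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    # Standard cosine similarity with a zero-vector guard.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)
```

Swapping these for a real embedding model changes nothing else in the store, since `find_similar` only depends on the two function signatures.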
Integrating Episodic Memory into Agent Loops
The real power comes from using past episodes to inform current decisions. Before the agent acts, it retrieves similar past experiences and includes them in its prompt.
```python
async def run_agent_with_memory(
    task: str,
    agent_llm,
    episodic_store: EpisodicMemoryStore,
) -> Episode:
    current_episode = Episode(task_description=task)

    # Retrieve relevant past experiences
    successes = episodic_store.get_success_patterns(task)
    failures = episodic_store.get_failure_warnings(task)

    context_parts = []
    if successes:
        context_parts.append("Relevant successful approaches from past tasks:")
        for ep in successes:
            context_parts.append(
                f"- Task: {ep.task_description} -> {ep.lessons_learned}"
            )
    if failures:
        context_parts.append("Past failures to avoid:")
        for ep in failures:
            context_parts.append(
                f"- Task: {ep.task_description} -> {ep.lessons_learned}"
            )
    memory_context = "\n".join(context_parts)

    # Build prompt with episodic context
    prompt = f"""Task: {task}

{memory_context}

Based on the task and any relevant past experience above, plan and execute."""

    # Execute the agent loop (simplified: a real loop would run tools,
    # inspect results, and judge the outcome instead of assuming SUCCESS)
    result = await agent_llm.run(prompt)
    current_episode.add_step("llm_response", result.output)
    current_episode.complete(
        outcome=Outcome.SUCCESS,
        detail=result.output[:200],
        lessons=f"Approach that worked for '{task[:50]}': {result.output[:100]}",
    )

    # Store the episode for future reference
    episodic_store.record(current_episode)
    return current_episode
```
The Learning Loop
Episodic memory creates a natural learning loop: the agent tries something, records the outcome, and uses that record to improve future attempts. Over dozens or hundreds of episodes, the agent accumulates practical wisdom about what works and what fails in its specific domain.
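The loop can be seen in miniature without any of the machinery above. This self-contained sketch replaces the `Episode` class and embedding retrieval with plain dicts and word-overlap scoring — every name here is illustrative, not from the original code:

```python
# Hypothetical stand-ins for Episode / EpisodicMemoryStore above.
episodes = []


def record(task, outcome, lesson):
    episodes.append({"task": task, "outcome": outcome, "lesson": lesson})


def lessons_for(task):
    # Naive "retrieval": score past episodes by shared words with the
    # new task, most overlap first (a crude proxy for embeddings).
    words = set(task.lower().split())
    scored = [
        (len(words & set(e["task"].lower().split())), e) for e in episodes
    ]
    return [e["lesson"] for score, e in sorted(scored, key=lambda x: -x[0])
            if score > 0]


# First attempts: one failure, one success, each with an explicit lesson
record("export report as CSV", "failure",
       "stream rows; loading all rows in memory crashed")
record("export report as PDF", "success",
       "render in 500-row pages")

# A later, similar task retrieves both lessons, closest match first
print(lessons_for("export invoices as CSV"))
```

The CSV failure ranks above the PDF success because it shares more words with the new task — which is exactly the behavior the embedding-based `find_similar` generalizes.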
Key principles for effective episodic learning:
- Record both successes and failures — failures are often more informative
- Extract explicit lessons — do not just store what happened; store what was learned
- Keep episodes searchable — embed the task description plus outcome for accurate retrieval
- Prune old episodes — remove outdated episodes when the environment or tools change
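The pruning principle can be sketched as a periodic sweep. This version operates on dicts mirroring the `Episode` fields (`created_at`, `tags`); the 90-day default and the idea of a `retired_tags` set are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone


def prune_episodes(episodes, max_age_days=90, retired_tags=frozenset()):
    """Keep episodes that are recent and mention no retired tools/tags."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        ep for ep in episodes
        if ep["created_at"] >= cutoff
        and not (retired_tags & set(ep["tags"]))
    ]
```

Running this on a schedule (or on every `_save`) keeps retrieval from surfacing advice about tools the agent no longer has.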
FAQ
How is episodic memory different from few-shot prompting?
Few-shot prompting uses fixed, hand-crafted examples. Episodic memory dynamically retrieves the most relevant past experiences for each new task. The examples evolve as the agent gains more experience, making it adaptive rather than static.
How many past episodes should I inject into the prompt?
Start with 2-3 successful examples and 1-2 failure warnings. More than 5 total episodes risks consuming too much of the context window. Prioritize the most similar and most recent episodes.
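One way to enforce that limit mechanically is to rank episode snippets by similarity and greedily pack them under a budget. This sketch uses a character count as a rough proxy for a token budget (`max_chars` and the greedy skip-if-too-big policy are illustrative choices):

```python
def fit_to_budget(snippets, max_chars=2000):
    """Greedily keep the highest-ranked snippets that fit the budget.

    `snippets` is assumed to be pre-sorted, most relevant first.
    """
    kept, used = [], 0
    for s in snippets:
        if used + len(s) > max_chars:
            continue  # skip oversized entries; a smaller one may still fit
        kept.append(s)
        used += len(s)
    return kept
```

A real implementation would count tokens with the model's tokenizer, but the shape of the policy is the same.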
Can episodic memory replace fine-tuning?
For many use cases, yes. Episodic memory achieves similar personalization without the cost and complexity of fine-tuning. However, fine-tuning changes the model's weights permanently, while episodic memory requires retrieval at runtime. For very frequent patterns, fine-tuning may be more efficient.
CallSphere Team
Expert insights on AI voice agents and customer communication automation.