AI Flashcard Agent: Automatic Card Generation and Intelligent Review Scheduling
Build an AI agent that extracts key concepts from study material, generates effective flashcards, and schedules reviews using the SM-2 algorithm with performance analytics.
Why AI-Generated Flashcards Beat Manual Ones
Creating effective flashcards is a skill that most students never develop. Common mistakes include cards that are too broad ("Explain photosynthesis"), cards that test recognition instead of recall ("Photosynthesis converts ___ into ___"), and cards that lack meaningful connections to other concepts. An AI flashcard agent solves these problems by applying evidence-based card creation principles automatically and scheduling reviews with the SM-2 algorithm for optimal long-term retention.
Flashcard Data Model
A good flashcard system needs more than front and back text. Each card should carry metadata for scheduling, difficulty estimation, and performance tracking:
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional
from enum import Enum
import math


class CardType(str, Enum):
    BASIC = "basic"            # Front -> Back
    CLOZE = "cloze"            # Fill in the blank
    REVERSED = "reversed"      # Tests both directions
    IMAGE_OCCLUSION = "image"  # Hide part of diagram


@dataclass
class Flashcard:
    card_id: str
    front: str
    back: str
    card_type: CardType = CardType.BASIC
    tags: list[str] = field(default_factory=list)
    source_text: str = ""  # Original text this was generated from
    created_at: datetime = field(default_factory=datetime.now)

    # SM-2 scheduling fields
    interval: float = 0.0   # Days until next review
    repetitions: int = 0    # Consecutive correct answers
    easiness: float = 2.5   # Easiness factor (>= 1.3)
    next_review: Optional[datetime] = None
    last_review: Optional[datetime] = None

    # Performance analytics
    total_reviews: int = 0
    correct_reviews: int = 0
    average_response_time: float = 0.0  # seconds
    lapse_count: int = 0                # Times forgotten after learning

    @property
    def retention_rate(self) -> float:
        if self.total_reviews == 0:
            return 0.0
        return self.correct_reviews / self.total_reviews

    @property
    def is_due(self) -> bool:
        if self.next_review is None:
            return True
        return datetime.now() >= self.next_review
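To see how the scheduling and analytics fields interact, here is a minimal sketch using a trimmed-down stand-in for the card. The class name `MiniCard` and the sample values are illustrative assumptions, not part of the model above. It shows that a never-reviewed card counts as due, and that retention is a plain ratio of correct to total reviews.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MiniCard:
    """Hypothetical, trimmed-down card with only the fields used below."""
    front: str
    back: str
    next_review: Optional[datetime] = None
    total_reviews: int = 0
    correct_reviews: int = 0

    @property
    def retention_rate(self) -> float:
        # Guard against division by zero for cards that were never reviewed.
        if self.total_reviews == 0:
            return 0.0
        return self.correct_reviews / self.total_reviews

    @property
    def is_due(self) -> bool:
        # A card with no scheduled review is treated as due immediately.
        return self.next_review is None or datetime.now() >= self.next_review


new_card = MiniCard(front="Organelle that produces most ATP?", back="Mitochondrion")
scheduled = MiniCard(
    front="Pigment that absorbs light in photosynthesis?",
    back="Chlorophyll",
    next_review=datetime.now() + timedelta(days=3),
    total_reviews=4,
    correct_reviews=3,
)

print(new_card.is_due, new_card.retention_rate)    # True 0.0
print(scheduled.is_due, scheduled.retention_rate)  # False 0.75
```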
The SM-2 Algorithm Implementation
The SM-2 algorithm underlies Anki's classic scheduler (in modified form) and many other spaced repetition systems. Here is a clean implementation:
def sm2_update(card: Flashcard, quality: int) -> Flashcard:
    """Apply the SM-2 algorithm to update card scheduling.

    quality: 0-5 rating
        0 = complete blackout
        1 = incorrect, but recognized answer
        2 = incorrect, but answer seemed easy to recall
        3 = correct with serious difficulty
        4 = correct after hesitation
        5 = perfect response
    """
    # Reject ratings outside the SM-2 scale so they cannot skew the easiness factor
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    now = datetime.now()
    card.total_reviews += 1
    card.last_review = now

    if quality >= 3:
        # Correct response
        card.correct_reviews += 1
        if card.repetitions == 0:
            card.interval = 1.0
        elif card.repetitions == 1:
            card.interval = 6.0
        else:
            card.interval = card.interval * card.easiness
        card.repetitions += 1
    else:
        # Incorrect response: reset
        card.lapse_count += 1
        card.repetitions = 0
        card.interval = 1.0  # Review again tomorrow

    # Update easiness factor (never below 1.3)
    card.easiness = max(
        1.3,
        card.easiness + 0.1
        - (5 - quality) * (0.08 + (5 - quality) * 0.02),
    )

    card.next_review = now + timedelta(days=card.interval)
    return card
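The arithmetic is easier to follow with concrete numbers. The sketch below pulls the same update rule into two standalone helpers (`next_easiness` and `simulate_intervals` are names introduced here for illustration) and prints how each rating moves the easiness factor, and how intervals grow or reset over a short review history.

```python
def next_easiness(easiness: float, quality: int) -> float:
    """SM-2 easiness update with the 1.3 floor used in this article."""
    penalty = (5 - quality) * (0.08 + (5 - quality) * 0.02)
    return max(1.3, easiness + 0.1 - penalty)


def simulate_intervals(qualities: list[int], easiness: float = 2.5) -> list[float]:
    """Return the interval (in days) scheduled after each review."""
    interval, repetitions, history = 0.0, 0, []
    for q in qualities:
        if q >= 3:
            if repetitions == 0:
                interval = 1.0
            elif repetitions == 1:
                interval = 6.0
            else:
                interval = interval * easiness
            repetitions += 1
        else:
            # A lapse resets the streak and schedules the card for tomorrow.
            repetitions, interval = 0, 1.0
        easiness = next_easiness(easiness, q)
        history.append(round(interval, 2))
    return history


# New easiness for each rating, starting from the default 2.5
for q in range(5, -1, -1):
    print(q, round(next_easiness(2.5, q), 2))
# 5 2.6 / 4 2.5 / 3 2.36 / 2 2.18 / 1 1.96 / 0 1.7

print(simulate_intervals([4, 4, 4, 4]))  # [1.0, 6.0, 15.0, 37.5]
print(simulate_intervals([4, 4, 1, 4]))  # [1.0, 6.0, 1.0, 1.0]
```

Note that a rating of 4 leaves the easiness unchanged, so only a 5 ever raises it, while a single lapse sends the card back to a one-day interval.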
Content Extraction and Card Generation
The agent needs to extract testable facts from source material and transform them into well-formed flashcards. The key principle is the "minimum information principle" — each card should test exactly one piece of knowledge:
from agents import Agent, Runner
from pydantic import BaseModel
import json


class GeneratedCard(BaseModel):
    front: str
    back: str
    card_type: str
    tags: list[str]
    rationale: str  # Why this fact is worth a card


class CardBatch(BaseModel):
    cards: list[GeneratedCard]
    source_summary: str
    concepts_covered: list[str]


card_generator = Agent(
    name="Flashcard Generator",
    instructions="""You create effective flashcards from study material.
Follow these evidence-based principles:

CARD CREATION RULES:
1. MINIMUM INFORMATION: Each card tests exactly ONE fact or concept.
   Bad: "What are the three branches of government?"
   Good: Three separate cards, one per branch.
2. NO ORPHAN CARDS: Every card should connect to at least one other
   concept. Add tags to show relationships.
3. CLOZE FOR DEFINITIONS: Use cloze deletion (fill-in-blank) for
   definitions and formulas. Format: "The {{c1::mitochondria}} is the
   powerhouse of the cell."
4. REVERSED FOR VOCABULARY: Create both directions for terminology.
   Term->Definition AND Definition->Term.
5. CONTEXT MATTERS: Include enough context on the front that the
   question is unambiguous without the back.
6. AVOID YES/NO: Never create cards where the answer is just yes or
   no. Rephrase to require recalling the actual fact.

Generate a mix of basic, cloze, and reversed cards. Aim for 3-5 cards
per distinct concept in the source material.""",
    output_type=CardBatch,
)


async def generate_cards_from_text(text: str) -> CardBatch:
    result = await Runner.run(
        card_generator,
        f"Generate flashcards from this study material:\n\n{text}",
    )
    return result.final_output_as(CardBatch)
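Model output does not always follow the rules in the prompt, so it can help to check generated cards before storing them. The following is a minimal sketch of such a check, assuming the `{{c1::...}}` cloze format from the instructions above. The helpers `render_cloze` and `check_card` are hypothetical names introduced for illustration, and they cover only the cloze and yes/no rules.

```python
import re

# Matches cloze deletions such as {{c1::mitochondria}}
CLOZE_PATTERN = re.compile(r"\{\{c(\d+)::(.+?)\}\}")


def render_cloze(text: str) -> tuple[str, str]:
    """Split a cloze string into (front with blanks, back with answers)."""
    front = CLOZE_PATTERN.sub("[...]", text)
    back = CLOZE_PATTERN.sub(lambda m: m.group(2), text)
    return front, back


def check_card(front: str, back: str, card_type: str) -> list[str]:
    """Return a list of rule violations; an empty list means the card passes."""
    problems = []
    if not front.strip() or not back.strip():
        problems.append("front and back must both be non-empty")
    if card_type == "cloze" and not CLOZE_PATTERN.search(front):
        problems.append("cloze card has no {{cN::...}} deletion")
    if back.strip().lower().rstrip(".!") in {"yes", "no"}:
        problems.append("answer is yes/no; rephrase to require recall")
    return problems


front, back = render_cloze("The {{c1::mitochondria}} is the powerhouse of the cell.")
print(front)  # The [...] is the powerhouse of the cell.
print(back)   # The mitochondria is the powerhouse of the cell.

print(check_card("Is ATP produced in mitochondria?", "Yes.", "basic"))
# ['answer is yes/no; rephrase to require recall']
```

Cards that fail the check could be dropped or sent back to the generator with the list of problems attached.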
Review Session Manager
The review session presents due cards, collects quality ratings, and updates the schedule:
from agents import function_tool

# Assumed in-memory store keyed by card_id; swap in real persistence as needed
card_database: dict[str, Flashcard] = {}


@dataclass
class ReviewSession:
    cards_reviewed: int = 0
    correct: int = 0
    incorrect: int = 0
    total_time: float = 0.0
    cards: list[Flashcard] = field(default_factory=list)

    @property
    def accuracy(self) -> float:
        if self.cards_reviewed == 0:
            return 0.0
        return self.correct / self.cards_reviewed


def get_review_queue(
    all_cards: list[Flashcard], max_reviews: int = 20
) -> list[Flashcard]:
    """Get cards due for review, ordered by priority."""
    due = [c for c in all_cards if c.is_due]
    # Priority: most overdue cards first, then by lapse count;
    # never-reviewed cards (no next_review) go last
    due.sort(key=lambda c: (
        c.next_review is None,
        c.next_review or datetime.min,
        -c.lapse_count,
    ))
    return due[:max_reviews]


@function_tool
def submit_review(
    card_id: str,
    quality: int,
    response_time_seconds: float,
) -> str:
    """Submit a review result and get the next review date."""
    card = card_database.get(card_id)
    if not card:
        return json.dumps({"error": "card not found"})

    card = sm2_update(card, quality)

    # Update running average response time
    n = card.total_reviews
    card.average_response_time = (
        (card.average_response_time * (n - 1) + response_time_seconds) / n
    )

    return json.dumps({
        "card_id": card_id,
        "next_review": card.next_review.isoformat(),
        "interval_days": round(card.interval, 1),
        "easiness": round(card.easiness, 2),
        "retention_rate": f"{card.retention_rate:.0%}",
    })
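One way to realize the priority described above (most overdue first, ties broken by lapse count, never-reviewed cards last) is a tuple sort key. The sketch below uses a hypothetical stand-in class, `QueueCard`, and takes the current time as a parameter so the ordering is easy to verify with fixed dates.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class QueueCard:
    """Hypothetical stand-in carrying only the fields the queue needs."""
    card_id: str
    next_review: Optional[datetime] = None
    lapse_count: int = 0


def order_due_cards(cards: list[QueueCard], now: datetime) -> list[QueueCard]:
    """Most overdue first, lapses break ties, never-reviewed cards last."""
    due = [c for c in cards if c.next_review is None or c.next_review <= now]
    return sorted(
        due,
        key=lambda c: (
            c.next_review is None,          # False sorts before True
            c.next_review or datetime.min,  # earliest due date first
            -c.lapse_count,                 # more lapses first on ties
        ),
    )


now = datetime(2025, 1, 10, 9, 0)
deck = [
    QueueCard("new"),
    QueueCard("due-yesterday", now - timedelta(days=1)),
    QueueCard("due-last-week", now - timedelta(days=7), lapse_count=2),
    QueueCard("not-due", now + timedelta(days=2)),
]
print([c.card_id for c in order_due_cards(deck, now)])
# ['due-last-week', 'due-yesterday', 'new']
```

Whether new cards belong before or after overdue reviews is a design choice; placing them last keeps existing memories from decaying while new material waits.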
Performance Analytics
Track learning trends to give the student insight into their study effectiveness:
def compute_deck_analytics(cards: list[Flashcard]) -> dict:
    """Compute analytics across an entire card deck."""
    if not cards:
        return {"total_cards": 0}

    now = datetime.now()
    # New = never reviewed; a lapsed card (repetitions reset to 0) stays "young"
    new = [c for c in cards if c.total_reviews == 0]
    reviewed = [c for c in cards if c.total_reviews > 0]
    mature = [c for c in reviewed if c.interval >= 21]
    young = [c for c in reviewed if c.interval < 21]
    leeches = [c for c in cards if c.lapse_count >= 5]

    # Average retention over reviewed cards only, so unseen cards do not count as 0%
    avg_retention = (
        sum(c.retention_rate for c in reviewed) / len(reviewed)
        if reviewed else 0.0
    )
    avg_easiness = sum(c.easiness for c in cards) / len(cards)

    return {
        "total_cards": len(cards),
        "mature_cards": len(mature),
        "young_cards": len(young),
        "new_cards": len(new),
        "leech_cards": len(leeches),
        "average_retention": f"{avg_retention:.0%}",
        "average_easiness": round(avg_easiness, 2),
        "forecast_reviews_7d": sum(
            1 for c in cards
            if c.next_review
            and c.next_review <= now + timedelta(days=7)
        ),
        "leech_topics": list(set(
            tag for c in leeches for tag in c.tags
        )),
    }
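A single seven-day total hides how the workload is spread across the week. As a possible extension, the sketch below buckets scheduled reviews by calendar day. The function `forecast_by_day` is a hypothetical helper that takes plain `next_review` timestamps, and folding overdue reviews into today is an assumption about how the student would study.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def forecast_by_day(
    next_reviews: list[datetime], today: date, days: int = 7
) -> dict[str, int]:
    """Count scheduled reviews per calendar day for the next `days` days.

    Overdue reviews are counted on `today`, since they would be studied first.
    """
    counts: Counter = Counter()
    horizon = today + timedelta(days=days - 1)
    for when in next_reviews:
        day = max(when.date(), today)  # fold overdue cards into today
        if day <= horizon:
            counts[day] += 1
    # Emit every day in the window, including days with zero reviews
    return {
        (today + timedelta(days=i)).isoformat(): counts[today + timedelta(days=i)]
        for i in range(days)
    }


today = date(2025, 1, 10)
schedule = [
    datetime(2025, 1, 8, 9, 0),    # overdue, counted today
    datetime(2025, 1, 10, 15, 0),  # due today
    datetime(2025, 1, 12, 9, 0),   # due in two days
    datetime(2025, 1, 20, 9, 0),   # outside the window
]
print(forecast_by_day(schedule, today, days=3))
# {'2025-01-10': 2, '2025-01-11': 0, '2025-01-12': 1}
```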
FAQ
What is a "leech" card and how should the agent handle them?
A leech is a card that the student keeps forgetting despite multiple reviews — typically defined as a card with five or more lapses. Leeches indicate that the card is poorly formed, tests something too complex for a single card, or covers material the student lacks prerequisites for. The agent should flag leeches for reformulation, suggesting that the card be broken into smaller sub-cards or rephrased with a different approach like adding a mnemonic.
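As a rough sketch of that flagging step, the function below reports every card at or above the lapse threshold and attaches a suggestion. The names `LeechReport` and `flag_leeches`, the dict-based card shape, and the 15-word cutoff for deciding that an answer is too complex are all assumptions made for illustration.

```python
from dataclasses import dataclass

LONG_ANSWER_WORDS = 15  # assumed cutoff for "too complex for a single card"


@dataclass
class LeechReport:
    card_id: str
    lapse_count: int
    suggestion: str


def flag_leeches(cards: list[dict], threshold: int = 5) -> list[LeechReport]:
    """Return a report for every card at or above the lapse threshold."""
    reports = []
    for card in cards:
        if card["lapse_count"] < threshold:
            continue
        # Long answers suggest the card tests more than one fact.
        if len(card["back"].split()) > LONG_ANSWER_WORDS:
            suggestion = "split into smaller sub-cards"
        else:
            suggestion = "rephrase or add a mnemonic"
        reports.append(
            LeechReport(card["card_id"], card["lapse_count"], suggestion)
        )
    # Worst offenders first
    return sorted(reports, key=lambda r: r.lapse_count, reverse=True)


deck = [
    {"card_id": "c1", "back": "Chlorophyll", "lapse_count": 6},
    {"card_id": "c2", "back": "Mitochondrion", "lapse_count": 1},
    {"card_id": "c3", "back": " ".join(["word"] * 20), "lapse_count": 8},
]
for report in flag_leeches(deck):
    print(report.card_id, report.lapse_count, report.suggestion)
# c3 8 split into smaller sub-cards
# c1 6 rephrase or add a mnemonic
```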
How does the SM-2 algorithm compare to newer alternatives like FSRS?
SM-2 is simpler and well-proven but treats all students the same: the initial intervals and easiness factor adjustments are fixed constants. FSRS (Free Spaced Repetition Scheduler) fits its memory-model parameters to each student's review history, which is reported to cut the number of reviews needed for the same retention, with figures in the 10-30% range. For an AI agent, SM-2 is a solid starting point; you can move to FSRS once you have collected enough review data to fit its parameters.
Should cards be generated from raw text or from pre-processed notes?
Pre-processed notes generally produce better cards because they already represent the student's understanding of what is important. However, the AI card generator can work well with raw text if the extraction instructions are detailed enough. The code above generates cards in a single pass; for raw text, a two-stage approach (first extracting key concepts, then generating cards from them) can help by filtering out non-essential content before card creation.
CallSphere Team