Building a Language Learning Agent: Conversational Practice with AI
Create an AI-powered language learning agent that simulates real conversations, corrects errors in context, tracks vocabulary acquisition, and automatically adapts to the learner's proficiency level.
Why Conversational Practice Is the Missing Piece
Language learners consistently report the same bottleneck: they can read grammar rules and memorize vocabulary, but freeze in actual conversation. The gap between knowing a language and using it comes down to practice with a patient, adaptive conversation partner who corrects mistakes without derailing the flow.
An AI language learning agent fills this role by simulating realistic conversations, providing inline error corrections, tracking which vocabulary and grammar structures the learner has mastered, and gradually increasing complexity as the learner improves.
Learner Profile and Vocabulary Tracker
The agent needs to track what the learner knows so it can introduce new words at the right pace and recycle ones that need reinforcement:
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class CEFR(str, Enum):
    A1 = "A1"  # Beginner
    A2 = "A2"  # Elementary
    B1 = "B1"  # Intermediate
    B2 = "B2"  # Upper Intermediate
    C1 = "C1"  # Advanced
    C2 = "C2"  # Mastery


@dataclass
class VocabEntry:
    word: str
    translation: str
    times_seen: int = 0
    times_used_correctly: int = 0
    last_seen: Optional[datetime] = None
    next_review: Optional[datetime] = None

    @property
    def strength(self) -> float:
        if self.times_seen == 0:
            return 0.0
        base = self.times_used_correctly / self.times_seen
        # Decay if not reviewed recently
        if self.last_seen:
            days_since = (datetime.now() - self.last_seen).days
            decay = max(0, 1 - (days_since / 30))
            return base * decay
        return base


@dataclass
class LearnerProfile:
    learner_id: str
    target_language: str
    native_language: str
    level: CEFR = CEFR.A1
    vocabulary: dict[str, VocabEntry] = field(default_factory=dict)
    grammar_errors: list[dict] = field(default_factory=list)
    conversation_count: int = 0
    total_messages: int = 0

    def get_weak_vocab(self, limit: int = 10) -> list[VocabEntry]:
        """Words that need more practice, sorted by weakness."""
        entries = list(self.vocabulary.values())
        return sorted(entries, key=lambda e: e.strength)[:limit]

    def get_due_reviews(self) -> list[VocabEntry]:
        """Words due for spaced repetition review."""
        now = datetime.now()
        return [
            v for v in self.vocabulary.values()
            if v.next_review and v.next_review <= now
        ]

    def record_vocab_use(self, word: str, correct: bool):
        if word in self.vocabulary:
            entry = self.vocabulary[word]
            entry.times_seen += 1
            entry.last_seen = datetime.now()
            if correct:
                entry.times_used_correctly += 1
                # Extend next review using spaced repetition:
                # the interval doubles with each correct use
                interval = 2 ** entry.times_used_correctly
                entry.next_review = (
                    datetime.now() + timedelta(days=interval)
                )
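To make the decay behavior concrete, here is the strength formula from `VocabEntry` pulled out as a standalone function (a sketch; the helper name and the worked numbers are ours):

```python
from datetime import datetime, timedelta

def vocab_strength(times_seen, times_correct, last_seen, now):
    """Accuracy ratio with a linear 30-day decay, mirroring VocabEntry.strength."""
    if times_seen == 0:
        return 0.0
    base = times_correct / times_seen
    if last_seen is None:
        return base
    days_since = (now - last_seen).days
    return base * max(0, 1 - days_since / 30)

now = datetime.now()
# 8/10 correct, last reviewed 15 days ago: 0.8 accuracy halved by decay -> 0.4
print(vocab_strength(10, 8, now - timedelta(days=15), now))
```

A word reviewed 30 or more days ago decays to zero strength, which pushes it to the front of `get_weak_vocab` and back into the conversation.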
Conversation Simulation Agent
The core agent maintains a conversation in the target language while adapting to the learner's level. The system prompt dynamically adjusts based on proficiency:
from agents import Agent, Runner, function_tool
import json
LEVEL_GUIDELINES = {
    "A1": {
        "vocab": "basic everyday words (100-500 word range)",
        "grammar": "present tense, simple sentences, basic questions",
        "topics": "greetings, family, food, numbers, colors",
        "speed": "short sentences, max 8 words per sentence",
    },
    "A2": {
        "vocab": "common everyday vocabulary (500-1000 word range)",
        "grammar": "past tense, future with 'going to', conjunctions",
        "topics": "daily routines, shopping, travel, weather",
        "speed": "moderate sentences, max 12 words",
    },
    "B1": {
        "vocab": "intermediate vocabulary with some abstract words",
        "grammar": "conditionals, passive voice, relative clauses",
        "topics": "opinions, experiences, plans, current events",
        "speed": "natural sentence length, varied structure",
    },
    "B2": {
        "vocab": "broad vocabulary including idiomatic expressions",
        "grammar": "subjunctive, complex conditionals, reported speech",
        "topics": "abstract topics, debate, nuanced opinions",
        "speed": "natural and varied, including complex sentences",
    },
}
def build_conversation_instructions(
    profile: LearnerProfile, scenario: str
) -> str:
    level = profile.level.value
    # C1/C2 have no entry above, so advanced learners fall back to B2
    guidelines = LEVEL_GUIDELINES.get(level, LEVEL_GUIDELINES["B2"])
    weak_vocab = [v.word for v in profile.get_weak_vocab(5)]
    return f"""You are a friendly conversation partner helping someone
practice {profile.target_language}. Their native language is
{profile.native_language}. Current level: {level}.

SCENARIO: {scenario}

LANGUAGE GUIDELINES:
- Vocabulary range: {guidelines['vocab']}
- Grammar to use: {guidelines['grammar']}
- Suitable topics: {guidelines['topics']}
- Sentence complexity: {guidelines['speed']}

TEACHING APPROACH:
- Respond naturally in {profile.target_language}
- If the learner makes an error, gently correct it inline using
  this format: [correction: wrong -> right (brief explanation)]
- Then continue the conversation naturally
- Try to naturally incorporate these weak vocabulary words that the
  learner needs to practice: {weak_vocab}
- If the learner seems stuck, offer a hint in {profile.native_language}
- Never switch entirely to {profile.native_language} — keep the
  conversation primarily in the target language
- Ask follow-up questions to keep the conversation flowing"""
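Because the prompt pins corrections to a fixed bracket format, the client side can parse them back out of replies for display or logging. A minimal sketch (the regex, helper name, and sample reply are illustrative):

```python
import re

# Matches the inline format the prompt specifies:
#   [correction: wrong -> right (brief explanation)]
CORRECTION_RE = re.compile(r"\[correction:\s*(.+?)\s*->\s*(.+?)\s*\((.+?)\)\]")

def extract_corrections(reply: str) -> list[dict]:
    """Pull structured corrections out of an agent reply."""
    return [
        {
            "incorrect": m.group(1),
            "corrected": m.group(2),
            "explanation": m.group(3),
        }
        for m in CORRECTION_RE.finditer(reply)
    ]

reply = ("Ah, tu es allé au marché ! "
         "[correction: je suis allé -> je suis allée (the past participle "
         "agrees with a feminine subject)] Qu'est-ce que tu as acheté ?")
print(extract_corrections(reply)[0]["corrected"])  # je suis allée
```

The same parse step can strip the bracketed text before display so the learner sees a clean reply with corrections rendered separately.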
Error Correction and Tracking Tools
The agent needs tools to log errors and vocabulary usage for long-term tracking:
# In-memory store for demo purposes; in production this would be a database
learner_profiles: dict[str, LearnerProfile] = {}


@function_tool
def log_grammar_error(
    learner_id: str,
    error_type: str,
    incorrect: str,
    corrected: str,
    explanation: str,
) -> str:
    """Log a grammar or vocabulary error for tracking patterns."""
    error = {
        "type": error_type,
        "incorrect": incorrect,
        "corrected": corrected,
        "explanation": explanation,
        "timestamp": datetime.now().isoformat(),
    }
    # In production this would write to a database
    profile = learner_profiles.get(learner_id)
    if profile:
        profile.grammar_errors.append(error)
    return json.dumps({"status": "logged", "error": error})


@function_tool
def record_vocabulary_usage(
    learner_id: str,
    word: str,
    translation: str,
    used_correctly: bool,
) -> str:
    """Track when the learner uses a vocabulary word."""
    # In production, look up from database
    profile = learner_profiles.get(learner_id)
    if not profile:
        return json.dumps({"error": "learner not found"})
    if word not in profile.vocabulary:
        profile.vocabulary[word] = VocabEntry(
            word=word, translation=translation
        )
    profile.record_vocab_use(word, used_correctly)
    entry = profile.vocabulary[word]
    return json.dumps({
        "word": word,
        "strength": f"{entry.strength:.0%}",
        "times_seen": entry.times_seen,
    })
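The logged errors become useful once aggregated, for example in an end-of-session recap that names the learner's most frequent error types. A small illustrative helper (the function name and sample data are ours):

```python
from collections import Counter

def error_patterns(errors: list[dict], top: int = 3) -> list[tuple[str, int]]:
    """Most common error types among logged errors, most frequent first."""
    return Counter(e["type"] for e in errors).most_common(top)

logged = [
    {"type": "gender agreement"},
    {"type": "verb conjugation"},
    {"type": "gender agreement"},
    {"type": "word order"},
]
print(error_patterns(logged, top=2))  # [('gender agreement', 2), ('verb conjugation', 1)]
```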
Level Adaptation Logic
After each conversation session, assess whether the learner should be promoted or given additional support at their current level:
def assess_level_change(profile: LearnerProfile) -> Optional[CEFR]:
    """Determine if the learner should advance to the next CEFR level."""
    recent_errors = profile.grammar_errors[-20:]
    error_rate = len(recent_errors) / max(profile.total_messages, 1)
    strong_vocab = [
        v for v in profile.vocabulary.values() if v.strength > 0.8
    ]
    vocab_strength = len(strong_vocab) / max(len(profile.vocabulary), 1)
    levels = list(CEFR)
    current_idx = levels.index(profile.level)
    # Advance if error rate is low and vocabulary is strong
    if (error_rate < 0.15 and vocab_strength > 0.7
            and profile.conversation_count >= 10
            and current_idx < len(levels) - 1):
        return levels[current_idx + 1]
    return None
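The "additional support" half of the assessment can mirror the same signals in the other direction: rather than demoting the learner, flag sessions that should focus on review. A sketch with illustrative thresholds (this counterpart function is ours, not part of the code above):

```python
def needs_support(error_rate: float, vocab_strength: float) -> bool:
    """Flag a struggling learner so the next sessions emphasize review
    instead of new material. Thresholds are illustrative, not tuned."""
    return error_rate > 0.4 or vocab_strength < 0.3

print(needs_support(error_rate=0.5, vocab_strength=0.6))  # True
print(needs_support(error_rate=0.1, vocab_strength=0.8))  # False
```

Keeping promotion and support checks separate avoids oscillation: a learner near a threshold gets extra practice at their current level rather than bouncing between levels.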
FAQ
How does the agent avoid overcorrecting and discouraging the learner?
The system prompt instructs the agent to correct only significant errors that impede understanding and to use inline corrections that blend into the natural conversation flow. Minor errors like accent marks or article usage at lower levels are noted in the tracking system but not flagged in conversation. The correction-to-encouragement ratio is calibrated — the agent provides positive reinforcement alongside corrections.
Can this approach handle languages with different scripts like Chinese or Arabic?
Yes. The conversation structure is language-agnostic. For logographic or non-Latin scripts, you would extend the VocabEntry model to include fields for pronunciation (pinyin, romanization), stroke order, or script variants. The level guidelines would also be adjusted since CEFR is designed for European languages — HSK levels or similar frameworks can replace it for Chinese.
How do you ensure conversations feel natural rather than scripted?
The scenario-based approach is key. Instead of generic conversation, each session simulates a specific real-world situation like ordering at a restaurant or asking for directions. The agent is instructed to respond naturally within the scenario context, which creates more authentic conversational patterns than topic-free chat.
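One lightweight way to drive this is a per-level scenario bank that feeds `build_conversation_instructions`. A sketch (the scenario texts and helper are illustrative):

```python
import random

# Example scenario bank keyed by CEFR level (contents are illustrative)
SCENARIOS = {
    "A1": [
        "Ordering a coffee and a pastry at a cafe",
        "Introducing yourself to a new neighbor",
    ],
    "B1": [
        "Returning a faulty purchase and asking for a refund",
        "Planning a weekend trip with a friend",
    ],
}

def pick_scenario(level: str, rng=None) -> str:
    """Choose a session scenario for the learner's level, defaulting to B1."""
    options = SCENARIOS.get(level, SCENARIOS["B1"])
    return (rng or random).choice(options)

print(pick_scenario("A1", random.Random(0)))
```

Rotating scenarios across sessions also keeps vocabulary recycling natural, since different situations pull different word families into play.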
CallSphere Team
Expert insights on AI voice agents and customer communication automation.