Building a Language-Switching Agent: Dynamic Language Detection and Response
Build an AI agent that automatically detects language changes mid-conversation, switches response language dynamically, and persists user language preferences across sessions.
The Challenge of Mid-Conversation Language Switching
Users in multilingual environments often switch languages within a single conversation. A bilingual user might start in English, paste a document in Spanish, then ask a follow-up question in English. An agent that locks into one language at conversation start will keep answering in that language after the user has moved to another. A truly global agent must track language on a per-message basis and respond in whatever language the user is currently using.
Per-Message Language Detection
Rather than detecting language once, run detection on every incoming message and maintain a rolling language context.
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

from langdetect import DetectorFactory, detect_langs
from langdetect.lang_detect_exception import LangDetectException

# langdetect is non-deterministic by default; a fixed seed makes results repeatable
DetectorFactory.seed = 0


@dataclass
class MessageLanguage:
    message_index: int
    text_snippet: str
    detected_lang: str
    confidence: float


@dataclass
class ConversationLanguageTracker:
    history: List[MessageLanguage] = field(default_factory=list)
    user_explicit_pref: Optional[str] = None
    _switch_count: int = 0

    def track_message(self, index: int, text: str) -> str:
        """Detect language of a new message and return active language."""
        if len(text.strip()) < 10:
            # Short messages are unreliable for detection
            return self.current_language

        try:
            # Candidates come back sorted by probability, best first
            candidates = detect_langs(text)
        except LangDetectException:
            # Raised when the text has no usable features (digits, emoji only)
            return self.current_language
        if not candidates:
            return self.current_language

        best = candidates[0]
        entry = MessageLanguage(
            message_index=index,
            text_snippet=text[:50],
            detected_lang=best.lang,
            confidence=best.prob,  # detector's probability, not a fixed constant
        )

        if self.history and best.lang != self.history[-1].detected_lang:
            self._switch_count += 1

        self.history.append(entry)
        return self.current_language

    @property
    def current_language(self) -> str:
        # An explicit user preference always wins over detection
        if self.user_explicit_pref:
            return self.user_explicit_pref
        if not self.history:
            return "en"
        return self.history[-1].detected_lang

    @property
    def dominant_language(self) -> str:
        """Most frequently used language across the conversation."""
        if not self.history:
            return "en"
        counts = Counter(m.detected_lang for m in self.history)
        return counts.most_common(1)[0][0]

    @property
    def is_multilingual_session(self) -> bool:
        return self._switch_count >= 2
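The tracker above changes the active language on the first message detected in a different language. One possible refinement, sketched here under assumptions that are not in the original design, is to require a minimum confidence and a short streak of agreeing messages before switching. The names `SwitchGate`, `min_confidence`, and `required_streak` are hypothetical, and the default thresholds are placeholders rather than tuned values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchGate:
    """Debounce language switches (hypothetical helper, not part of the tracker)."""

    active: str = "en"
    min_confidence: float = 0.8  # assumed threshold; tune on real traffic
    required_streak: int = 2     # consecutive agreeing messages needed to switch
    _candidate: Optional[str] = None
    _streak: int = 0

    def observe(self, lang: str, confidence: float) -> str:
        """Feed one detection result and return the language to respond in."""
        if confidence < self.min_confidence or lang == self.active:
            # Low-confidence or same-language messages cancel any pending switch
            self._candidate, self._streak = None, 0
            return self.active

        if lang == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = lang, 1

        if self._streak >= self.required_streak:
            self.active = lang
            self._candidate, self._streak = None, 0
        return self.active


gate = SwitchGate()
first = gate.observe("es", 0.95)   # still the old language, one vote so far
second = gate.observe("es", 0.95)  # second agreeing message triggers the switch
```

The trade-off is a one-message delay before the agent follows the user. Setting `required_streak=1` reproduces the immediate switching described above while still filtering low-confidence detections.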
Explicit Language Commands
Users should be able to override detection by explicitly requesting a language. Parse commands like "switch to French" or "respond in Japanese."
import re
from typing import Optional

LANGUAGE_MAP = {
    "english": "en", "spanish": "es", "french": "fr",
    "german": "de", "japanese": "ja", "chinese": "zh",
    "arabic": "ar", "portuguese": "pt", "korean": "ko",
    "hindi": "hi", "italian": "it", "dutch": "nl",
    "russian": "ru", "turkish": "tr", "thai": "th",
}

# \b keeps verbs and prepositions from matching inside longer words
SWITCH_PATTERNS = [
    r"\b(?:switch|change|respond|reply|speak|answer)\s+(?:to|in)\s+(\w+)",
    r"\b(?:use|set)\s+(?:the\s+)?(?:language\s+(?:to\s+)?)?(\w+)",
    r"\b(?:en|in)\s+(\w+)\s+(?:please|por favor|s'il vous plait|bitte)",
]


def parse_language_command(text: str) -> Optional[str]:
    """Extract explicit language switch requests from user input."""
    lower = text.lower().strip()

    for pattern in SWITCH_PATTERNS:
        # Check every match: a miss such as "answer to my" must not
        # stop the search before a later pattern finds a real language
        for match in re.finditer(pattern, lower):
            code = LANGUAGE_MAP.get(match.group(1))
            if code:
                return code

    return None
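The third pattern accepts phrases such as "en ... por favor", but `LANGUAGE_MAP` only lists English names, so a request like "en español por favor" resolves to nothing. A minimal sketch of one way to close that gap follows, assuming a separate alias table of native-language names. `NATIVE_NAMES`, `fold_accents`, and `parse_native_command` are hypothetical names introduced for illustration, and the alias list is deliberately small.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical alias table: accent-folded native names mapped to language codes
NATIVE_NAMES = {
    "espanol": "es", "castellano": "es",
    "francais": "fr",
    "deutsch": "de",
    "portugues": "pt",
    "italiano": "it",
    "nederlands": "nl",
}

# Same shape as the third pattern above, with two extra prepositions (assumed)
NATIVE_COMMAND = re.compile(
    r"\b(?:en|in|auf|em)\s+(\w+)\s+(?:please|por favor|s'il vous plait|bitte)"
)


def fold_accents(text: str) -> str:
    """Lowercase and drop combining accents so 'Español' becomes 'espanol'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def parse_native_command(text: str) -> Optional[str]:
    """Return a language code for requests that name the language natively."""
    folded = fold_accents(text)
    for match in NATIVE_COMMAND.finditer(folded):
        code = NATIVE_NAMES.get(match.group(1))
        if code:
            return code
    return None
```

Folding accents before matching means users who type "espanol" without the tilde are handled by the same table entry. Calling this helper only when `parse_language_command` returns `None` keeps the English path unchanged.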
Session-Aware Language Persistence
Store the user's language preference so it persists across sessions using a simple database-backed store.
from datetime import datetime, timezone
from typing import Dict, Optional


class LanguagePreferenceStore:
    """Persist user language preferences across sessions."""

    def __init__(self, db_connection):
        # Assumes an asyncpg-style connection ($1 placeholders, fetchrow/fetch/execute)
        self.db = db_connection

    async def get_preference(self, user_id: str) -> Optional[str]:
        row = await self.db.fetchrow(
            "SELECT language_code FROM user_language_prefs WHERE user_id = $1",
            user_id,
        )
        return row["language_code"] if row else None

    async def set_preference(self, user_id: str, lang_code: str) -> None:
        # datetime.utcnow() is deprecated since Python 3.12; a timezone-aware
        # value assumes updated_at is a TIMESTAMPTZ column
        await self.db.execute(
            """INSERT INTO user_language_prefs (user_id, language_code, updated_at)
               VALUES ($1, $2, $3)
               ON CONFLICT (user_id) DO UPDATE
               SET language_code = $2, updated_at = $3""",
            user_id, lang_code, datetime.now(timezone.utc),
        )

    async def get_language_stats(self, user_id: str) -> Dict[str, int]:
        rows = await self.db.fetch(
            """SELECT detected_lang, COUNT(*) as cnt
               FROM message_languages WHERE user_id = $1
               GROUP BY detected_lang ORDER BY cnt DESC""",
            user_id,
        )
        return {row["detected_lang"]: row["cnt"] for row in rows}
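The store above refers to a `user_language_prefs` table without defining it. The sketch below shows one plausible schema and the same upsert logic, using SQLite from the standard library as a stand-in so it runs without a database server. The column types are assumptions, and the `?` placeholders and `excluded` references are SQLite syntax rather than what the store above uses. `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Assumed schema: the column names come from the queries above, the types do not
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_language_prefs (
    user_id       TEXT PRIMARY KEY,
    language_code TEXT NOT NULL,
    updated_at    TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the preferences table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def set_preference(conn: sqlite3.Connection, user_id: str, lang_code: str) -> None:
    """Insert the preference, or overwrite it if the user already has one."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """INSERT INTO user_language_prefs (user_id, language_code, updated_at)
               VALUES (?, ?, ?)
               ON CONFLICT (user_id) DO UPDATE
               SET language_code = excluded.language_code,
                   updated_at = excluded.updated_at""",
            (user_id, lang_code, now),
        )


def get_preference(conn: sqlite3.Connection, user_id: str) -> Optional[str]:
    row = conn.execute(
        "SELECT language_code FROM user_language_prefs WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0] if row else None
```

Making `user_id` the primary key is what lets the upsert keep exactly one row per user, so a later preference replaces the earlier one instead of accumulating history.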
Integrating Into the Agent Loop
Wire detection, command parsing, and persistence into a single middleware that runs before each agent invocation.
class LanguageSwitchingMiddleware:
    def __init__(self, tracker: ConversationLanguageTracker, store: LanguagePreferenceStore):
        self.tracker = tracker
        self.store = store

    async def process_incoming(self, user_id: str, message: str, msg_index: int) -> dict:
        # Remember the language in effect before this message
        previous = self.tracker.current_language

        # Check for explicit switch commands first
        explicit = parse_language_command(message)
        if explicit:
            self.tracker.user_explicit_pref = explicit
            await self.store.set_preference(user_id, explicit)
            return {"language": explicit, "switched": True, "explicit": True}

        # Auto-detect, and report a switch only when the active language changed
        detected = self.tracker.track_message(msg_index, message)
        return {"language": detected, "switched": detected != previous, "explicit": False}
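The dictionary returned by `process_incoming` still has to reach the model. One way to do that, sketched here as an assumption rather than the article's method, is to turn it into a short instruction appended to the system prompt. `LANGUAGE_NAMES` and `build_language_instruction` are hypothetical names, and the table covers only a few codes.

```python
from typing import Dict

# Hypothetical display names for a few codes; extend to match LANGUAGE_MAP
LANGUAGE_NAMES: Dict[str, str] = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "ja": "Japanese",
}


def build_language_instruction(result: dict) -> str:
    """Turn a middleware result into a system-prompt instruction."""
    code = result["language"]
    # Fall back to the raw code when no display name is known
    name = LANGUAGE_NAMES.get(code, code)
    instruction = f"Respond in {name}."
    if result.get("explicit"):
        # Mirrors the tracker, where an explicit preference overrides detection
        instruction += (
            " The user asked for this language explicitly;"
            " keep using it even if later messages are in another language."
        )
    return instruction


example = build_language_instruction(
    {"language": "fr", "switched": True, "explicit": True}
)
```

Rebuilding the instruction on every turn keeps the prompt in step with the tracker, instead of relying on the model to notice the switch on its own.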
Handling Edge Cases
Short messages like "ok", "yes", or emoji are ambiguous across many languages. The tracker above handles this by skipping detection for messages shorter than 10 characters and keeping the current language. Code snippets are a second source of false triggers: their keywords and identifiers are language-neutral but often look like English, so strip code blocks before running detection.
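import re

# Markdown allows both backtick and tilde fences; built with * 3 to avoid
# writing a literal fence inside this snippet
FENCES = ("`" * 3, "~" * 3)


def strip_code_blocks(text: str) -> str:
    """Remove code blocks before language detection."""
    cleaned = text
    for fence in FENCES:
        delimiter = re.escape(fence)
        # Non-greedy match from an opening fence to the next closing fence
        cleaned = re.sub(rf"{delimiter}[\s\S]*?{delimiter}", "", cleaned)
    # Remove inline code spans
    cleaned = re.sub(r"`[^`]+`", "", cleaned)
    return cleaned.strip()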
FAQ
How do I prevent false language switches from pasted content?
Differentiate between the user's own text and pasted content using UI hints (paste events in the frontend) or heuristics (long blocks of text with different formatting). Only update the active response language based on the user's own typed messages, not pasted foreign-language documents.
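A rough version of that heuristic might look like the sketch below. The function names and both thresholds are assumptions chosen for illustration, and `paste_event` stands for whatever signal the frontend can provide.

```python
def looks_pasted(
    text: str,
    paste_event: bool = False,
    max_typed_chars: int = 600,
    max_typed_lines: int = 8,
) -> bool:
    """Guess whether a message is mostly pasted content (assumed thresholds)."""
    if paste_event:
        # A frontend paste signal, when available, beats any guess
        return True
    line_count = text.count("\n") + 1
    return len(text) > max_typed_chars or line_count > max_typed_lines


def language_for_reply(
    current: str, detected: str, text: str, paste_event: bool = False
) -> str:
    """Keep the current language when the message looks pasted."""
    return current if looks_pasted(text, paste_event) else detected
```

Length alone will misclassify users who type long messages, so the paste event should be preferred wherever the client can report it.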
Should the agent acknowledge a language switch explicitly?
Yes, a brief acknowledgment like "Switching to French" (in French) confirms the switch and prevents confusion. Keep the acknowledgment to one short sentence and then continue with the actual response.
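As a minimal sketch, the acknowledgment can be a lookup keyed by language code and prepended only when the middleware reports a switch. `ACKNOWLEDGMENTS` and `with_acknowledgment` are hypothetical names, and the translations are illustrative and worth having a native speaker review.

```python
# Illustrative one-sentence acknowledgments, written in the target language
ACKNOWLEDGMENTS = {
    "en": "Switching to English.",
    "es": "Cambio a español.",
    "fr": "Je passe au français.",
    "de": "Ich wechsle zu Deutsch.",
}


def with_acknowledgment(response: str, result: dict) -> str:
    """Prefix the response with a short acknowledgment after a switch."""
    if not result.get("switched"):
        return response
    ack = ACKNOWLEDGMENTS.get(result["language"])
    # Languages without an entry get the response unchanged
    return f"{ack} {response}" if ack else response
```

Here `result` is assumed to have the same `language` and `switched` keys as the dictionary returned by `process_incoming`.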
What happens when two languages are mixed in a single message (code-switching)?
Detect the dominant language of the message and respond in that language. If the user consistently mixes two languages (common in bilingual communities), consider responding in the user's preferred base language while naturally incorporating terms from the second language.
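One coarse way to find the dominant language of a mixed message is to detect each sentence separately and weight the votes by sentence length. The sketch below assumes this approach. `dominant_language_of` is a hypothetical name, and `detect_fn` is any callable that maps text to a language code or `None`, so the detector can be swapped or stubbed in tests.

```python
import re
from collections import Counter
from typing import Callable, Optional

# A run of non-terminator characters followed by optional sentence punctuation
SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def dominant_language_of(
    text: str, detect_fn: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Vote per sentence, weighting each vote by sentence length in characters."""
    votes: Counter = Counter()
    for segment in SENTENCE.findall(text):
        sentence = segment.strip()
        if not sentence:
            continue
        lang = detect_fn(sentence)
        if lang:
            votes[lang] += len(sentence)
    return votes.most_common(1)[0][0] if votes else None
```

Sentence-level voting misses switches inside a single sentence, which is where much code-switching happens, so this is a starting point rather than a full treatment.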
CallSphere Team