AI Agent for Interview Preparation: Mock Interviews with Adaptive Questions
Build an AI mock interview agent with adaptive question selection, response evaluation across behavioral and technical tracks, and detailed performance feedback with improvement coaching.
Why Mock Interviews Need Adaptive AI
Practicing for interviews with a static question list misses the most important dynamic: real interviews adapt. If a candidate gives a strong answer about system design, the interviewer probes deeper. If they struggle with concurrency, the interviewer might simplify or pivot to gauge the candidate's boundaries. An AI mock interview agent replicates this adaptive behavior, selecting follow-up questions based on the candidate's demonstrated strengths and weaknesses.
Interview Configuration Model
Start by defining the interview structure and question bank:
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class InterviewTrack(str, Enum):
    BEHAVIORAL = "behavioral"
    TECHNICAL = "technical"
    SYSTEM_DESIGN = "system_design"
    CODING = "coding"


class DifficultyLevel(str, Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"
    EXPERT = "expert"


@dataclass
class InterviewQuestion:
    question_id: str
    text: str
    track: InterviewTrack
    difficulty: DifficultyLevel
    follow_ups: list[str] = field(default_factory=list)
    evaluation_criteria: list[str] = field(default_factory=list)
    ideal_answer_points: list[str] = field(default_factory=list)
    time_limit_minutes: int = 5
    tags: list[str] = field(default_factory=list)


@dataclass
class InterviewConfig:
    role: str
    company: str
    tracks: list[InterviewTrack]
    total_duration_minutes: int = 45
    difficulty_start: DifficultyLevel = DifficultyLevel.MEDIUM
    questions_per_track: int = 3


BEHAVIORAL_QUESTIONS = [
    InterviewQuestion(
        question_id="beh-001",
        text="Tell me about a time you had to make a difficult "
        "technical decision with incomplete information.",
        track=InterviewTrack.BEHAVIORAL,
        difficulty=DifficultyLevel.MEDIUM,
        follow_ups=[
            "What data would have changed your decision?",
            "How did you communicate the risk to stakeholders?",
            "What would you do differently now?",
        ],
        evaluation_criteria=[
            "Uses STAR format (Situation, Task, Action, Result)",
            "Shows decision-making process, not just outcome",
            "Demonstrates learning and self-awareness",
            "Quantifies impact where possible",
        ],
        ideal_answer_points=[
            "Specific situation with context",
            "Clear description of the tradeoffs considered",
            "Action taken with reasoning",
            "Measurable result or outcome",
            "Reflection on what was learned",
        ],
    ),
]
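To make the configuration concrete, here is a minimal usage sketch. The role and company names are illustrative, and the classes are trimmed stand-ins for the full definitions above so the snippet runs on its own:

```python
from dataclasses import dataclass
from enum import Enum


# Trimmed stand-ins for the classes defined above.
class InterviewTrack(str, Enum):
    BEHAVIORAL = "behavioral"
    TECHNICAL = "technical"


class DifficultyLevel(str, Enum):
    EASY = "easy"
    MEDIUM = "medium"


@dataclass
class InterviewConfig:
    role: str
    company: str
    tracks: list
    total_duration_minutes: int = 45
    difficulty_start: DifficultyLevel = DifficultyLevel.MEDIUM
    questions_per_track: int = 3


# A 45-minute session mixing behavioral and technical questions.
config = InterviewConfig(
    role="Senior Backend Engineer",
    company="Acme Corp",  # illustrative
    tracks=[InterviewTrack.BEHAVIORAL, InterviewTrack.TECHNICAL],
)

# Rough per-question time budget implied by the config: 45 // 6 = 7
minutes_per_question = config.total_duration_minutes // (
    len(config.tracks) * config.questions_per_track
)
print(minutes_per_question)  # 7
```

The defaults matter: starting at MEDIUM gives the adaptive selector room to move in either direction after the first answer.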
Adaptive Question Selection
The question selector adjusts difficulty and topic based on how the candidate has performed so far:
@dataclass
class CandidatePerformance:
    candidate_id: str
    role: str
    answers: list[dict] = field(default_factory=list)
    track_scores: dict[str, list[float]] = field(default_factory=dict)
    current_difficulty: dict[str, str] = field(default_factory=dict)
    asked_questions: set[str] = field(default_factory=set)

    def get_track_average(self, track: str) -> float:
        scores = self.track_scores.get(track, [])
        if not scores:
            return 0.5  # neutral prior before any answers
        return sum(scores) / len(scores)

    def adjust_difficulty(self, track: str, score: float):
        """Increase difficulty on strong answers, decrease on weak."""
        levels = list(DifficultyLevel)
        current = DifficultyLevel(
            self.current_difficulty.get(track, "medium")
        )
        current_idx = levels.index(current)
        if score >= 0.8 and current_idx < len(levels) - 1:
            self.current_difficulty[track] = levels[current_idx + 1].value
        elif score < 0.4 and current_idx > 0:
            self.current_difficulty[track] = levels[current_idx - 1].value


def select_next_question(
    performance: CandidatePerformance,
    question_bank: list[InterviewQuestion],
    track: InterviewTrack,
) -> Optional[InterviewQuestion]:
    """Select the next question based on adaptive difficulty."""
    target_difficulty = DifficultyLevel(
        performance.current_difficulty.get(track.value, "medium")
    )
    # Filter to unasked questions in the right track and difficulty
    candidates = [
        q for q in question_bank
        if q.track == track
        and q.difficulty == target_difficulty
        and q.question_id not in performance.asked_questions
    ]
    if not candidates:
        # Fall back to any unasked question in the track, preferring
        # the difficulties closest to the target
        levels = list(DifficultyLevel)
        candidates = sorted(
            (
                q for q in question_bank
                if q.track == track
                and q.question_id not in performance.asked_questions
            ),
            key=lambda q: abs(
                levels.index(q.difficulty) - levels.index(target_difficulty)
            ),
        )
    if not candidates:
        return None
    # Return the first match; a richer selector could also weight
    # questions toward tags where the candidate has scored lowest
    return candidates[0]
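A quick trace of the difficulty ladder shows how a streak of strong answers escalates and one weak answer backs off. The class here is trimmed to just the adjustment logic so the snippet runs standalone:

```python
from dataclasses import dataclass, field
from enum import Enum


class DifficultyLevel(str, Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"
    EXPERT = "expert"


@dataclass
class CandidatePerformance:
    """Trimmed to the difficulty-adjustment logic from above."""
    current_difficulty: dict = field(default_factory=dict)

    def adjust_difficulty(self, track: str, score: float):
        levels = list(DifficultyLevel)
        current = DifficultyLevel(
            self.current_difficulty.get(track, "medium")
        )
        idx = levels.index(current)
        if score >= 0.8 and idx < len(levels) - 1:
            self.current_difficulty[track] = levels[idx + 1].value
        elif score < 0.4 and idx > 0:
            self.current_difficulty[track] = levels[idx - 1].value


perf = CandidatePerformance()
perf.adjust_difficulty("technical", 0.9)   # strong: medium -> hard
perf.adjust_difficulty("technical", 0.85)  # strong again: hard -> expert
perf.adjust_difficulty("technical", 0.3)   # weak: expert -> hard
print(perf.current_difficulty["technical"])  # hard
```

Scores between 0.4 and 0.8 leave the difficulty unchanged, which keeps the ladder from oscillating on middling answers.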
Response Evaluation Agent
The evaluation agent scores candidate responses against structured criteria:
from agents import Agent
from pydantic import BaseModel


class ResponseEvaluation(BaseModel):
    score: float  # 0.0 to 1.0
    criteria_met: list[str]
    criteria_missed: list[str]
    strengths: list[str]
    improvement_areas: list[str]
    follow_up_question: Optional[str] = None
    detailed_feedback: str


behavioral_evaluator = Agent(
    name="Behavioral Interview Evaluator",
    instructions="""You evaluate behavioral interview answers. Score
each response on a 0.0-1.0 scale using these criteria:

STAR FORMAT (25% of score):
- Clear Situation with relevant context
- Specific Task or challenge described
- Detailed Actions taken (not just "we": what did YOU do?)
- Quantifiable Results or outcomes

COMMUNICATION (25% of score):
- Concise and focused (2-3 minutes ideal)
- Logical narrative flow
- Appropriate level of detail

SUBSTANCE (25% of score):
- Shows relevant skills for the role
- Demonstrates impact and ownership
- Includes specific metrics or outcomes

SELF-AWARENESS (25% of score):
- Acknowledges challenges honestly
- Shows learning and growth
- Demonstrates adaptability

If the answer is strong, generate a harder follow-up question that
probes deeper. If weak, note specific coaching advice.""",
    output_type=ResponseEvaluation,
)

technical_evaluator = Agent(
    name="Technical Interview Evaluator",
    instructions="""You evaluate technical interview answers. Score
each response on a 0.0-1.0 scale using these criteria:

ACCURACY (30% of score):
- Technically correct statements
- Proper use of terminology
- No critical misconceptions

DEPTH (25% of score):
- Goes beyond surface-level explanation
- Discusses tradeoffs and alternatives
- Shows understanding of WHY, not just WHAT

PROBLEM SOLVING (25% of score):
- Structured approach to the problem
- Considers edge cases
- Breaks down complex problems into parts

COMMUNICATION (20% of score):
- Explains technical concepts clearly
- Uses appropriate abstractions
- Asks clarifying questions when needed

Generate a follow-up that tests the boundaries of their knowledge.""",
    output_type=ResponseEvaluation,
)
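The rubric weights in the prompts are applied implicitly by the LLM, but it can help to keep the blend auditable in code. The following is a hypothetical helper (not part of the agents SDK) that combines per-criterion scores using the technical rubric's weights:

```python
# Hypothetical helper mirroring the technical rubric's weights. The
# evaluator agent applies these weights implicitly; computing the
# blend in code makes the final score reproducible and auditable.
TECHNICAL_WEIGHTS = {
    "accuracy": 0.30,
    "depth": 0.25,
    "problem_solving": 0.25,
    "communication": 0.20,
}


def weighted_score(criterion_scores: dict[str, float],
                   weights: dict[str, float]) -> float:
    """Blend per-criterion scores (each 0.0-1.0) into one score."""
    return sum(
        weights[name] * criterion_scores.get(name, 0.0)
        for name in weights
    )


score = weighted_score(
    {"accuracy": 0.9, "depth": 0.6,
     "problem_solving": 0.7, "communication": 0.8},
    TECHNICAL_WEIGHTS,
)
# 0.30*0.9 + 0.25*0.6 + 0.25*0.7 + 0.20*0.8 = 0.755
print(round(score, 3))
```

One design option is to ask the evaluator for per-criterion sub-scores in `ResponseEvaluation` and compute the headline score deterministically, rather than trusting the model's own arithmetic.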
The Mock Interview Agent
Combine question selection, the interview flow, and evaluation into a cohesive interview session:
from agents import function_tool
import json


@function_tool
def evaluate_answer(
    question_id: str,
    candidate_answer: str,
    track: str,
    evaluation_criteria: str,
) -> str:
    """Evaluate a candidate's interview answer."""
    # In production, dispatch to the behavioral or technical evaluator
    return json.dumps({
        "status": "evaluated",
        "question_id": question_id,
    })


def build_interviewer_instructions(
    config: InterviewConfig,
    performance: CandidatePerformance,
    current_question: InterviewQuestion,
) -> str:
    track_avg = performance.get_track_average(
        current_question.track.value
    )
    criteria = "\n".join(
        f"- {c}" for c in current_question.evaluation_criteria
    )
    ideal_points = "\n".join(
        f"- {p}" for p in current_question.ideal_answer_points
    )
    return f"""You are conducting a mock interview for a
{config.role} position at {config.company}.

CURRENT QUESTION: {current_question.text}
TRACK: {current_question.track.value}
DIFFICULTY: {current_question.difficulty.value}

EVALUATION CRITERIA:
{criteria}

IDEAL ANSWER SHOULD COVER:
{ideal_points}

Candidate's average performance in this track: {track_avg:.0%}

INTERVIEWER BEHAVIOR:
- Act like a real interviewer: professional but friendly
- Listen actively and ask relevant follow-up questions
- If the candidate is struggling, provide a gentle nudge
- After the candidate finishes, provide structured feedback
- Use follow-up questions to probe depth: {current_question.follow_ups}
- Never reveal the ideal answer; coach toward it instead"""


mock_interviewer = Agent(
    name="Mock Interviewer",
    instructions="Dynamic; set per question",
    tools=[evaluate_answer],
)
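Tying the pieces together, the session control flow might look like the sketch below. `run_turn` is a stand-in for the real agent invocation (building instructions with `build_interviewer_instructions` and running the interviewer and evaluator); it is stubbed here so the loop itself is testable:

```python
def run_session(tracks, question_banks, questions_per_track, run_turn):
    """Ask up to questions_per_track questions on each track.

    run_turn(question) -> score in [0, 1]. In the real agent this
    would call the interviewer, collect the answer, and invoke the
    track's evaluator; here it is injected so the flow runs offline.
    """
    asked, scores = set(), {}
    for track in tracks:
        for _ in range(questions_per_track):
            bank = [q for q in question_banks[track] if q not in asked]
            if not bank:
                break  # track exhausted
            question = bank[0]  # adaptive selection plugs in here
            asked.add(question)
            scores.setdefault(track, []).append(run_turn(question))
    return scores


scores = run_session(
    tracks=["behavioral"],
    question_banks={"behavioral": ["q1", "q2", "q3"]},
    questions_per_track=2,
    run_turn=lambda q: 0.8,  # stub: every answer scores 0.8
)
print(scores)  # {'behavioral': [0.8, 0.8]}
```

Keeping the loop separate from the agent calls also makes it easy to unit-test time limits and track rotation without burning API calls.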
Post-Interview Feedback Report
After the session, generate a comprehensive feedback report:
def generate_interview_report(
    performance: CandidatePerformance,
) -> dict:
    """Generate a post-interview performance report."""
    report = {
        "role": performance.role,
        "questions_answered": len(performance.answers),
        "track_performance": {},
        "overall_score": "0%",
        "top_strengths": [],
        "priority_improvements": [],
        "practice_recommendations": [],
    }
    total_score = 0.0
    for track, scores in performance.track_scores.items():
        avg = sum(scores) / len(scores) if scores else 0.0
        total_score += avg
        if len(scores) > 1 and scores[-1] > scores[0]:
            trend = "improving"
        elif len(scores) > 1 and scores[-1] < scores[0]:
            trend = "declining"
        else:
            trend = "stable"
        report["track_performance"][track] = {
            "average_score": f"{avg:.0%}",
            "questions_answered": len(scores),
            "trend": trend,
        }
    num_tracks = len(performance.track_scores) or 1
    report["overall_score"] = f"{total_score / num_tracks:.0%}"
    return report
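A condensed view of the aggregation on sample data, with a "declining" branch included as a natural third case for the trend check:

```python
# Sample per-track scores and the aggregation the report applies.
track_scores = {
    "behavioral": [0.6, 0.8],  # later answer better: "improving"
    "technical": [0.7],        # single data point: "stable"
}


def trend(scores):
    """Compare the first and last answers on a track."""
    if len(scores) > 1 and scores[-1] > scores[0]:
        return "improving"
    if len(scores) > 1 and scores[-1] < scores[0]:
        return "declining"
    return "stable"


averages = {t: sum(s) / len(s) for t, s in track_scores.items()}
overall = sum(averages.values()) / len(averages)

print({t: trend(s) for t, s in track_scores.items()})
# {'behavioral': 'improving', 'technical': 'stable'}
print(f"{overall:.0%}")  # (0.7 + 0.7) / 2 = 70%
```

Note the overall score averages the track averages, so a track with one question counts as much as one with five; weighting by question count is a reasonable alternative.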
FAQ
How does the agent handle it when a candidate goes off-topic or gives an irrelevant answer?
The evaluator scores off-topic answers low on the "substance" and "accuracy" criteria, which triggers the difficulty adjustment to lower the next question's complexity. The interviewer agent is also instructed to gently redirect — just as a real interviewer would say "That is interesting, but I am specifically asking about..." The system prompt includes the evaluation criteria so the agent knows exactly what a relevant answer should address.
Can the agent simulate different interviewer styles like friendly vs. challenging?
Yes. The interviewer persona is controlled by the system prompt. A "challenging" mode adds instructions like "Push back on vague statements, ask for specific numbers, and express skepticism that forces the candidate to defend their claims." A "friendly" mode uses more encouragement and broader follow-up questions. Candidates should practice with both styles to prepare for real interview variability.
How do you ensure the mock interview covers enough breadth in a limited time?
The InterviewConfig specifies questions per track and total duration. The session manager tracks elapsed time and ensures each track gets proportional coverage. If the candidate spends too long on one question, the agent notes it in feedback and moves to the next topic — mirroring real interview time management expectations.
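The proportional-coverage idea can be sketched as a simple time budget. The 10% reserve for intro and closing feedback is an assumption, not something InterviewConfig specifies:

```python
def time_budget(total_minutes: int, tracks: list[str],
                questions_per_track: int) -> dict[str, float]:
    """Split session time evenly across tracks, keeping an assumed
    10% reserve for the intro and closing feedback."""
    usable = total_minutes * 0.9
    per_track = usable / len(tracks)
    return {
        track: round(per_track / questions_per_track, 1)
        for track in tracks
    }


# 60 min * 0.9 = 54 usable; 27 per track; 9 per question
budget = time_budget(60, ["behavioral", "technical"], 3)
print(budget)  # {'behavioral': 9.0, 'technical': 9.0}
```

The session manager can compare elapsed time per question against this budget and prompt the interviewer agent to wrap up when a question runs long.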
#InterviewPrep #MockInterviews #CareerAI #Python #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.