AI Code Review Agent: Automated Pull Request Analysis and Feedback
Build an AI agent that automatically reviews pull requests, parses diffs, classifies issues by severity, and generates actionable feedback comments. A practical guide to augmenting code review workflows.
The Problem with Manual Code Review
Code review is essential but slow. Reviewers get fatigued, miss subtle bugs, and spend time flagging style issues that a machine could catch. An AI code review agent automates the mechanical parts of review — spotting bugs, identifying security issues, enforcing conventions — so human reviewers can focus on architecture and design decisions.
The goal is not to replace human reviewers. It is to give them a head start by surfacing the most important issues before they even open the PR.
Designing the Review Agent
The agent takes a unified diff as input, parses it into individual file changes, analyzes each change against a set of review criteria, and produces structured feedback with severity levels.
from dataclasses import dataclass
from enum import Enum

from openai import OpenAI

client = OpenAI()


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    SUGGESTION = "suggestion"
    NITPICK = "nitpick"


@dataclass
class ReviewComment:
    file: str
    line: int
    severity: Severity
    message: str
    suggestion: str | None = None


class CodeReviewAgent:
    def __init__(self, model: str = "gpt-4o"):
        self.model = model
        self.review_criteria = [
            "security_vulnerabilities",
            "bug_detection",
            "performance_issues",
            "error_handling",
            "code_style",
        ]
Parsing Unified Diffs
Before the LLM can review code, you need to parse the diff into structured chunks that include file paths and line numbers.
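import re

# Method of CodeReviewAgent
def parse_diff(self, diff_text: str) -> list[dict]:
    files = []
    current_file = None
    current_hunks = []
    for line in diff_text.split("\n"):
        if line.startswith("diff --git"):
            # A new file section begins: flush the previous one
            if current_file:
                files.append({
                    "file": current_file,
                    "hunks": current_hunks,
                })
            # Match " b/" with its leading space so a "b/" directory
            # inside the path is not mistaken for the prefix
            match = re.search(r" b/(.+)$", line)
            current_file = match.group(1) if match else "unknown"
            current_hunks = []
        elif line.startswith("@@"):
            # Hunk header "@@ -old,count +new,count @@": keep the new-file start
            match = re.match(r"@@ -\d+(?:,\d+)? \+(\d+)", line)
            start_line = int(match.group(1)) if match else 1
            current_hunks.append({
                "start_line": start_line,
                "lines": [],
            })
        elif current_hunks:
            # The "---" and "+++" file headers precede the first hunk, so they
            # never reach this branch. Filtering on "---" here would drop
            # removed lines whose content starts with "--".
            current_hunks[-1]["lines"].append(line)
    if current_file:
        files.append({"file": current_file, "hunks": current_hunks})
    return files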
This parser extracts each modified file and its hunks, preserving line numbers so feedback can reference exact locations.
Analyzing Changes with the LLM
Each file's changes are sent to the LLM with a structured prompt that forces output into a parseable format.
import json

# Method of CodeReviewAgent
def review_file(self, file_info: dict) -> list[ReviewComment]:
    diff_content = ""
    for hunk in file_info["hunks"]:
        diff_content += f"Starting at line {hunk['start_line']}:\n"
        diff_content += "\n".join(hunk["lines"]) + "\n\n"

    # JSON mode ({"type": "json_object"}) makes the model emit a JSON object,
    # not a bare array, so the prompt asks for the issues under an "issues" key.
    system_prompt = """You are a senior code reviewer. Analyze the diff and
return a JSON object with a single key "issues" whose value is an array.
Each issue must have:
- "line": integer line number
- "severity": one of "critical", "warning", "suggestion", "nitpick"
- "message": clear explanation of the issue
- "suggestion": optional fix or improved code
Focus on: security vulnerabilities, bugs, performance problems,
missing error handling, and convention violations.
Return {"issues": []} if no issues found. Output ONLY valid JSON."""

    response = client.chat.completions.create(
        model=self.model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": (
                f"File: {file_info['file']}\n\n{diff_content}"
            )},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )
    raw = json.loads(response.choices[0].message.content)
    # Tolerate a bare array as well, in case JSON mode is switched off
    issues = raw if isinstance(raw, list) else raw.get("issues", [])
    return [
        ReviewComment(
            file=file_info["file"],
            line=issue.get("line", 0),
            severity=Severity(issue.get("severity", "suggestion")),
            message=issue["message"],
            suggestion=issue.get("suggestion"),
        )
        for issue in issues
    ]
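Model output is not guaranteed to match the requested schema. In `review_file`, `Severity(...)` raises `ValueError` for a label outside the four allowed values, and `issue["message"]` raises `KeyError` when the field is missing. A minimal sketch of a defensive parsing step follows. The helper `parse_issues` and its fallback choices (unknown severities become "suggestion", unusable line numbers become 0) are assumptions for illustration, not part of the original agent.

```python
import json

VALID_SEVERITIES = {"critical", "warning", "suggestion", "nitpick"}


def parse_issues(raw_text: str) -> list[dict]:
    """Turn raw model output into a list of well-formed issue dicts."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return []  # unparseable output: report nothing rather than crash

    # Accept either {"issues": [...]} or a bare array
    if isinstance(data, dict):
        data = data.get("issues", [])
    if not isinstance(data, list):
        return []

    cleaned = []
    for item in data:
        # Skip entries that are not objects or carry no message
        if not isinstance(item, dict) or not item.get("message"):
            continue
        severity = str(item.get("severity", "")).lower()
        if severity not in VALID_SEVERITIES:
            severity = "suggestion"  # assumed fallback for unknown labels
        line = item.get("line")
        cleaned.append({
            "line": line if isinstance(line, int) and line > 0 else 0,
            "severity": severity,
            "message": str(item["message"]),
            "suggestion": item.get("suggestion"),
        })
    return cleaned


print(parse_issues('{"issues": [{"line": 3, "severity": "BLOCKER", "message": "x"}]}'))
```

Each cleaned dict can then be passed to `ReviewComment` without the constructor arguments raising on unexpected input.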
Prioritizing and Formatting Output
Raw review comments need to be sorted by severity so the developer sees critical issues first.
def review_pull_request(self, diff_text: str) -> list[ReviewComment]:
    files = self.parse_diff(diff_text)
    all_comments = []
    for file_info in files:
        comments = self.review_file(file_info)
        all_comments.extend(comments)
    severity_order = {
        Severity.CRITICAL: 0,
        Severity.WARNING: 1,
        Severity.SUGGESTION: 2,
        Severity.NITPICK: 3,
    }
    all_comments.sort(key=lambda c: severity_order[c.severity])
    return all_comments


def format_review(self, comments: list[ReviewComment]) -> str:
    if not comments:
        return "No issues found. LGTM!"
    lines = []
    for c in comments:
        icon = {
            Severity.CRITICAL: "[CRITICAL]",
            Severity.WARNING: "[WARNING]",
            Severity.SUGGESTION: "[SUGGESTION]",
            Severity.NITPICK: "[NITPICK]",
        }[c.severity]
        lines.append(f"{icon} {c.file}:{c.line} - {c.message}")
        if c.suggestion:
            lines.append(f"  Suggested fix: {c.suggestion}")
    return "\n".join(lines)
FAQ
How do I integrate this agent with GitHub or GitLab?
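Use the GitHub API (via the PyGithub library or httpx) to fetch PR diffs, then post review comments back using the PR review comments endpoint. For GitHub Actions, trigger the agent on pull_request events and use the GITHUB_TOKEN for authentication. On GitLab the same flow applies through the merge request API: fetch the merge request's changes, then post the comments as discussions.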
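As a rough sketch of the GitHub side, the code below fetches a PR diff and submits one review through the REST API. It uses the standard library's `urllib` to stay dependency-free, though PyGithub or httpx would work equally well. The function names and the plain-dict comment shape are assumptions for illustration. Comments without a usable line number are folded into the review body, since inline comments need a line that is part of the diff.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def fetch_pr_diff(owner: str, repo: str, number: int, token: str) -> str:
    """Fetch a pull request as a unified diff via the diff media type."""
    req = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}/pulls/{number}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github.diff",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def build_review_payload(comments: list[dict]) -> dict:
    """Build the JSON body for the "create a review" endpoint."""
    inline, general = [], []
    for c in comments:
        text = f"[{c['severity'].upper()}] {c['message']}"
        if c["line"] > 0:
            inline.append({
                "path": c["file"],
                "line": c["line"],
                "side": "RIGHT",  # comment on the new version of the file
                "body": text,
            })
        else:
            general.append(f"{c['file']}: {text}")
    body = "Automated review."
    if general:
        body += "\n\n" + "\n".join(general)
    return {"event": "COMMENT", "body": body, "comments": inline}


def post_review(owner: str, repo: str, number: int, token: str, payload: dict) -> int:
    """Submit the review and return the HTTP status code."""
    req = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}/pulls/{number}/reviews",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


payload = build_review_payload([
    {"file": "app.py", "line": 12, "severity": "critical",
     "message": "SQL built by string concatenation"},
    {"file": "app.py", "line": 0, "severity": "nitpick",
     "message": "Inconsistent naming"},
])
print(json.dumps(payload, indent=2))
```

In a workflow, the token would typically come from the `GITHUB_TOKEN` environment variable, and the job likely needs write permission on pull requests for the POST to succeed.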
Will the LLM produce false positives that annoy developers?
Yes, false positives are the biggest usability risk. Mitigate this by posting only comments at warning severity or above by default, and let developers configure ignored rules. Track feedback on comments to tune prompts over time.
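The severity threshold can be a small filter applied just before posting. The sketch below is self-contained, so it redefines `Severity` and uses a reduced stand-in for `ReviewComment`. The name `filter_for_posting` and its default are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    SUGGESTION = "suggestion"
    NITPICK = "nitpick"


# Lower rank means more severe; same ordering the agent uses for sorting
SEVERITY_RANK = {
    Severity.CRITICAL: 0,
    Severity.WARNING: 1,
    Severity.SUGGESTION: 2,
    Severity.NITPICK: 3,
}


@dataclass
class Comment:
    # Hypothetical stand-in for ReviewComment, reduced to the fields used here
    file: str
    line: int
    severity: Severity
    message: str


def filter_for_posting(
    comments: list[Comment],
    threshold: Severity = Severity.WARNING,
) -> list[Comment]:
    """Keep only comments at or above the given severity threshold."""
    limit = SEVERITY_RANK[threshold]
    return [c for c in comments if SEVERITY_RANK[c.severity] <= limit]


comments = [
    Comment("app.py", 12, Severity.CRITICAL, "Unsanitized input reaches the query"),
    Comment("app.py", 40, Severity.NITPICK, "Prefer a shorter variable name"),
    Comment("util.py", 7, Severity.WARNING, "Exception is swallowed silently"),
]
for c in filter_for_posting(comments):
    print(f"[{c.severity.value.upper()}] {c.file}:{c.line} - {c.message}")
```

Lower-severity comments do not have to be discarded. They could still be collected into a single collapsed summary, which keeps them available without adding noise to the inline review.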
How do I handle large PRs that exceed the context window?
Split the diff into per-file chunks and review each file independently. For very large files, split further by hunk. Carry a short running summary of the files already reviewed (for example, changed function signatures) into each request so the agent can account for cross-file dependencies.
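Splitting by hunk can be sketched as a greedy batching step over the hunks of one file. The code below assumes hunks shaped like the parser's output and uses character count as a crude proxy for tokens. The name `chunk_hunks` and the default budget are hypothetical, and a real token counter would be more accurate.

```python
def chunk_hunks(hunks: list[dict], max_chars: int = 12000) -> list[list[dict]]:
    """Group consecutive hunks into batches that stay under a size budget.

    A single hunk larger than the budget is placed in a batch of its own.
    """
    batches: list[list[dict]] = []
    current: list[dict] = []
    size = 0
    for hunk in hunks:
        # +1 per line accounts for the newline added when lines are joined
        hunk_size = sum(len(line) + 1 for line in hunk["lines"])
        if current and size + hunk_size > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(hunk)
        size += hunk_size
    if current:
        batches.append(current)
    return batches


hunks = [
    {"start_line": 1, "lines": ["a" * 9]},
    {"start_line": 50, "lines": ["b" * 9]},
    {"start_line": 90, "lines": ["c" * 9]},
]
for batch in chunk_hunks(hunks, max_chars=25):
    print([h["start_line"] for h in batch])
```

Each batch can then be sent as its own request in place of the whole file, with the running summary prepended to the user message.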
CallSphere Team