Building a Claude Code Review Agent: Automated PR Analysis and Suggestions
Build a code review agent that parses GitHub PR diffs, analyzes code changes with Claude, generates actionable suggestions, and posts review comments via the GitHub API.
Why Automate Code Reviews
Code reviews are critical for code quality, but they create bottlenecks. Reviewers miss subtle bugs when fatigued, junior developers wait days for feedback, and style issues consume review time that could be spent on logic and architecture. A Claude-powered code review agent handles the repetitive parts — style enforcement, bug pattern detection, security scanning, and documentation checks — letting human reviewers focus on design decisions and business logic.
The agent we will build fetches PR diffs from GitHub, analyzes each changed file with Claude, generates specific suggestions with line-level precision, and posts review comments back to the PR.
Fetching PR Diffs from GitHub
Use the GitHub API to get the pull request diff and file changes:
import requests
import os

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]

def get_pr_diff(owner: str, repo: str, pr_number: int) -> dict:
    """Fetch PR details and file diffs."""
    headers = {
        "Authorization": f"token {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    # Get PR metadata
    pr_url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}"
    pr_data = requests.get(pr_url, headers=headers).json()
    # Get changed files with patches
    files_url = f"{pr_url}/files"
    files = requests.get(files_url, headers=headers).json()
    return {
        "title": pr_data["title"],
        # "body" is null (None) for PRs without a description, so fall back to ""
        "description": pr_data.get("body") or "",
        "base_branch": pr_data["base"]["ref"],
        "head_branch": pr_data["head"]["ref"],
        "files": [
            {
                "filename": f["filename"],
                "status": f["status"],  # added, modified, removed
                "patch": f.get("patch", ""),
                "additions": f["additions"],
                "deletions": f["deletions"],
            }
            for f in files
            if f.get("patch")  # Skip binary files
        ],
    }
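The `patch` strings returned by the files endpoint are standard unified diffs. If you later want to check that Claude's line references actually point at lines the PR touched, a small parser over the hunk headers recovers each added line's new-file line number. This helper is an illustration for this tutorial, not part of the GitHub API:

```python
import re

def added_lines(patch: str) -> list[tuple[int, str]]:
    """Extract added lines and their new-file line numbers from a unified diff.

    GitHub's per-file "patch" field starts at the first "@@" hunk header.
    """
    results = []
    new_line = 0
    for line in patch.splitlines():
        m = re.match(r"@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@", line)
        if m:
            new_line = int(m.group(1))  # new-file start of this hunk
            continue
        if line.startswith("\\"):  # "\ No newline at end of file" marker
            continue
        if line.startswith("+"):
            results.append((new_line, line[1:]))
            new_line += 1
        elif not line.startswith("-"):
            new_line += 1  # context lines advance the new-file counter
    return results
```

Cross-checking Claude's `line` values against this set is a cheap way to drop hallucinated line references before posting.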
Analyzing Code Changes with Claude
Send each file's diff to Claude with structured instructions for what to look for:
import anthropic

client = anthropic.Anthropic()

review_tool = {
    "name": "submit_review_comments",
    "description": "Submit code review comments for specific lines in the diff",
    "input_schema": {
        "type": "object",
        "properties": {
            "comments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "file": {"type": "string", "description": "Filename"},
                        "line": {"type": "integer", "description": "Line number in the diff"},
                        "severity": {
                            "type": "string",
                            "enum": ["critical", "warning", "suggestion", "nitpick"]
                        },
                        "category": {
                            "type": "string",
                            "enum": ["bug", "security", "performance", "style", "logic", "documentation"]
                        },
                        "comment": {"type": "string", "description": "The review comment with explanation"},
                        "suggested_fix": {"type": "string", "description": "Suggested code replacement if applicable"}
                    },
                    "required": ["file", "line", "severity", "category", "comment"]
                }
            },
            "summary": {"type": "string", "description": "Overall review summary"}
        },
        "required": ["comments", "summary"]
    }
}
def review_file(filename: str, patch: str, pr_context: str) -> dict:
    """Review a single file's changes."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        tools=[review_tool],
        tool_choice={"type": "tool", "name": "submit_review_comments"},
        system="""You are an expert code reviewer. Analyze the diff and provide specific,
actionable feedback. Focus on:
1. Bugs and logic errors (highest priority)
2. Security vulnerabilities (SQL injection, XSS, auth bypasses)
3. Performance issues (N+1 queries, missing indexes, memory leaks)
4. Error handling gaps (uncaught exceptions, missing validation)
5. Code style and readability issues (lowest priority)
Be specific — reference exact line numbers and explain WHY something is an issue,
not just WHAT the issue is. Only comment on changed lines (lines starting with +).
If the code looks good, say so with an empty comments array.""",
        messages=[{
            "role": "user",
            "content": f"PR Context: {pr_context}\n\nFile: {filename}\n\nDiff:\n{patch}"
        }]
    )
    for block in response.content:
        if block.type == "tool_use":
            return block.input
    return {"comments": [], "summary": "No issues found"}
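Large files can produce patches that blow past a sensible prompt budget. One mitigation (an assumption of this tutorial, not an API feature) is to split the patch on hunk boundaries and call `review_file` once per chunk:

```python
def split_patch(patch: str, max_chars: int = 12_000) -> list[str]:
    """Split a unified diff into chunks of whole hunks, each under max_chars.

    A rough guard against oversized prompts. Hunks individually larger than
    max_chars are kept as single chunks rather than split mid-hunk.
    """
    # Group lines into hunks: each hunk starts at an "@@" header
    hunks: list[str] = []
    current: list[str] = []
    for line in patch.splitlines():
        if line.startswith("@@") and current:
            hunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        hunks.append("\n".join(current))

    # Pack whole hunks into chunks up to the character budget
    chunks: list[str] = []
    buf = ""
    for hunk in hunks:
        if buf and len(buf) + len(hunk) + 1 > max_chars:
            chunks.append(buf)
            buf = ""
        buf = f"{buf}\n{hunk}" if buf else hunk
    if buf:
        chunks.append(buf)
    return chunks
```

The 12,000-character default is an arbitrary starting point; tune it against your model's context budget and the number of files per PR.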
The Complete Review Pipeline
Orchestrate the review across all changed files:
def review_pull_request(owner: str, repo: str, pr_number: int) -> dict:
    """Run a complete code review on a pull request."""
    pr_data = get_pr_diff(owner, repo, pr_number)
    pr_context = f"PR Title: {pr_data['title']}\nDescription: {pr_data['description']}"
    all_comments = []
    file_summaries = []
    for file_info in pr_data["files"]:
        if file_info["status"] == "removed":
            continue  # Skip deleted files
        print(f"Reviewing {file_info['filename']}...")
        review = review_file(
            file_info["filename"],
            file_info["patch"],
            pr_context,
        )
        for comment in review.get("comments", []):
            comment["file"] = file_info["filename"]
            all_comments.append(comment)
        file_summaries.append({
            "file": file_info["filename"],
            "summary": review.get("summary", ""),
        })
    # Sort by severity
    severity_order = {"critical": 0, "warning": 1, "suggestion": 2, "nitpick": 3}
    all_comments.sort(key=lambda c: severity_order.get(c["severity"], 99))
    return {
        "pr_number": pr_number,
        "total_comments": len(all_comments),
        "critical_count": sum(1 for c in all_comments if c["severity"] == "critical"),
        "comments": all_comments,
        "file_summaries": file_summaries,
    }
Posting Review Comments to GitHub
Post the agent's findings as a GitHub PR review:
def post_review_to_github(owner: str, repo: str, pr_number: int,
                          review_data: dict, commit_sha: str):
    """Post review comments to GitHub PR."""
    headers = {
        "Authorization": f"token {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    severity_labels = {
        "critical": "[CRITICAL]",
        "warning": "[WARNING]",
        "suggestion": "[SUGGESTION]",
        "nitpick": "[NITPICK]",
    }
    # Build GitHub review comments
    gh_comments = []
    for comment in review_data["comments"]:
        prefix = severity_labels.get(comment["severity"], "")
        body = f"**{prefix} {comment['category'].upper()}**\n\n{comment['comment']}"
        if comment.get("suggested_fix"):
            body += f"\n\n**Suggested fix:**\n```suggestion\n{comment['suggested_fix']}\n```"
        gh_comments.append({
            "path": comment["file"],
            "line": comment["line"],
            "side": "RIGHT",  # comment on the new version of the file
            "body": body,
        })
    # Determine review action based on findings
    if review_data["critical_count"] > 0:
        event = "REQUEST_CHANGES"
    elif review_data["total_comments"] > 0:
        event = "COMMENT"
    else:
        event = "APPROVE"
    # Create the review
    review_url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/reviews"
    review_body = {
        "commit_id": commit_sha,
        "body": generate_review_summary(review_data),
        "event": event,
        "comments": gh_comments,
    }
    response = requests.post(review_url, headers=headers, json=review_body)
    response.raise_for_status()
    return response.json()

def generate_review_summary(review_data: dict) -> str:
    critical = review_data["critical_count"]
    total = review_data["total_comments"]
    summary = "## Automated Code Review\n\n"
    summary += f"Found **{total}** issues ({critical} critical).\n\n"
    for fs in review_data["file_summaries"]:
        summary += f"- **{fs['file']}**: {fs['summary']}\n"
    return summary
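GitHub applies secondary rate limits to rapid-fire write requests, typically returning 403 or 429. A retry wrapper with exponential backoff keeps the pipeline resilient; in this sketch, `post_fn` is a stand-in for the actual `requests.post` call, so the retry logic can be tested without the network:

```python
import time
from typing import Callable

def backoff_delays(retries: int, base: float = 1.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]

def post_with_retry(post_fn: Callable[[], int], retries: int = 3,
                    base: float = 1.0) -> bool:
    """Call post_fn (which returns an HTTP status code), retrying
    rate-limit (403/429) and server (5xx) errors with backoff."""
    for delay in backoff_delays(retries, base):
        status = post_fn()
        if status < 400:
            return True
        if status not in (403, 429) and status < 500:
            return False  # other client errors won't succeed on retry
        time.sleep(delay)
    return post_fn() < 400  # final attempt after the last delay
```

Wrap the review POST as `post_with_retry(lambda: requests.post(review_url, headers=headers, json=review_body).status_code)`.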
Running as a GitHub Action
Trigger the review agent on every PR:
# .github/workflows/code-review.yml
name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize]
permissions:
  contents: read
  pull-requests: write  # required for GITHUB_TOKEN to post reviews
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install anthropic requests
      - run: python scripts/review_pr.py
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO_OWNER: ${{ github.repository_owner }}
          REPO_NAME: ${{ github.event.repository.name }}
FAQ
How do I prevent the agent from being too noisy with nitpick comments?
Add a severity filter in your review pipeline — only post comments with severity "critical" or "warning" by default. Store nitpicks separately for developers who want detailed feedback. You can also instruct Claude to limit total comments to the 10 most important findings, forcing it to prioritize.
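The filter described above is a few lines; this sketch reuses the severity ranking already used by the pipeline:

```python
SEVERITY_RANK = {"critical": 0, "warning": 1, "suggestion": 2, "nitpick": 3}

def filter_comments(comments: list[dict], min_severity: str = "warning",
                    max_comments: int = 10) -> list[dict]:
    """Keep comments at or above min_severity, capped at the N most severe."""
    cutoff = SEVERITY_RANK[min_severity]
    kept = [c for c in comments if SEVERITY_RANK.get(c["severity"], 99) <= cutoff]
    kept.sort(key=lambda c: SEVERITY_RANK.get(c["severity"], 99))
    return kept[:max_comments]
```

Apply it to `review_data["comments"]` before posting; log the dropped comments somewhere reviewable so nitpicks aren't silently lost.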
Can the agent understand context beyond the diff?
Yes. You can fetch the full file content (not just the diff) from GitHub and include it in the prompt. This helps Claude understand the broader code context — what functions the changed code calls, what patterns the rest of the file follows, and whether the changes are consistent with existing style.
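A sketch of that approach: fetch the file content separately (GitHub's contents API is one option) and prepend it to the prompt. The helper below only assembles the prompt string; fetching is left to the caller, and the truncation limit is an arbitrary assumption:

```python
def build_context_prompt(pr_context: str, filename: str,
                         full_file: str, patch: str,
                         max_file_chars: int = 20_000) -> str:
    """Combine full file content with the diff so the model sees surrounding code."""
    if len(full_file) > max_file_chars:
        # Crude guard against huge files; smarter slicing around changed
        # regions would preserve more relevant context
        full_file = full_file[:max_file_chars] + "\n... [truncated]"
    return (
        f"PR Context: {pr_context}\n\n"
        f"Current content of {filename}:\n{full_file}\n\n"
        f"Diff under review:\n{patch}"
    )
```

Pass the result as the user message in `review_file` in place of the diff-only prompt.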
How much does it cost to review a typical PR?
A PR with 500 lines changed across 10 files typically uses 30,000-50,000 input tokens and 3,000-5,000 output tokens per file review. With Claude Sonnet, this costs roughly $0.50-$1.50 per PR. Using prompt caching for the system prompt reduces this by 20-30% for subsequent reviews. Batch processing non-urgent reviews saves an additional 50%.
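Turning token counts into dollars is simple arithmetic. The default per-million-token rates below are assumptions based on published Sonnet pricing at the time of writing; check current pricing before budgeting:

```python
def estimate_review_cost(input_tokens: int, output_tokens: int,
                         input_price_per_mtok: float = 3.0,
                         output_price_per_mtok: float = 15.0) -> float:
    """Rough per-call cost estimate in USD.

    Default rates ($3/$15 per million input/output tokens) are an assumed
    Sonnet price point, not a guarantee.
    """
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
```

At 30,000 input and 3,000 output tokens per file, each call comes to about $0.135, so a 10-file PR lands around $1.35, in line with the range above.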
#Claude #CodeReview #GitHub #PullRequests #Python #AgenticAI #LearnAI #AIEngineering