Building a Contract Review Agent: Clause Extraction, Risk Analysis, and Summary
Learn how to build an AI agent that parses legal contracts, extracts key clauses, scores risk levels, and generates executive summaries using Python and LLMs.
Why Contract Review Needs AI Agents
Legal teams spend thousands of hours each year reviewing contracts manually. A single commercial agreement can run 40 to 80 pages, and spotting a problematic indemnification clause buried on page 57 requires both concentration and domain expertise. AI agents can automate the repetitive extraction work, flag high-risk language, and produce structured summaries — letting attorneys focus on judgment calls rather than reading marathons.
In this tutorial, you will build a contract review agent that parses documents, identifies key clauses, assigns risk scores, and generates an executive summary.
Architecture Overview
The agent pipeline has four stages:
- Document Parsing — extract text from PDF or DOCX files
- Clause Extraction — identify and classify contract sections
- Risk Scoring — evaluate each clause against a risk rubric
- Summary Generation — produce a structured executive summary
Step 1: Document Parsing
Before the LLM can analyze anything, you need clean text. We use pdfplumber for PDFs and python-docx for Word documents.
```python
import pdfplumber
from docx import Document
from pathlib import Path

def extract_text(file_path: str) -> str:
    """Extract text from PDF or DOCX files."""
    path = Path(file_path)
    if path.suffix.lower() == ".pdf":
        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for image-only pages
            pages = [page.extract_text() or "" for page in pdf.pages]
            return "\n\n".join(pages)
    elif path.suffix.lower() == ".docx":
        doc = Document(path)
        return "\n\n".join(p.text for p in doc.paragraphs if p.text.strip())
    else:
        raise ValueError(f"Unsupported file type: {path.suffix}")
```
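Extracted text usually carries artifacts — words hyphenated across line breaks, runs of blank lines — that waste tokens and can confuse clause boundaries. A light normalization pass helps before prompting; the `normalize_text` helper below is an illustrative sketch, not part of the extraction code above:

```python
import re

def normalize_text(raw: str) -> str:
    """Clean common extraction artifacts before sending text to the LLM."""
    # Join words hyphenated across line breaks: "indemni-\nfication" -> "indemnification"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse runs of 3+ newlines into a single paragraph break
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Strip trailing whitespace on each line
    return "\n".join(line.rstrip() for line in text.splitlines()).strip()
```

You would call it on the output of `extract_text` before any LLM step.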
Step 2: Clause Extraction with an LLM
Once you have plain text, the agent identifies standard contract clauses. We define a structured output schema using Pydantic so the LLM returns machine-readable results.
```python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class ContractClause(BaseModel):
    clause_type: str  # e.g., "Indemnification", "Termination"
    text: str
    section_number: str

class ClauseExtractionResult(BaseModel):
    clauses: list[ContractClause]
    parties: list[str]
    effective_date: str
    governing_law: str

def extract_clauses(contract_text: str) -> ClauseExtractionResult:
    """Use an LLM to identify and classify contract clauses."""
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a contract analysis expert. Extract all key "
                    "clauses from the contract, identify the parties, "
                    "effective date, and governing law."
                ),
            },
            {"role": "user", "content": contract_text},
        ],
        response_format=ClauseExtractionResult,
    )
    return response.choices[0].message.parsed
```
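Structured-output calls can fail transiently (rate limits, network errors), so in practice you would wrap each LLM call in a retry with backoff. A minimal generic sketch — the helper name and defaults are illustrative, not part of any SDK:

```python
import time

def with_retries(fn, max_attempts: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))
```

Usage would look like `clauses = with_retries(lambda: extract_clauses(text))`. In production you would narrow the `except` to the SDK's retryable error types rather than catching everything.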
Step 3: Risk Scoring
Each extracted clause gets evaluated against a risk rubric. The agent checks for common red flags like unlimited liability, unilateral termination rights, or broad IP assignment.
```python
class ClauseRisk(BaseModel):
    clause_type: str
    risk_level: str  # "low", "medium", "high", "critical"
    risk_score: int  # 1-10
    concerns: list[str]
    recommendation: str

class RiskReport(BaseModel):
    clause_risks: list[ClauseRisk]
    overall_risk_score: float
    top_concerns: list[str]

RISK_RUBRIC = """
Score each clause from 1 (minimal risk) to 10 (severe risk):
- Indemnification: flag unlimited/uncapped liability
- Termination: flag unilateral termination without cure period
- IP Assignment: flag broad or perpetual IP transfers
- Limitation of Liability: flag exclusion of consequential damages
- Confidentiality: flag indefinite obligations or overly broad scope
- Non-compete: flag excessive duration or geographic scope
"""

def score_risks(clauses: ClauseExtractionResult) -> RiskReport:
    """Score risk for each extracted clause."""
    clauses_text = "\n\n".join(
        f"[{c.clause_type}] {c.text}" for c in clauses.clauses
    )
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": f"You are a legal risk analyst.\n\n{RISK_RUBRIC}",
            },
            {"role": "user", "content": clauses_text},
        ],
        response_format=RiskReport,
    )
    return response.choices[0].message.parsed
```
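Note that `overall_risk_score` is itself generated, so it can drift from the per-clause scores. One option is to compute it deterministically from the `risk_score` values instead. A sketch — the 60/40 max/mean blend is an arbitrary illustrative choice, not part of the pipeline above:

```python
def aggregate_risk(clause_scores: list[int]) -> float:
    """Overall score from per-clause scores, pulled toward the worst clause.

    A single critical clause should dominate the rating, so we blend the
    maximum (weight 0.6) with the mean (weight 0.4)."""
    if not clause_scores:
        return 0.0
    avg = sum(clause_scores) / len(clause_scores)
    worst = max(clause_scores)
    return round(0.6 * worst + 0.4 * avg, 1)
```

You could then overwrite the LLM's `overall_risk_score` with `aggregate_risk([cr.risk_score for cr in report.clause_risks])`.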
Step 4: Executive Summary Generation
The final stage produces a human-readable summary combining extraction results and risk findings.
```python
def generate_summary(
    clauses: ClauseExtractionResult, risks: RiskReport
) -> str:
    """Generate an executive summary of the contract review."""
    context = (
        f"Parties: {', '.join(clauses.parties)}\n"
        f"Effective Date: {clauses.effective_date}\n"
        f"Governing Law: {clauses.governing_law}\n"
        f"Overall Risk Score: {risks.overall_risk_score}/10\n"
        f"Top Concerns: {', '.join(risks.top_concerns)}\n\n"
        "Clause Details:\n"
    )
    for cr in risks.clause_risks:
        context += (
            f"- {cr.clause_type}: Risk {cr.risk_score}/10 "
            f"({cr.risk_level})\n"
        )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Write a concise executive summary of this contract "
                    "review for a general counsel. Highlight the top risks "
                    "and recommended actions."
                ),
            },
            {"role": "user", "content": context},
        ],
    )
    return response.choices[0].message.content
```
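Alongside free-form prose, reviewers often want a fixed-layout risk table. A small sketch that renders per-clause results as Markdown, sorted worst-first — it takes plain tuples so it stands alone, with fields mirroring `ClauseRisk` above:

```python
def render_risk_table(clause_risks: list[tuple[str, int, str]]) -> str:
    """Render (clause_type, risk_score, risk_level) rows as a Markdown
    table, highest risk first."""
    rows = sorted(clause_risks, key=lambda r: r[1], reverse=True)
    lines = ["| Clause | Score | Level |", "|---|---|---|"]
    for clause_type, score, level in rows:
        lines.append(f"| {clause_type} | {score}/10 | {level} |")
    return "\n".join(lines)
```

You could append this table to the LLM-generated summary so the narrative and the numbers travel together.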
Putting It All Together
```python
def review_contract(file_path: str) -> dict:
    """Full contract review pipeline."""
    text = extract_text(file_path)
    clauses = extract_clauses(text)
    risks = score_risks(clauses)
    summary = generate_summary(clauses, risks)
    return {
        "parties": clauses.parties,
        "effective_date": clauses.effective_date,
        "governing_law": clauses.governing_law,
        "clauses_found": len(clauses.clauses),
        "overall_risk": risks.overall_risk_score,
        "top_concerns": risks.top_concerns,
        "summary": summary,
    }

result = review_contract("vendor_agreement.pdf")
print(f"Risk Score: {result['overall_risk']}/10")
print(f"Summary:\n{result['summary']}")
```
Production Considerations
When deploying a contract review agent, keep these points in mind:
- Chunk long documents — contracts exceeding the context window should be split by section headers, processed individually, then merged.
- Add confidence scores — have the LLM output a confidence field for each extraction so reviewers know where to double-check.
- Maintain an audit trail — log every LLM call with the input hash, model version, and output for regulatory compliance.
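The chunking step can be sketched with a header-based splitter. The regex and the `max_chars` threshold below are illustrative assumptions — real contracts vary widely in numbering style, so you would tune the pattern to your corpus:

```python
import re

# Split points before common section headers, e.g. "12. Termination",
# "ARTICLE IV", or "Section 5" (zero-width lookahead keeps the header text).
SECTION_RE = re.compile(
    r"^(?=\d+\.\s+\S|ARTICLE\s+[IVXLC]+|Section\s+\d+)", re.MULTILINE
)

def chunk_by_sections(text: str, max_chars: int = 12_000) -> list[str]:
    """Split on section headers, then greedily pack sections into chunks
    under max_chars so each fits comfortably in the context window."""
    sections = [s for s in SECTION_RE.split(text) if s.strip()]
    chunks, current = [], ""
    for section in sections:
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        current += section
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would go through `extract_clauses` separately, with the per-chunk results merged before risk scoring. Sections longer than `max_chars` on their own would still need a secondary split (e.g., by paragraph).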
FAQ
How accurate is LLM-based clause extraction compared to manual review?
Modern LLMs achieve 85-95% accuracy on standard clause identification in well-formatted contracts. However, accuracy drops on scanned documents with OCR artifacts or contracts with unusual formatting. Always pair AI extraction with human review for high-stakes agreements.
Can this agent handle contracts in languages other than English?
Yes. Models like GPT-4o support multilingual analysis. You would adjust the system prompt to specify the target language and update the risk rubric to reflect jurisdiction-specific legal standards.
How do you handle confidentiality when sending contracts to an LLM API?
Use API providers that offer data processing agreements and do not train on your data. For maximum security, deploy a self-hosted model like Llama or Mistral behind your firewall. Always redact personally identifiable information before sending documents to external APIs.
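A redaction pass can be sketched with pattern substitution. The regexes below are illustrative only — they catch obvious formats (emails, US SSNs, US phone numbers) but miss names, addresses, and context-dependent identifiers, so production redaction should use an NER model on top:

```python
import re

# Illustrative patterns only; not a complete PII taxonomy.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Replace obvious PII patterns before the text leaves your network."""
    for pattern, token in REDACTIONS:
        text = pattern.sub(token, text)
    return text
```

You would run `redact` (plus any NER-based pass) on the extracted text before `extract_clauses`, and keep an unredacted copy only inside your own perimeter.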