
Building a Legal Reasoning Agent: Multi-Step Argument Construction with Evidence

Build an AI agent that performs structured legal reasoning — searching precedents, constructing multi-step arguments with evidence chains, generating counter-arguments, and producing balanced legal analysis.

Legal reasoning is fundamentally different from factual Q&A. A lawyer does not just retrieve facts — they construct arguments. Each argument has a claim, supporting evidence, a legal basis (statutes or precedent), and must withstand counter-arguments. This multi-step, adversarial structure makes legal reasoning an excellent test case for advanced agent architectures.

This tutorial builds a legal reasoning agent that can analyze a legal question, search for relevant precedents, construct structured arguments, and generate counter-arguments — all while maintaining proper evidence chains.

The Argument Data Model

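Legal arguments have a recursive structure: a claim rests on evidence, and each piece of evidence may itself be a claim that needs further support. The model below captures the recursion through counter_arguments, where every counter-argument is itself a LegalArgument.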

from pydantic import BaseModel
from enum import Enum

class EvidenceType(str, Enum):
    STATUTE = "statute"
    CASE_LAW = "case_law"
    REGULATION = "regulation"
    EXPERT_OPINION = "expert_opinion"
    FACTUAL = "factual"

class Evidence(BaseModel):
    source: str
    content: str
    evidence_type: EvidenceType
    relevance_score: float  # 0.0 to 1.0
    citation: str

class LegalArgument(BaseModel):
    claim: str
    supporting_evidence: list[Evidence]
    reasoning_chain: list[str]  # step-by-step logic
    strength: float  # 0.0 to 1.0
    counter_arguments: list["LegalArgument"] = []

class LegalAnalysis(BaseModel):
    question: str
    arguments_for: list[LegalArgument]
    arguments_against: list[LegalArgument]
    conclusion: str
    confidence: float

The agent needs a way to find relevant legal precedents. In production this would hit a legal database API (Westlaw, LexisNexis). Here we simulate it by asking the model itself for precedents, so treat every citation it returns as unverified:

from openai import OpenAI
import json

client = OpenAI()

def search_precedents(legal_issue: str, jurisdiction: str = "US Federal") -> list[Evidence]:
    """Search for relevant legal precedents."""
    response = client.chat.completions.create(
        model="gpt-4o",
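        messages=[
            {"role": "system", "content": (
                "You are a legal research assistant. Given a legal issue, "
                "identify the most relevant cases, statutes, and regulations. "
                "Return a JSON object with a single key 'evidence': an array of "
                "objects with the fields source, content (the key holding), "
                "evidence_type (statute, case_law, regulation, expert_opinion, "
                "or factual), relevance_score (0.0 to 1.0), and citation."
            )},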
            {"role": "user", "content": (
                f"Legal issue: {legal_issue}\n"
                f"Jurisdiction: {jurisdiction}\n"
                "Find 3-5 most relevant precedents."
            )},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [Evidence(**e) for e in data.get("evidence", [])]
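Model output does not always match the schema you ask for, and a single malformed entry would make Evidence(**e) raise. One possible guard, sketched below with the standard library only, filters the raw reply before it reaches the models. The function name parse_evidence_payload and the choice to silently drop bad entries are assumptions for illustration; you may prefer to log or retry instead.

```python
import json

REQUIRED_KEYS = {"source", "content", "evidence_type", "relevance_score", "citation"}
# Mirrors the EvidenceType values from the data model.
VALID_TYPES = ("statute", "case_law", "regulation", "expert_opinion", "factual")


def parse_evidence_payload(raw: str) -> list[dict]:
    """Return only the well-formed evidence items from a model's JSON reply."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    items = data.get("evidence", [])
    if not isinstance(items, list):
        return []

    clean = []
    for item in items:
        # Skip entries that are not objects or lack a required field.
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            continue
        if item["evidence_type"] not in VALID_TYPES:
            continue
        try:
            score = float(item["relevance_score"])
        except (TypeError, ValueError):
            continue
        # Clamp the score into the 0.0 to 1.0 range the data model documents.
        clean.append({**item, "relevance_score": min(1.0, max(0.0, score))})
    return clean
```

The returned dictionaries can then be passed to Evidence(**item) with far less risk of a validation error.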

Multi-Step Argument Construction

The argument builder works in three phases: (1) identify possible claims, (2) gather evidence for each, (3) construct the reasoning chain connecting evidence to claim.


def construct_argument(
    claim: str,
    evidence: list[Evidence],
    legal_question: str,
) -> LegalArgument:
    """Build a structured legal argument from claim and evidence."""
    evidence_summary = "\n".join(
        f"[{e.evidence_type.value}] {e.citation}: {e.content}"
        for e in evidence
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """You are a legal reasoning agent.
Construct a rigorous legal argument by:
1. Stating the claim clearly
2. Building a step-by-step reasoning chain from evidence to claim
3. Each step must cite specific evidence
4. Assess the overall strength of the argument (0.0-1.0)
5. Identify the weakest link in the reasoning chain

Return JSON with: reasoning_chain (list of steps), strength (float)."""},
            {"role": "user", "content": (
                f"Legal question: {legal_question}\n"
                f"Claim to support: {claim}\n"
                f"Available evidence:\n{evidence_summary}"
            )},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return LegalArgument(
        claim=claim,
        supporting_evidence=evidence,
        reasoning_chain=data["reasoning_chain"],
        strength=data["strength"],
    )
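The prompt asks that each step cite specific evidence, but nothing in the code checks that the model complied. A rough check, assuming a plain substring match is good enough for a first pass, is sketched below. The helper name uncited_steps is hypothetical, and the citations in the example are placeholders rather than real cases.

```python
def uncited_steps(reasoning_chain: list[str], citations: list[str]) -> list[int]:
    """Return the indexes of reasoning steps that mention no known citation."""
    known = [c.lower() for c in citations if c.strip()]
    missing = []
    for i, step in enumerate(reasoning_chain):
        text = step.lower()
        # A step counts as cited if any known citation appears verbatim in it.
        if not any(c in text for c in known):
            missing.append(i)
    return missing


# Placeholder citations and steps, used only to exercise the check.
chain = [
    "Under Alpha v. Beta, a duty of care arises between the parties.",
    "Section 12 of the Example Act extends that duty to contractors.",
    "Therefore the defendant owed the plaintiff a duty.",
]
print(uncited_steps(chain, ["Alpha v. Beta", "Example Act"]))
```

A final step that only draws the conclusion will often be flagged, so you might treat a flagged last step as acceptable and a flagged intermediate step as a reason to regenerate the argument.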

Counter-Argument Generation

A good legal analysis must address opposing views. The counter-argument generator takes an existing argument and attacks it:

def generate_counter_arguments(
    argument: LegalArgument,
    legal_question: str,
) -> list[LegalArgument]:
    """Generate counter-arguments that challenge the given argument."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """You are an opposing counsel.
Your job is to find flaws in the given argument and construct counter-arguments.
Attack strategies:
- Distinguish cited cases on facts
- Challenge the reasoning chain logic
- Cite conflicting precedent
- Argue policy implications
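Return a JSON object with key "counter_arguments": a list of 2-3 objects,
each with claim (string), reasoning_chain (list of steps), and strength (0.0-1.0)."""},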
            {"role": "user", "content": (
                f"Question: {legal_question}\n"
                f"Argument to counter:\n"
                f"Claim: {argument.claim}\n"
                f"Reasoning: {argument.reasoning_chain}"
            )},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    counters = []
    for c in data.get("counter_arguments", []):
        counters.append(LegalArgument(
            claim=c["claim"],
            supporting_evidence=[],
            reasoning_chain=c["reasoning_chain"],
            strength=c["strength"],
        ))
    return counters

The Full Analysis Pipeline

def analyze_legal_question(question: str) -> LegalAnalysis:
    # 1. Search for relevant precedents
    evidence = search_precedents(question)

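    # 2. Identify claims for and against. identify_claims is not defined in this
    #    article; it is expected to return {"for": [...], "against": [...]}.
    claims = identify_claims(question, evidence)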

    # 3. Construct arguments for each side
    args_for = [construct_argument(c, evidence, question) for c in claims["for"]]
    args_against = [construct_argument(c, evidence, question) for c in claims["against"]]

    # 4. Generate counter-arguments
    for arg in args_for:
        arg.counter_arguments = generate_counter_arguments(arg, question)

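    # 5. Synthesize conclusion. synthesize_conclusion is also left undefined here;
    #    it is expected to return the conclusion as a string.
    conclusion = synthesize_conclusion(question, args_for, args_against)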

    return LegalAnalysis(
        question=question,
        arguments_for=args_for,
        arguments_against=args_against,
        conclusion=conclusion,
        confidence=0.7,  # hardcoded placeholder, not derived from the arguments
    )
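The pipeline calls identify_claims without defining it, and it returns a fixed confidence of 0.7. The sketch below shows one way both gaps might be filled. Everything in it is an assumption: the extra llm parameter (a callable taking a system prompt and a user prompt and returning raw JSON text) is added so the function can be tested without a network call, and estimate_confidence is a simple heuristic of our own, not an established measure.

```python
import json
from statistics import mean
from typing import Callable, Sequence

# Hypothetical seam: (system_prompt, user_prompt) -> raw JSON string.
# In the tutorial's setup this would wrap client.chat.completions.create(...).
LLMCall = Callable[[str, str], str]


def identify_claims(question: str, evidence: Sequence, llm: LLMCall) -> dict[str, list[str]]:
    """Ask the model for claims on each side; always return both keys."""
    # Each evidence item only needs .citation and .content attributes.
    summary = "\n".join(f"{e.citation}: {e.content}" for e in evidence)
    raw = llm(
        "You are a legal analyst. List the strongest claims on each side of "
        'the question. Return a JSON object: {"for": [...], "against": [...]}.',
        f"Legal question: {question}\nEvidence:\n{summary}",
    )
    data = json.loads(raw)
    return {side: [str(c) for c in data.get(side, [])] for side in ("for", "against")}


def estimate_confidence(strengths_for: list[float], strengths_against: list[float]) -> float:
    """Assumed heuristic: 0.5 when the sides are evenly matched, rising
    toward 1.0 as one side's average strength dominates the other's."""
    avg_for = mean(strengths_for) if strengths_for else 0.0
    avg_against = mean(strengths_against) if strengths_against else 0.0
    return 0.5 + abs(avg_for - avg_against) / 2
```

With these in place, the pipeline could pass estimate_confidence([a.strength for a in args_for], [a.strength for a in args_against]) instead of a constant. Note that this number only measures how lopsided the model's own strength scores are; it says nothing about whether the underlying citations are real.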

Important Disclaimers

This agent is a reasoning tool, not a replacement for licensed attorneys. It cannot guarantee legal accuracy, may miss jurisdiction-specific nuances, and should never be the sole basis for legal decisions.

FAQ

How do you ensure the agent cites real cases?

In production, connect the precedent search to a real legal database API. When using LLM-generated citations, always flag them as "AI-generated — verify before citing" and implement a validation step against a case law database.
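A minimal sketch of that validation step follows. The known_citations set stands in for a lookup against a real case law database, which this article does not specify, and the names normalize_citation, flag_citations, and UNVERIFIED_LABEL are hypothetical.

```python
UNVERIFIED_LABEL = "AI-generated: verify before citing"


def normalize_citation(citation: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences match."""
    return " ".join(citation.lower().split())


def flag_citations(citations: list[str], known_citations: set[str]) -> list[dict]:
    """Mark each citation as verified or attach the unverified label.

    known_citations is a stand-in for a real database query.
    """
    index = {normalize_citation(c) for c in known_citations}
    flagged = []
    for citation in citations:
        verified = normalize_citation(citation) in index
        flagged.append({
            "citation": citation,
            "verified": verified,
            "label": "" if verified else UNVERIFIED_LABEL,
        })
    return flagged
```

Exact matching after normalization is deliberately strict: a citation that differs by a single reporter abbreviation will be flagged, which is the safer failure mode for legal work.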

Can this handle multiple jurisdictions?

Yes, by parameterizing the precedent search with jurisdiction and instructing the reasoning agent to consider jurisdictional differences. Multi-jurisdiction analysis requires separate evidence gathering for each jurisdiction and explicit conflict-of-law analysis.
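The separate evidence gathering might look like the sketch below. It assumes a search callable with the same (legal_issue, jurisdiction) signature as search_precedents, passed in as a parameter so the loop can be tested with a stub. The function names are illustrative, and the conflict-of-law analysis itself is not attempted here.

```python
from typing import Callable


def gather_by_jurisdiction(
    legal_issue: str,
    jurisdictions: list[str],
    search: Callable[[str, str], list],
) -> dict[str, list]:
    """Run one search per jurisdiction and keep the results separate."""
    results: dict[str, list] = {}
    for jurisdiction in jurisdictions:
        results[jurisdiction] = search(legal_issue, jurisdiction)
    return results


def jurisdictions_without_authority(results: dict[str, list]) -> list[str]:
    """Jurisdictions where the search returned nothing, so the gap is visible."""
    return [name for name, evidence in results.items() if not evidence]
```

Keeping the results keyed by jurisdiction, rather than merging them into one list, lets the reasoning step see which authority applies where.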

How do you evaluate argument quality?

Use a separate evaluator agent that scores arguments on: logical validity (does the conclusion follow from the premises?), evidence quality (are sources authoritative and relevant?), and completeness (are there obvious gaps in the reasoning chain?).
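Once the evaluator has produced the three scores, they still need to be combined. One simple option, sketched below, is a weighted average. The weights are an assumption chosen for illustration (logical validity weighted highest), and ArgumentScores and overall_quality are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ArgumentScores:
    """Scores an evaluator agent might return, each in the range 0.0 to 1.0."""
    logical_validity: float
    evidence_quality: float
    completeness: float


# Assumed weights; they sum to 1.0 so the result stays in the 0.0 to 1.0 range.
WEIGHTS = {"logical_validity": 0.5, "evidence_quality": 0.3, "completeness": 0.2}


def overall_quality(scores: ArgumentScores) -> float:
    """Weighted average of the three criteria; rejects out-of-range scores."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = getattr(scores, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
        total += weight * value
    return total
```

Reporting the three scores alongside the combined number is usually more useful than the number alone, since a high average can hide a broken reasoning chain.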




CallSphere Team
