AI Agent for Document Generation: Contracts, Proposals, and Reports on Demand
Build an AI agent that generates professional documents like contracts, proposals, and reports by combining template engines, dynamic data injection, and PDF rendering with version tracking.
From Manual Documents to Automated Generation
Every business produces documents: contracts for new clients, proposals for deals, weekly reports for stakeholders, and invoices for accounting. These documents follow consistent templates but require unique data for each instance. A document generation agent combines template engines for structure, LLM reasoning for dynamic content, and PDF rendering for professional output.
This guide walks through building a complete document generation agent that accepts structured data, fills templates, generates custom sections with AI, renders PDFs, and tracks versions.
Defining Document Templates
We use Jinja2 as the template engine. Each template is an HTML file with placeholders for dynamic data. HTML-to-PDF conversion produces professional output with CSS styling:
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

from jinja2 import Environment, FileSystemLoader


@dataclass
class DocumentTemplate:
    name: str
    template_file: str
    required_fields: list[str]
    ai_sections: list[str] = field(default_factory=list)


TEMPLATES = {
    "contract": DocumentTemplate(
        name="Service Agreement",
        template_file="contract.html",
        required_fields=["client_name", "client_address", "service_description",
                         "start_date", "end_date", "total_amount"],
        ai_sections=["scope_of_work", "termination_clause"],
    ),
    "proposal": DocumentTemplate(
        name="Business Proposal",
        template_file="proposal.html",
        required_fields=["prospect_name", "company", "problem_statement",
                         "budget_range"],
        ai_sections=["executive_summary", "proposed_solution", "timeline"],
    ),
    "report": DocumentTemplate(
        name="Weekly Report",
        template_file="report.html",
        required_fields=["team_name", "week_start", "metrics", "highlights"],
        ai_sections=["analysis", "recommendations"],
    ),
}

# Templates are loaded from the local "templates" directory
env = Environment(loader=FileSystemLoader("templates"))
Each template declares which fields are required from the user and which sections should be generated by the AI. This separation keeps humans in control of factual data while delegating narrative writing to the LLM.
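The template files themselves are not shown in this guide, so here is a minimal sketch of what a report template might look like and how Jinja2 fills it. The markup, the demo_env name, and the render_report helper are assumptions for illustration. The sketch also turns on autoescaping, which the Environment above does not do; without it, a value such as "R&D" or any text containing angle brackets is written into the HTML unescaped.

```python
from jinja2 import DictLoader, Environment, select_autoescape

# Hypothetical stand-in for templates/report.html. The real file is not shown
# in the guide, so this markup is an assumption.
REPORT_TEMPLATE = """\
<h1>Weekly Report: {{ team_name }}</h1>
<p>Week of {{ week_start }}</p>
<ul>
{% for item in highlights %}  <li>{{ item }}</li>
{% endfor %}</ul>
<h2>Analysis</h2>
<p>{{ analysis }}</p>
"""

# DictLoader keeps the sketch self-contained; FileSystemLoader is used the same way.
# select_autoescape(["html"]) enables escaping for templates whose name ends in .html.
demo_env = Environment(
    loader=DictLoader({"report.html": REPORT_TEMPLATE}),
    autoescape=select_autoescape(["html"]),
)


def render_report(data: dict) -> str:
    """Render the demo report template with user data and AI sections."""
    return demo_env.get_template("report.html").render(**data)


sample_html = render_report(
    {
        "team_name": "R&D",
        "week_start": "2025-03-03",
        "highlights": ["Shipped v2 <beta>", "Closed 14 tickets"],
        "analysis": "Throughput improved week over week.",
    }
)
print(sample_html)
```

If a template needs to render trusted HTML produced elsewhere, Jinja2's safe filter can opt individual values out of escaping.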
Generating AI-Powered Sections
The agent generates each AI section from the structured data. Every section has its own targeted prompt, sent as the system message, with the document data supplied as context in the user message:
from openai import OpenAI

client = OpenAI()

SECTION_PROMPTS = {
    "executive_summary": (
        "Write a concise executive summary for a business proposal. "
        "Focus on the client's problem and why our solution is the best fit. "
        "Keep it under 150 words. Use a professional but approachable tone."
    ),
    "proposed_solution": (
        "Describe the proposed solution in detail. Include methodology, "
        "deliverables, and key differentiators. Use bullet points for clarity."
    ),
    "scope_of_work": (
        "Write a clear scope of work clause for a service agreement. "
        "Be specific about what is included and what is excluded."
    ),
    "termination_clause": (
        "Write a standard termination clause. Include notice period, "
        "grounds for termination, and obligations upon termination."
    ),
    "analysis": (
        "Analyze the metrics and highlights provided. Identify trends, "
        "areas of concern, and positive developments."
    ),
    "recommendations": (
        "Based on the analysis, provide 3-5 actionable recommendations "
        "for the next week. Be specific and prioritized."
    ),
    "timeline": (
        "Create a realistic project timeline with milestones. "
        "Include discovery, implementation, testing, and launch phases."
    ),
}


def generate_section(section_name: str, context: dict[str, Any]) -> str:
    """Generate a document section using an LLM."""
    # Fall back to a generic prompt for sections without a dedicated one
    prompt = SECTION_PROMPTS.get(section_name, f"Write the {section_name} section.")
    context_str = "\n".join(f"{k}: {v}" for k, v in context.items())
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.4,
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": f"Document context:\n{context_str}"},
        ],
    )
    # message.content can be None, so return an empty string in that case
    return response.choices[0].message.content or ""
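One detail in generate_section is worth a closer look: the context is flattened with a plain f-string, so nested values such as a report's metrics dictionary reach the model as Python reprs. A minimal sketch of an alternative formatter follows, assuming JSON is an acceptable way to present nested values to the model. The format_context name is a hypothetical helper, not part of the code above.

```python
import json
from typing import Any


def format_context(context: dict[str, Any]) -> str:
    """Render context as 'key: value' lines, JSON-encoding non-string values.

    Hypothetical helper: strings pass through unchanged, everything else
    (dicts, lists, numbers) is serialized with json.dumps.
    """
    lines = []
    for key, value in context.items():
        if isinstance(value, str):
            rendered = value
        else:
            # default=str is an assumption so dates and similar objects do not raise
            rendered = json.dumps(value, ensure_ascii=False, default=str)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines)


example_context = format_context(
    {"team_name": "Platform", "metrics": {"uptime": 99.9, "incidents": 2}}
)
print(example_context)
```

The result could be dropped into the user message in place of context_str.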
Building the Document Assembly Pipeline
The assembly pipeline validates input data, generates any missing AI sections, renders the template to HTML, and computes a version hash. PDF rendering is a separate step, covered in the next section:
import hashlib
import json


@dataclass
class GeneratedDocument:
    template_name: str
    html_content: str
    data: dict[str, Any]
    version_hash: str
    created_at: str


def assemble_document(template_key: str, data: dict[str, Any]) -> GeneratedDocument:
    """Assemble a complete document from template and data."""
    template_def = TEMPLATES[template_key]

    # Validate required fields
    missing = [f for f in template_def.required_fields if f not in data]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")

    # Work on a copy so the caller's dict is not mutated
    data = dict(data)

    # Generate AI sections that the caller did not supply
    for section in template_def.ai_sections:
        if section not in data:
            data[section] = generate_section(section, data)

    # Render HTML template; one timestamp is reused for the date and the record
    now = datetime.now()
    template = env.get_template(template_def.template_file)
    html = template.render(**data, generated_date=now.strftime("%B %d, %Y"))

    # Compute version hash for tracking (covers inputs and AI sections)
    content_hash = hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()[:12]

    return GeneratedDocument(
        template_name=template_def.name,
        html_content=html,
        data=data,
        version_hash=content_hash,
        created_at=now.isoformat(),
    )
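The version hash deserves a closer look. In assemble_document it is computed over the whole data dictionary after the AI sections have been added, so two runs with identical inputs can still produce different hashes if the model words a section differently. The sketch below isolates the hashing step and adds an optional fields argument for hashing only the human-supplied inputs. The compute_version_hash name, the fields parameter, and the use of default=str are assumptions for illustration.

```python
import hashlib
import json
from typing import Any, Optional


def compute_version_hash(
    data: dict[str, Any], fields: Optional[list[str]] = None, length: int = 12
) -> str:
    """Hash a data dict deterministically.

    sort_keys=True makes the hash independent of key order. If fields is
    given, only those keys are hashed (hypothetical option for hashing the
    required inputs without the AI-generated sections).
    """
    subset = data if fields is None else {k: data[k] for k in fields if k in data}
    # default=str is an assumption so non-JSON values such as dates do not raise
    canonical = json.dumps(subset, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


same_a = compute_version_hash({"client_name": "Acme", "total_amount": 5000})
same_b = compute_version_hash({"total_amount": 5000, "client_name": "Acme"})
changed = compute_version_hash({"client_name": "Acme", "total_amount": 5500})

# Two drafts with different AI wording but identical inputs
draft_1 = {"client_name": "Acme", "scope_of_work": "Draft wording one."}
draft_2 = {"client_name": "Acme", "scope_of_work": "Draft wording two."}
inputs_1 = compute_version_hash(draft_1, fields=["client_name"])
inputs_2 = compute_version_hash(draft_2, fields=["client_name"])
```

Whether to hash the full content or only the inputs depends on what a "version" should mean: the first identifies an exact document, the second identifies a request.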
Rendering PDFs with WeasyPrint
WeasyPrint converts HTML with CSS directly to PDF. It handles page breaks, headers, footers, and professional typography:
from pathlib import Path

from weasyprint import HTML


def render_pdf(document: GeneratedDocument, output_dir: str = "output") -> str:
    """Render an assembled document to PDF."""
    # parents=True also creates missing parent directories of a nested output path
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    filename = (
        f"{document.template_name.replace(' ', '_').lower()}"
        f"_{document.version_hash}.pdf"
    )
    filepath = Path(output_dir) / filename
    HTML(string=document.html_content).write_pdf(str(filepath))
    return str(filepath)
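Page breaks, margins, and footers are controlled through CSS rather than through WeasyPrint's Python API. The following is a minimal sketch of a print stylesheet and a helper that wraps a rendered body in a full HTML document. The class names, page size, and the wrap_with_print_styles helper are assumptions; the sketch builds only the HTML string, which could then be handed to render_pdf or to WeasyPrint directly.

```python
import html

# Hypothetical print stylesheet. @page margin boxes and the page counters are
# CSS Paged Media features; the class names below are assumptions.
PRINT_CSS = """
@page {
    size: A4;
    margin: 2.5cm 2cm;
    @bottom-center {
        content: "Page " counter(page) " of " counter(pages);
        font-size: 9pt;
    }
}
h2 { break-after: avoid; }           /* keep headings with the text that follows */
.section { break-inside: avoid; }    /* avoid splitting a section across pages */
.page-break { break-before: page; }  /* force a new page before this element */
"""


def wrap_with_print_styles(body_html: str, title: str) -> str:
    """Wrap an HTML fragment in a complete document with the print stylesheet."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title>'
        f"<style>{PRINT_CSS}</style></head>"
        f"<body>{body_html}</body></html>"
    )


styled = wrap_with_print_styles('<div class="section"><h2>Scope</h2></div>', "Q3 & Q4 Report")
```

In practice the stylesheet would more likely live in the Jinja2 template itself; keeping it in a string here just makes the sketch self-contained.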
Version Tracking and Storage
Every generated document is tracked with its input data, version hash, and metadata. This enables auditing and regeneration:
import sqlite3


def init_db(db_path: str = "documents.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            template_name TEXT NOT NULL,
            version_hash TEXT NOT NULL,
            input_data TEXT NOT NULL,
            pdf_path TEXT,
            created_at TEXT NOT NULL
        )
    """)
    conn.commit()
    return conn


def save_document_record(conn: sqlite3.Connection, doc: GeneratedDocument, pdf_path: str):
    conn.execute(
        "INSERT INTO documents (template_name, version_hash, input_data, pdf_path, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (doc.template_name, doc.version_hash, json.dumps(doc.data), pdf_path, doc.created_at),
    )
    conn.commit()
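Auditing and regeneration both come down to reading these records back. Here is a minimal sketch of two lookups against the same table, run on an in-memory database so it works on its own. The list_versions and load_input_data names are hypothetical helpers, and the sample rows are made up for the demo.

```python
import json
import sqlite3
from typing import Any, Optional

# Same schema as init_db, repeated so the sketch runs standalone.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    template_name TEXT NOT NULL,
    version_hash TEXT NOT NULL,
    input_data TEXT NOT NULL,
    pdf_path TEXT,
    created_at TEXT NOT NULL
)
"""


def list_versions(conn: sqlite3.Connection, template_name: str) -> list[tuple]:
    """Return (version_hash, pdf_path, created_at) rows, newest first.

    ISO 8601 timestamps sort correctly as text, so ORDER BY created_at works.
    """
    return conn.execute(
        "SELECT version_hash, pdf_path, created_at FROM documents "
        "WHERE template_name = ? ORDER BY created_at DESC, id DESC",
        (template_name,),
    ).fetchall()


def load_input_data(conn: sqlite3.Connection, version_hash: str) -> Optional[dict[str, Any]]:
    """Return the stored data for a version, or None if the hash is unknown."""
    row = conn.execute(
        "SELECT input_data FROM documents WHERE version_hash = ? ORDER BY id DESC LIMIT 1",
        (version_hash,),
    ).fetchone()
    return json.loads(row[0]) if row else None


# Demo with made-up records
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(SCHEMA)
demo_conn.executemany(
    "INSERT INTO documents (template_name, version_hash, input_data, pdf_path, created_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Service Agreement", "aaa111aaa111",
         json.dumps({"client_name": "Acme", "total_amount": 5000}),
         "output/service_agreement_aaa111aaa111.pdf", "2025-03-01T09:00:00"),
        ("Service Agreement", "bbb222bbb222",
         json.dumps({"client_name": "Acme", "total_amount": 5500}),
         "output/service_agreement_bbb222bbb222.pdf", "2025-03-02T10:30:00"),
    ],
)
demo_conn.commit()

versions = list_versions(demo_conn, "Service Agreement")
```

Because the stored data already contains the AI-generated sections, passing it back to assemble_document reproduces the same text without new LLM calls.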
FAQ
How do I ensure AI-generated legal clauses are accurate?
Never deploy AI-generated legal text without lawyer review. Use the AI to generate first drafts based on your approved clause library, then flag all AI-generated sections for human review. Store approved clause variants as few-shot examples in your prompts to improve consistency.
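As a rough sketch of that workflow, approved clauses can be appended to the system prompt as examples, and every generated section can carry a flag that stays unapproved until a person signs off. The clause text, the APPROVED_CLAUSES structure, and both helper names are hypothetical; a real clause library would hold lawyer-approved wording.

```python
from typing import Any

# Hypothetical clause library; the text below is a placeholder, not legal advice.
APPROVED_CLAUSES: dict[str, list[str]] = {
    "termination_clause": [
        "Either party may terminate this Agreement with thirty (30) days' "
        "written notice to the other party.",
    ],
}


def build_clause_messages(
    section_name: str, base_prompt: str, context: dict[str, Any]
) -> list[dict[str, str]]:
    """Build chat messages that include approved clauses as few-shot examples."""
    system = base_prompt
    examples = APPROVED_CLAUSES.get(section_name, [])
    if examples:
        numbered = "\n\n".join(
            f"Example {i}:\n{text}" for i, text in enumerate(examples, 1)
        )
        system += (
            "\n\nFollow the wording and structure of these approved clauses:\n\n"
            + numbered
        )
    context_str = "\n".join(f"{k}: {v}" for k, v in context.items())
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Document context:\n{context_str}"},
    ]


def mark_for_review(section_name: str, text: str) -> dict[str, Any]:
    """Wrap AI output with review metadata; approval is a separate human step."""
    return {"section": section_name, "text": text, "ai_generated": True, "approved": False}


messages = build_clause_messages(
    "termination_clause",
    "Write a standard termination clause.",
    {"client_name": "Acme"},
)
pending = mark_for_review("termination_clause", "Draft clause text.")
```

The messages list has the same shape as the one passed to the chat completions call in generate_section.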
Can I add custom branding like logos and company colors?
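Yes. The HTML templates support full CSS, including custom fonts, colors, and embedded images. Either embed images as base64 data URIs in the template or reference image files by relative path. Because the document is rendered from a string, relative paths only resolve if you pass base_url when constructing WeasyPrint's HTML object. WeasyPrint renders with the print media type, so rules inside @media print apply, and @page rules control page-level styling such as margins and footers.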
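For the base64 route, a minimal sketch of turning an image into a data URI that can be passed to the template as an ordinary variable. The helper names and the logo_src variable are assumptions for illustration.

```python
import base64
import mimetypes
from pathlib import Path


def bytes_to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_data_uri(path: str) -> str:
    """Read an image file and return it as a data URI (hypothetical helper)."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Cannot determine image type for {path}")
    return bytes_to_data_uri(Path(path).read_bytes(), mime_type)


# In a template this might be used as: <img src="{{ logo_src }}" alt="Logo">
# where logo_src = image_data_uri("templates/logo.png") is added to the data dict.
tiny = bytes_to_data_uri(b"hi", "image/png")
```

Data URIs make the HTML self-contained at the cost of larger strings, which matters if the rendered HTML is also stored for diffing.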
How do I handle document revisions and track changes?
Store each version with its input data and version hash. To show changes between versions, diff the rendered HTML or the input data dictionaries. The version hash is computed over the full data dictionary, including the AI-generated sections, so it changes whenever an input field changes and also whenever a section is regenerated with different wording.
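Both kinds of diff can be sketched with the standard library. The function names and the MISSING sentinel below are assumptions for illustration; the data diff reports added, removed, and changed keys, and the HTML diff uses difflib's unified format.

```python
import difflib
from typing import Any

# Sentinel marking a key that is absent from one of the two versions
MISSING = object()


def diff_data(old: dict[str, Any], new: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (old_value, new_value)} for every key that differs."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before = old.get(key, MISSING)
        after = new.get(key, MISSING)
        if before != after:
            changes[key] = (before, after)
    return changes


def diff_html(old_html: str, new_html: str) -> list[str]:
    """Return a unified diff of two rendered HTML strings, line by line."""
    return list(
        difflib.unified_diff(
            old_html.splitlines(),
            new_html.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


v1 = {"client_name": "Acme", "total_amount": 5000}
v2 = {"client_name": "Acme", "total_amount": 5500, "end_date": "2025-12-31"}
changes = diff_data(v1, v2)
html_changes = diff_html("<p>Total: 5000</p>", "<p>Total: 5500</p>")
```

Diffing the data is usually easier to read than diffing HTML, since a single field change can touch several places in the rendered output.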
CallSphere Team