Building a Text-to-SQL Agent with GPT-4: Schema-Aware Query Generation
Build a complete text-to-SQL agent using GPT-4 that extracts database schemas, generates SQL queries from natural language, executes them safely, and formats results for end users.
Why Build an Agent, Not Just a Prompt?
A simple text-to-SQL prompt generates a query and stops. An agent goes further: it extracts the schema automatically, generates the query, executes it, handles errors, retries with corrections, and formats the final answer. This agentic loop is what makes the difference between a demo and a production tool.
In this tutorial, you will build a complete text-to-SQL agent using GPT-4 that works against any PostgreSQL database.
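Stripped of the GPT-4 and PostgreSQL specifics, the core loop looks like this. This is a minimal sketch with stubbed `generate_query`/`run_query` functions — both names are illustrative stand-ins, not part of the agent built below:

```python
# Minimal sketch of the generate -> execute -> retry loop.
# generate_query and run_query are stubs standing in for the real
# GPT-4 call and database execution built later in this tutorial.

def generate_query(question: str, error_context: str = "") -> str:
    # Stub: a real implementation would call the LLM here.
    if error_context:
        return "SELECT 1"   # "corrected" query once the error is fed back
    return "SELEC 1"        # deliberately broken first attempt

def run_query(sql: str) -> list:
    # Stub: a real implementation would hit the database.
    if not sql.upper().startswith("SELECT"):
        raise ValueError(f"syntax error near: {sql}")
    return [(1,)]

def ask(question: str, max_retries: int = 3) -> dict:
    error_context = ""
    for attempt in range(max_retries):
        sql = generate_query(question, error_context)
        try:
            return {"sql": sql, "results": run_query(sql),
                    "attempts": attempt + 1}
        except Exception as e:
            # Feed the failure back so the next attempt can self-correct
            error_context = f"Query: {sql}\nError: {e}"
    return {"sql": sql, "error": error_context, "attempts": max_retries}
```

Everything that follows fills in the two stubs: schema-aware query generation and real PostgreSQL execution.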
Step 1: Schema Extraction
The first component extracts a structured schema representation from your database. This gives the LLM the context it needs to write accurate queries.
import psycopg2
from dataclasses import dataclass

@dataclass
class ColumnInfo:
    name: str
    data_type: str
    is_nullable: bool
    is_primary_key: bool

@dataclass
class TableInfo:
    name: str
    columns: list[ColumnInfo]
    foreign_keys: list[str]

def extract_schema(conn_string: str) -> list[TableInfo]:
    """Extract full schema metadata from a PostgreSQL database."""
    conn = psycopg2.connect(conn_string)
    cur = conn.cursor()

    # Get all user tables
    cur.execute("""
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
    """)
    tables = []
    for (table_name,) in cur.fetchall():
        # Get columns
        cur.execute("""
            SELECT column_name, data_type, is_nullable
            FROM information_schema.columns
            WHERE table_schema = 'public' AND table_name = %s
            ORDER BY ordinal_position
        """, (table_name,))
        columns = [
            ColumnInfo(name=row[0], data_type=row[1],
                       is_nullable=row[2] == "YES", is_primary_key=False)
            for row in cur.fetchall()
        ]

        # Get primary keys
        cur.execute("""
            SELECT kcu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'PRIMARY KEY'
        """, (table_name,))
        pk_columns = {row[0] for row in cur.fetchall()}
        for col in columns:
            col.is_primary_key = col.name in pk_columns

        # Get foreign keys
        cur.execute("""
            SELECT kcu.column_name, ccu.table_name, ccu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            JOIN information_schema.constraint_column_usage ccu
              ON tc.constraint_name = ccu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'FOREIGN KEY'
        """, (table_name,))
        fks = [f"{r[0]} -> {r[1]}.{r[2]}" for r in cur.fetchall()]

        tables.append(TableInfo(name=table_name, columns=columns, foreign_keys=fks))

    conn.close()
    return tables
Step 2: Schema Formatting for the Prompt
The schema needs to be formatted in a way the LLM can parse efficiently. CREATE TABLE syntax tends to work best because LLMs have seen vast quantities of SQL DDL during training.
def format_schema(tables: list[TableInfo]) -> str:
    """Format schema as CREATE TABLE statements."""
    output = []
    for table in tables:
        cols = []
        for c in table.columns:
            parts = [f"  {c.name} {c.data_type}"]
            if c.is_primary_key:
                parts.append("PRIMARY KEY")
            if not c.is_nullable:
                parts.append("NOT NULL")
            cols.append(" ".join(parts))
        create = f"CREATE TABLE {table.name} (\n"
        create += ",\n".join(cols)
        for fk in table.foreign_keys:
            col, ref = fk.split(" -> ")
            ref_table, ref_col = ref.split(".", 1)
            create += f",\n  FOREIGN KEY ({col}) REFERENCES {ref_table} ({ref_col})"
        create += "\n);"
        output.append(create)
    return "\n\n".join(output)
Step 3: The Query Generation Agent
Now combine schema extraction with GPT-4 to create a full agent that generates, executes, and retries queries.
import openai
import json

class TextToSQLAgent:
    def __init__(self, conn_string: str):
        self.conn_string = conn_string
        self.client = openai.OpenAI()
        self.schema = format_schema(extract_schema(conn_string))
        self.max_retries = 3

    def generate_sql(self, question: str, error_context: str = "") -> str:
        messages = [
            {
                "role": "system",
                "content": f"""You are a PostgreSQL expert. Convert the user's
question into a valid SQL query using this schema:

{self.schema}

Rules:
- Return ONLY the SQL query
- Use PostgreSQL syntax
- Always use table aliases for JOINs
- Limit results to 100 rows unless specified otherwise
{f"Previous error: {error_context}" if error_context else ""}""",
            },
            {"role": "user", "content": question},
        ]
        response = self.client.chat.completions.create(
            model="gpt-4o", messages=messages, temperature=0
        )
        sql = response.choices[0].message.content.strip()
        # Strip markdown code fences (``` or ```sql) if present
        if sql.startswith("```"):
            sql = sql.split("\n", 1)[1].rsplit("```", 1)[0].strip()
        return sql

    def execute(self, sql: str) -> list[dict]:
        conn = psycopg2.connect(self.conn_string)
        conn.set_session(readonly=True)  # reject writes at the session level
        try:
            cur = conn.cursor()
            cur.execute(sql)
            columns = [desc[0] for desc in cur.description]
            return [dict(zip(columns, row)) for row in cur.fetchall()]
        finally:
            # Close even when the query fails, so retries get a fresh connection
            conn.close()

    def ask(self, question: str) -> dict:
        error_context = ""
        for attempt in range(self.max_retries):
            sql = self.generate_sql(question, error_context)
            try:
                results = self.execute(sql)
                return {"sql": sql, "results": results, "attempts": attempt + 1}
            except Exception as e:
                error_context = f"Query: {sql}\nError: {str(e)}"
        return {"sql": sql, "error": error_context, "attempts": self.max_retries}
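The agent runs whatever SQL the model produced, so a guard in front of execution is worth having. A minimal sketch, assuming you also run the agent under a read-only database role — `is_read_only_select` is a hypothetical helper, not part of the class above:

```python
def is_read_only_select(sql: str) -> bool:
    """Heuristic guard: accept only a single SELECT/WITH statement.

    A lightweight pre-check, not a substitute for database-level
    permissions (a read-only role remains the real safety boundary).
    """
    stripped = sql.strip().rstrip(";")
    # Reject stacked statements like "SELECT ...; DROP TABLE ..."
    if ";" in stripped:
        return False
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    return first_word in ("SELECT", "WITH")
```

Call it between generation and execution, and treat a rejection like any other error so the retry loop can ask the model for a plain SELECT.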
Step 4: Using the Agent
agent = TextToSQLAgent("postgresql://user:pass@localhost/sales_db")
answer = agent.ask("Which salesperson had the highest revenue last quarter?")
print(f"SQL: {answer['sql']}")
print(f"Results: {json.dumps(answer['results'], indent=2)}")
print(f"Resolved in {answer['attempts']} attempt(s)")
The retry mechanism is critical. In practice, roughly 10-15% of first-attempt queries contain minor errors that the LLM can self-correct when given the error message as context.
FAQ
Why use CREATE TABLE format instead of JSON for schema context?
LLMs have been trained on vastly more SQL DDL than structured JSON schema descriptions. Using CREATE TABLE statements consistently produces higher accuracy because the model can directly pattern-match against its training data. Benchmarks show 3-5% accuracy improvement with DDL format.
How do I handle very large schemas with hundreds of tables?
For large schemas, use a two-stage approach: first ask the LLM to identify which tables are relevant to the question, then provide only those tables in the generation prompt. This keeps the context window manageable and improves accuracy by reducing noise.
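The second stage should not blindly trust the first. A minimal sketch of validating the selection answer, assuming the model returns a comma-separated list of table names — `filter_relevant_tables` is a hypothetical helper:

```python
def filter_relevant_tables(llm_answer: str, known_tables: list[str]) -> list[str]:
    """Validate a comma-separated table list from the selection prompt.

    Keeps only names that actually exist in the schema (case-insensitive,
    preserving the schema's casing), so hallucinated tables never reach
    the generation prompt.
    """
    known = {t.lower(): t for t in known_tables}
    requested = (name.strip().lower() for name in llm_answer.split(","))
    return [known[name] for name in requested if name in known]
```

The validated list can then drive format_schema on just the surviving tables.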
Should I use GPT-4 or GPT-4o for text-to-SQL?
GPT-4o is recommended for most use cases. It offers comparable SQL generation accuracy to GPT-4 at significantly lower cost and latency. For extremely complex queries involving multiple CTEs or window functions, GPT-4 may produce slightly better results, but the difference is usually within 2-3%.
CallSphere Team
Expert insights on AI voice agents and customer communication automation.