
Building Your First AI Agent from Scratch in Python (No Framework)

Build a complete, working AI agent from scratch using only the OpenAI API and Python standard library — with a tool loop, conversation management, structured tool calling, and error handling.

Why Build Without a Framework?

Frameworks like LangChain, CrewAI, and the OpenAI Agents SDK are excellent tools — but if you use them without understanding what they do underneath, you are building on a foundation you cannot debug, optimize, or extend. Building an agent from scratch once teaches you the mechanics that every framework abstracts away.

This tutorial builds a fully functional agent with tools, conversation management, and error handling using only the OpenAI Python SDK and the standard library. No LangChain, no Agents SDK, no abstractions. Just the raw loop.

What We Are Building

A personal assistant agent that can:

  • Look up the current time in any timezone
  • Do math calculations
  • Read and summarize web page content (simulated)
  • Maintain a conversation across multiple turns

Step 1: Project Setup

Create a single file, agent.py. No dependencies beyond the OpenAI SDK:

# agent.py
import json
import math
from datetime import datetime, timezone, timedelta
from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env variable
MODEL = "gpt-4o"

Step 2: Define Tools as Python Functions

Each tool is a regular Python function. The matching JSON Schema definitions that the OpenAI API requires come in Step 3.

def get_current_time(timezone_offset: int = 0) -> dict:
    """Get the current time with an optional UTC offset."""
    tz = timezone(timedelta(hours=timezone_offset))
    now = datetime.now(tz)
    return {
        "time": now.strftime("%H:%M:%S"),
        "date": now.strftime("%Y-%m-%d"),
        "timezone": f"UTC{'+' if timezone_offset >= 0 else ''}{timezone_offset}",
    }

def calculate(expression: str) -> dict:
    """Evaluate a math expression in a restricted namespace (not a full sandbox)."""
    # Whitelist the names eval may see; an empty __builtins__ blocks most, not all, abuse
    allowed_names = {
        "abs": abs, "round": round, "min": min, "max": max,
        "sum": sum, "pow": pow, "sqrt": math.sqrt,
        "pi": math.pi, "e": math.e, "log": math.log,
        "sin": math.sin, "cos": math.cos, "tan": math.tan,
    }
    try:
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return {"expression": expression, "result": result}
    except Exception as e:
        return {"error": f"Cannot evaluate '{expression}': {str(e)}"}
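
An empty `__builtins__` dict restricts `eval` but does not fully sandbox it — dunder tricks like `().__class__` can still reach other objects. If you want stricter isolation, one hedged sketch (not part of the agent above) walks the AST and evaluates only whitelisted node types:

```python
import ast
import math
import operator

# Allowed operators, constants, and functions — everything else is rejected
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}
_NAMES = {"pi": math.pi, "e": math.e}
_FUNCS = {"sqrt": math.sqrt, "log": math.log, "sin": math.sin,
          "cos": math.cos, "tan": math.tan, "abs": abs, "round": round}

def safe_eval(expression: str) -> float:
    """Evaluate arithmetic by walking the AST; reject attributes, subscripts, etc."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in _NAMES:
            return _NAMES[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[walk(a) for a in node.args])
        raise ValueError(f"Disallowed expression element: {type(node).__name__}")
    return walk(ast.parse(expression, mode="eval"))
```

For a tutorial agent the restricted `eval` is fine; for anything user-facing, the AST approach is the safer default.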

def read_webpage(url: str) -> dict:
    """Simulate reading a webpage. Replace with real HTTP calls in production."""
    # In a real agent, you would use httpx or requests here
    return {
        "url": url,
        "title": f"Page at {url}",
        "content": f"This is simulated content from {url}. "
                   f"In production, use an HTTP client to fetch real content.",
        "status": "simulated",
    }
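
In production you would likely reach for httpx or requests; as a dependency-free sketch, the standard library's urllib can stand in (`fetch_webpage_real` is a hypothetical name, not part of the agent above):

```python
import urllib.error
import urllib.request

def fetch_webpage_real(url: str, timeout: float = 10.0) -> dict:
    """Fetch a page with the stdlib; cap the body size and surface errors as data."""
    try:
        req = urllib.request.Request(url, headers={"User-Agent": "agent-tutorial/0.1"})
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = resp.read(100_000).decode("utf-8", errors="replace")
            return {"url": url, "content": body, "status": resp.status}
    except (urllib.error.URLError, TimeoutError, ValueError) as e:
        # Return the failure as data so the LLM can see it and adjust
        return {"url": url, "error": str(e)}
```

Note the error path returns a dict rather than raising — the tool executor in Step 4 will serialize it, and the model can decide what to do next.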

Step 3: Create Tool Definitions for the API

The OpenAI API needs JSON Schema definitions for each tool. This is the contract that tells the LLM what tools exist and how to call them.

TOOL_DEFINITIONS = [
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": (
                "Get the current date and time. Optionally specify a timezone "
                "as a UTC offset (e.g., -5 for EST, +9 for JST)."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone_offset": {
                        "type": "integer",
                        "description": "UTC offset in hours. Default is 0 (UTC).",
                    },
                },
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": (
                "Evaluate a mathematical expression. Supports basic arithmetic "
                "(+, -, *, /, **), functions (sqrt, log, sin, cos, tan), "
                "and constants (pi, e). Example: 'sqrt(144) + pi'"
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The math expression to evaluate.",
                    },
                },
                "required": ["expression"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "read_webpage",
            "description": "Read and return the content of a webpage given its URL.",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {
                        "type": "string",
                        "description": "The full URL to read.",
                    },
                },
                "required": ["url"],
            },
        },
    },
]

# Map tool names to their Python functions
TOOL_MAP = {
    "get_current_time": get_current_time,
    "calculate": calculate,
    "read_webpage": read_webpage,
}
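
Because the schema list and the function map must stay in sync, a small startup check (a hypothetical helper, not required by the API) catches mismatched names before the model ever calls a tool:

```python
def check_tool_registry(definitions: list, tool_map: dict) -> None:
    """Raise if the schema names and the registered functions disagree."""
    defined = {d["function"]["name"] for d in definitions}
    mapped = set(tool_map)
    if defined != mapped:
        raise ValueError(
            f"unmapped definitions: {defined - mapped or '{}'}, "
            f"unregistered functions: {mapped - defined or '{}'}"
        )
```

Call `check_tool_registry(TOOL_DEFINITIONS, TOOL_MAP)` once at import time; a typo in either place then fails loudly instead of surfacing as an "Unknown tool" error mid-conversation.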

Step 4: Build the Tool Executor

This function takes a tool call from the API and executes the corresponding Python function:

def execute_tool_call(tool_call) -> str:
    """Execute a single tool call and return the result as a JSON string."""
    function_name = tool_call.function.name

    # Parse arguments
    try:
        arguments = json.loads(tool_call.function.arguments)
    except json.JSONDecodeError:
        return json.dumps({"error": f"Invalid arguments: {tool_call.function.arguments}"})

    # Look up and execute the function
    func = TOOL_MAP.get(function_name)
    if not func:
        return json.dumps({"error": f"Unknown tool: {function_name}"})

    try:
        result = func(**arguments)
        return json.dumps(result, default=str)
    except TypeError as e:
        return json.dumps({"error": f"Wrong arguments for {function_name}: {str(e)}"})
    except Exception as e:
        return json.dumps({"error": f"{function_name} failed: {str(e)}"})

Step 5: The Agent Loop

This is the core. The agent loop sends messages to the API, checks if the response contains tool calls, executes them, and continues until the model returns a plain text response.

def agent_turn(messages: list, max_iterations: int = 10) -> str:
    """Run the agent loop for a single user turn. Returns the final text response."""

    for iteration in range(max_iterations):
        try:
            response = client.chat.completions.create(
                model=MODEL,
                messages=messages,
                tools=TOOL_DEFINITIONS,
            )
        except Exception as e:
            return f"API error: {str(e)}"

        choice = response.choices[0]
        assistant_message = choice.message

        # Add the assistant's message to history
        messages.append(assistant_message)

        # Check if the model wants to call tools
        if not assistant_message.tool_calls:
            # No tool calls — this is the final response
            return assistant_message.content or "(No response generated)"

        # Execute each tool call
        print(f"  [Step {iteration + 1}] Calling {len(assistant_message.tool_calls)} tool(s)...")

        for tool_call in assistant_message.tool_calls:
            print(f"    → {tool_call.function.name}({tool_call.function.arguments})")

            result = execute_tool_call(tool_call)

            print(f"    ← {result[:100]}{'...' if len(result) > 100 else ''}")

            # Add the tool result to the conversation
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })

    return "I was not able to complete the task within the step limit. Please try a simpler request."

Step 6: Conversation Manager

The conversation manager maintains state across multiple user turns and handles the system prompt:

class Conversation:
    """Manages a multi-turn conversation with an agent."""

    def __init__(self, system_prompt: str | None = None):
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def chat(self, user_input: str) -> str:
        """Send a message and get the agent's response."""
        self.messages.append({"role": "user", "content": user_input})
        response = agent_turn(self.messages)
        return response

    def get_history(self) -> list[dict]:
        """Return conversation history (for debugging)."""
        return [
            {
                "role": m["role"] if isinstance(m, dict) else m.role,
                "content": (m.get("content", "") if isinstance(m, dict)
                           else m.content or "[tool call]"),
            }
            for m in self.messages
        ]

Step 7: Put It All Together

def main():
    """Interactive agent REPL."""
    print("AI Agent (type 'quit' to exit, 'history' to see conversation)")
    print("-" * 50)

    conversation = Conversation(
        system_prompt=(
            "You are a helpful personal assistant. You can check the current time "
            "in any timezone, perform calculations, and read webpages. "
            "Be concise and helpful. When using tools, explain what you found."
        )
    )

    while True:
        user_input = input("\nYou: ").strip()

        if not user_input:
            continue
        if user_input.lower() == "quit":
            print("Goodbye!")
            break
        if user_input.lower() == "history":
            for msg in conversation.get_history():
                print(f"  [{msg['role']}] {msg['content'][:80]}")
            continue

        response = conversation.chat(user_input)
        print(f"\nAgent: {response}")

if __name__ == "__main__":
    main()

Run it:

python agent.py

Try conversations like:

You: What time is it in Tokyo?
Agent: [calls get_current_time with offset +9]

You: What is the square root of that hour number times pi?
Agent: [calls calculate — uses context from previous turn]

You: Read https://example.com and summarize it
Agent: [calls read_webpage, then summarizes]

What You Just Built

This roughly 150-line agent has the fundamental capabilities of a production agent. It has a tool loop that continues until the task is done. It has error handling at every layer — argument parsing, tool execution, API failures. It maintains conversation state across turns. And it is completely transparent — you can see every tool call and result.

Frameworks add value on top of this foundation: streaming, tracing, type safety, multi-agent handoffs, guardrails. But every framework is ultimately running this same loop underneath. Now that you have built it from scratch, you will understand exactly what those frameworks are doing — and you will know how to debug them when they do not behave as expected.

FAQ

How would I add a new tool to this agent?

Three steps: write a Python function that implements the tool, add its JSON Schema definition to TOOL_DEFINITIONS, and register it in TOOL_MAP. That is it. The agent loop and tool executor handle the rest automatically because they are generic — they work with any tool that follows the function calling contract.
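
As an illustration, here is how a hypothetical convert_temperature tool would slot in, following those three steps:

```python
# Step 1: a plain Python function that returns a dict
def convert_temperature(value: float, to_unit: str) -> dict:
    """Convert between Celsius and Fahrenheit."""
    if to_unit == "fahrenheit":
        return {"result": value * 9 / 5 + 32, "unit": "F"}
    if to_unit == "celsius":
        return {"result": (value - 32) * 5 / 9, "unit": "C"}
    return {"error": f"Unknown unit: {to_unit}"}

# Step 2: the JSON Schema entry to append to TOOL_DEFINITIONS
CONVERT_TEMPERATURE_DEF = {
    "type": "function",
    "function": {
        "name": "convert_temperature",
        "description": "Convert a temperature to 'celsius' or 'fahrenheit'.",
        "parameters": {
            "type": "object",
            "properties": {
                "value": {"type": "number", "description": "Temperature to convert."},
                "to_unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["value", "to_unit"],
        },
    },
}

# Step 3: register it
# TOOL_DEFINITIONS.append(CONVERT_TEMPERATURE_DEF)
# TOOL_MAP["convert_temperature"] = convert_temperature
```

Nothing else changes — the loop picks up the new tool on the next API call.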

How do I handle tools that take a long time to execute?

Wrap tool execution in an async pattern with a timeout. Use Python's asyncio.wait_for() with a reasonable timeout (10-30 seconds). If a tool times out, return an error message to the LLM so it can try a different approach. For very long operations (minutes), consider a polling pattern where the tool returns a job ID and a separate check_job_status tool lets the agent poll for completion.
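
A minimal sketch of that timeout wrapper, assuming the tool functions stay synchronous and run in a worker thread:

```python
import asyncio

async def execute_with_timeout(func, arguments: dict, timeout: float = 30.0) -> dict:
    """Run a sync tool in a thread; turn a timeout into error data the LLM can see."""
    try:
        return await asyncio.wait_for(asyncio.to_thread(func, **arguments), timeout)
    except asyncio.TimeoutError:
        # Note: wait_for stops waiting, but the worker thread itself cannot be
        # cancelled — it runs to completion in the background
        return {"error": f"{func.__name__} timed out after {timeout}s; try another approach"}
```

From synchronous code you would invoke it with `asyncio.run(execute_with_timeout(func, args))`; in a fully async agent loop you would await it directly.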

Should I use this raw approach or a framework in production?

Use a framework for production. This from-scratch approach is for learning. Frameworks like the OpenAI Agents SDK add streaming (critical for user experience), tracing (critical for debugging), guardrails (critical for safety), and structured outputs (critical for reliability). The value of building from scratch is that you understand what the framework does, so you can configure it correctly and debug it when needed.


#AIAgent #Python #OpenAIAPI #Tutorial #FromScratch #AgenticAI #LearnAI #AIEngineering


CallSphere Team

Expert insights on AI voice agents and customer communication automation.
