Tool Use in Large Language Models: Architecture and Best Practices
A deep technical guide to implementing tool use (function calling) in LLM applications, covering tool design principles, error handling, parallel execution, security, and advanced patterns for building reliable tool-using AI agents.
What Is Tool Use?
Tool use (also called function calling) is the mechanism by which an LLM can invoke external functions during a conversation. Instead of generating text alone, the model outputs a structured request to call a specific tool with specific arguments. The application executes the tool and returns the result, which the model then uses to continue its response.
This capability transforms LLMs from pure text generators into agents that can interact with the real world: querying databases, calling APIs, reading files, performing calculations, and executing code.
How Tool Use Works Internally
The Conversation Flow
1. User sends a message + tool definitions
2. Model decides whether to use a tool (or respond directly)
3. If tool use: Model outputs a tool_use block with name + arguments
4. Application executes the tool and sends back tool_result
5. Model incorporates the result and continues
6. Steps 2-5 repeat until the model responds directly (end_turn)
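Concretely, the messages exchanged in steps 3 and 4 look roughly like this (field names follow the Anthropic Messages API; the `id` value and weather payload are illustrative):

```python
# Step 3: the model's response contains a tool_use content block.
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_01A09q90qw",  # generated by the API; illustrative here
    "name": "get_weather",
    "input": {"city": "San Francisco, CA", "units": "celsius"},
}

# Step 4: the application executes the tool and replies with a user
# message whose tool_result block echoes the same id, so the model can
# match the result to the call it made.
tool_result_message = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_01A09q90qw",
            "content": '{"temp_c": 18, "conditions": "foggy", "humidity": 81}',
        }
    ],
}
```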
Implementation with Claude
```python
import anthropic

client = anthropic.Anthropic()

# Define tools
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city. "
                       "Returns temperature, conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name (e.g., 'San Francisco, CA')"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units",
                    "default": "fahrenheit"
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "search_products",
        "description": "Search the product catalog by keyword. "
                       "Returns matching products with prices.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "category": {"type": "string", "description": "Product category filter"},
                "max_price": {"type": "number", "description": "Maximum price filter"},
                "limit": {"type": "integer", "default": 5, "maximum": 20}
            },
            "required": ["query"]
        }
    }
]
```
```python
# Tool execution handlers
async def execute_tool(name: str, args: dict) -> str:
    if name == "get_weather":
        return await weather_api.get(args["city"], args.get("units", "fahrenheit"))
    elif name == "search_products":
        return await product_db.search(**args)
    else:
        return f"Unknown tool: {name}"

# The agent loop
async def agent_loop(user_message: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]
    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Check if the model wants to use tools
        if response.stop_reason == "end_turn":
            # Model is done -- extract the text response (empty string if none)
            return next((b.text for b in response.content if b.type == "text"), "")
        # Process tool calls
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
        messages.append({"role": "user", "content": tool_results})
    return "Agent exceeded maximum turns"
```
Tool Design Principles
1. Clear, Specific Descriptions
The tool description is the most important factor in whether the model uses the tool correctly. Be specific about what the tool does, what it returns, and when to use it:
```python
# BAD: Vague description
{
    "name": "search",
    "description": "Search for stuff"
}

# GOOD: Specific description
{
    "name": "search_knowledge_base",
    "description": "Search the company knowledge base for articles, "
                   "FAQs, and documentation. Returns up to 5 matching "
                   "articles with titles, snippets, and URLs. Use this "
                   "when the user asks about company policies, product "
                   "features, or troubleshooting steps. Do NOT use for "
                   "general knowledge questions."
}
```
2. Constrained Input Schemas
Use enums, min/max values, and required fields to constrain what the model can pass:
```python
{
    "name": "create_ticket",
    "description": "Create a support ticket in the ticketing system",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "maxLength": 200,
                "description": "Brief title describing the issue"
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high", "critical"],
                "description": "Ticket priority. Use 'critical' only for "
                               "production outages affecting multiple users."
            },
            "category": {
                "type": "string",
                "enum": ["billing", "technical", "account", "feature_request"]
            },
            "description": {
                "type": "string",
                "minLength": 10,
                "maxLength": 2000
            }
        },
        "required": ["title", "priority", "category", "description"]
    }
}
```
3. Informative Return Values
Return enough context for the model to formulate a useful response:
```python
import json

# BAD: Minimal return
async def get_order_status(order_id: str) -> str:
    order = await db.get_order(order_id)
    return order.status  # Just "shipped"

# GOOD: Rich return
async def get_order_status(order_id: str) -> str:
    order = await db.get_order(order_id)
    return json.dumps({
        "order_id": order.id,
        "status": order.status,
        "status_detail": "Package picked up by carrier",
        "tracking_number": order.tracking_number,
        "estimated_delivery": order.estimated_delivery.isoformat(),
        "carrier": order.carrier,
        "items": [{"name": i.name, "quantity": i.qty} for i in order.items],
    })
```
Error Handling in Tool Use
Tools fail. APIs time out, databases go down, and inputs may be invalid. How you report errors to the model determines whether the agent recovers gracefully or enters a failure loop.
```python
async def execute_tool_safely(name: str, args: dict) -> dict:
    """Execute a tool with comprehensive error handling"""
    # The caller smuggles the tool_use_id in with the arguments; remove it
    # before dispatching so the real tool never sees it.
    tool_use_id = args.pop("_tool_use_id")
    try:
        result = await execute_tool(name, args)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(result),
        }
    except ValidationError as e:
        # Input validation error -- model can fix this
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Input validation error: {e}. Please fix the "
                       f"arguments and try again.",
            "is_error": True,
        }
    except NotFoundError as e:
        # Resource not found -- model should tell the user
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Not found: {e}. The requested resource does not exist.",
            "is_error": True,
        }
    except RateLimitError:
        # Transient error -- model should wait or use an alternative
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": "Rate limit reached. This tool is temporarily unavailable. "
                       "Please try a different approach or inform the user of a brief delay.",
            "is_error": True,
        }
    except Exception as e:
        # Unexpected error -- log the details, return a generic message
        logger.error("tool_execution_failed", tool=name, error=str(e))
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": "An internal error occurred. Please try an alternative "
                       "approach or let the user know you encountered a technical issue.",
            "is_error": True,
        }
```
Parallel Tool Execution
When the model makes multiple tool calls in a single response, execute them in parallel for lower latency:
```python
import asyncio

async def process_tool_calls(response) -> list[dict]:
    """Execute all tool calls in parallel"""
    tool_blocks = [b for b in response.content if b.type == "tool_use"]
    if not tool_blocks:
        return []
    # Execute all tools concurrently
    tasks = [
        execute_tool_safely(block.name, {**block.input, "_tool_use_id": block.id})
        for block in tool_blocks
    ]
    results = await asyncio.gather(*tasks)
    return list(results)
```
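Note that `asyncio.gather` launches everything at once. If the model emits many tool calls, you may want to cap concurrency so you don't overwhelm downstream APIs. A sketch using a semaphore (the limit of 5 is an arbitrary illustration; tune it to your backends):

```python
import asyncio

async def gather_bounded(coros, limit: int = 5):
    """Run coroutines concurrently, but at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(coro):
        async with semaphore:
            return await coro

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(c) for c in coros))
```

Swapping this in for the bare `asyncio.gather` call keeps results in the same order as the tool_use blocks, which matters because each result must be matched back to its `tool_use_id`.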
Tool Use Security
Input Validation
Never trust LLM-generated tool arguments. Validate everything:
```python
from pydantic import BaseModel, field_validator

class DatabaseQueryInput(BaseModel):
    table: str
    filters: dict
    limit: int = 10

    @field_validator("table")
    @classmethod
    def validate_table(cls, v):
        allowed_tables = ["products", "orders", "customers", "faq"]
        if v not in allowed_tables:
            raise ValueError(f"Table '{v}' not allowed. Allowed: {allowed_tables}")
        return v

    @field_validator("limit")
    @classmethod
    def validate_limit(cls, v):
        if v < 1 or v > 100:
            raise ValueError("Limit must be between 1 and 100")
        return v

    @field_validator("filters")
    @classmethod
    def validate_filters(cls, v):
        # Reject obvious SQL injection patterns in filter values
        # (checked against the upper-cased value, so "drop" is caught too)
        for key, value in v.items():
            if isinstance(value, str) and any(
                token in value.upper() for token in [";", "--", "DROP", "DELETE", "UPDATE"]
            ):
                raise ValueError(f"Suspicious characters in filter value: {key}")
        return v
```
Permission Boundaries
Implement tool-level permissions based on the user's role:
```python
class ToolPermissionManager:
    PERMISSIONS = {
        "customer": ["search_products", "get_order_status", "get_faq"],
        "agent": ["search_products", "get_order_status", "get_faq",
                  "create_ticket", "update_ticket"],
        "admin": ["*"],  # All tools
    }

    def get_allowed_tools(self, user_role: str, all_tools: list) -> list:
        allowed = self.PERMISSIONS.get(user_role, [])
        if "*" in allowed:
            return all_tools
        return [t for t in all_tools if t["name"] in allowed]
```
Advanced Patterns
Dynamic Tool Registration
Add or remove tools based on conversation context:
```python
class DynamicToolAgent:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.base_tools = [search_tool, faq_tool]
        self.conditional_tools = {
            "authenticated": [order_tool, account_tool],
            "admin": [admin_tool, report_tool],
        }

    def get_tools_for_context(self, user_context: dict) -> list:
        tools = list(self.base_tools)
        if user_context.get("authenticated"):
            tools.extend(self.conditional_tools["authenticated"])
        if user_context.get("role") == "admin":
            tools.extend(self.conditional_tools["admin"])
        return tools
```
Tool Result Caching
Cache tool results to avoid redundant external calls:
```python
import json
import time

class CachedToolExecutor:
    def __init__(self, cache_ttl: int = 300):
        self.cache = {}
        self.ttl = cache_ttl

    async def execute(self, name: str, args: dict) -> str:
        cache_key = f"{name}:{json.dumps(args, sort_keys=True)}"
        if cache_key in self.cache:
            result, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.ttl:
                return result
        result = await execute_tool(name, args)
        self.cache[cache_key] = (result, time.time())
        return result
```
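One subtlety worth calling out: `json.dumps(args, sort_keys=True)` is what makes the cache key insensitive to argument order, so two calls with the same arguments in a different order share one entry. A quick self-contained check of that property:

```python
import json

def cache_key(name: str, args: dict) -> str:
    """Same key for semantically identical calls, regardless of dict order."""
    return f"{name}:{json.dumps(args, sort_keys=True)}"

k1 = cache_key("search_products", {"query": "mug", "limit": 5})
k2 = cache_key("search_products", {"limit": 5, "query": "mug"})
# k1 == k2, so the second call would be a cache hit.
```

Only cache tools whose results are safe to reuse: weather and catalog searches cache well, while anything with side effects (creating a ticket, placing an order) must never be cached.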
Key Takeaways
Tool use is what transforms LLMs from conversational interfaces into capable agents. The key principles: write detailed tool descriptions that tell the model exactly when and how to use each tool; constrain input schemas to prevent invalid arguments; report errors in ways that help the model recover; run independent tool calls concurrently to reduce latency; validate all inputs as if they were untrusted user data; and enforce permission boundaries that match your security model. These patterns form the foundation for building reliable tool-using AI agents in production.