Tool Use in Large Language Models: Architecture and Best Practices
A deep technical guide to implementing tool use (function calling) in LLM applications, covering tool design principles, error handling, parallel execution, security, and advanced patterns for building reliable tool-using AI agents.
What Is Tool Use?
Tool use (also called function calling) is the mechanism by which an LLM can invoke external functions during a conversation. Instead of generating text alone, the model outputs a structured request to call a specific tool with specific arguments. The application executes the tool and returns the result, which the model then uses to continue its response.
This capability transforms LLMs from pure text generators into agents that can interact with the real world: querying databases, calling APIs, reading files, performing calculations, and executing code.
How Tool Use Works Internally
The Conversation Flow
1. User sends a message + tool definitions
2. Model decides whether to use a tool (or respond directly)
3. If tool use: Model outputs a tool_use block with name + arguments
4. Application executes the tool and sends back tool_result
5. Model incorporates the result and continues
6. Steps 2-5 repeat until the model responds directly (end_turn)
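Concretely, the messages exchanged in steps 3 and 4 look roughly like this (field names follow the Anthropic Messages API; the `id` value and weather payload are illustrative):

```python
# Step 3: the model's response contains a tool_use content block.
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_01A09q90qw",  # generated by the API; illustrative here
    "name": "get_weather",
    "input": {"city": "San Francisco, CA", "units": "celsius"},
}

# Step 4: the application executes the tool and replies with a user
# message whose tool_result block echoes the same id, so the model can
# match the result to the call it made.
tool_result_message = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_01A09q90qw",
            "content": '{"temp_c": 18, "conditions": "foggy", "humidity": 81}',
        }
    ],
}
```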
Implementation with Claude
```python
import anthropic

client = anthropic.Anthropic()

# Define tools
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city. "
                       "Returns temperature, conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name (e.g., 'San Francisco, CA')"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units",
                    "default": "fahrenheit"
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "search_products",
        "description": "Search the product catalog by keyword. "
                       "Returns matching products with prices.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "category": {"type": "string", "description": "Product category filter"},
                "max_price": {"type": "number", "description": "Maximum price filter"},
                "limit": {"type": "integer", "default": 5, "maximum": 20}
            },
            "required": ["query"]
        }
    }
]
```
```python
# Tool execution handlers
async def execute_tool(name: str, args: dict) -> str:
    if name == "get_weather":
        return await weather_api.get(args["city"], args.get("units", "fahrenheit"))
    elif name == "search_products":
        return await product_db.search(**args)
    else:
        return f"Unknown tool: {name}"

# The agent loop
async def agent_loop(user_message: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]
    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Check if the model wants to use tools
        if response.stop_reason == "end_turn":
            # Model is done -- extract the text response (empty string if none)
            return next((b.text for b in response.content if b.type == "text"), "")
        # Process tool calls
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
        messages.append({"role": "user", "content": tool_results})
    return "Agent exceeded maximum turns"
```
Tool Design Principles
1. Clear, Specific Descriptions
The tool description is the most important factor in whether the model uses the tool correctly. Be specific about what the tool does, what it returns, and when to use it:
```python
# BAD: Vague description
{
    "name": "search",
    "description": "Search for stuff"
}

# GOOD: Specific description
{
    "name": "search_knowledge_base",
    "description": "Search the company knowledge base for articles, "
                   "FAQs, and documentation. Returns up to 5 matching "
                   "articles with titles, snippets, and URLs. Use this "
                   "when the user asks about company policies, product "
                   "features, or troubleshooting steps. Do NOT use for "
                   "general knowledge questions."
}
```
2. Constrained Input Schemas
Use enums, min/max values, and required fields to constrain what the model can pass:
```python
{
    "name": "create_ticket",
    "description": "Create a support ticket in the ticketing system",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "maxLength": 200,
                "description": "Brief title describing the issue"
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high", "critical"],
                "description": "Ticket priority. Use 'critical' only for "
                               "production outages affecting multiple users."
            },
            "category": {
                "type": "string",
                "enum": ["billing", "technical", "account", "feature_request"]
            },
            "description": {
                "type": "string",
                "minLength": 10,
                "maxLength": 2000
            }
        },
        "required": ["title", "priority", "category", "description"]
    }
}
```
3. Informative Return Values
Return enough context for the model to formulate a useful response:
```python
import json

# BAD: Minimal return
async def get_order_status(order_id: str) -> str:
    order = await db.get_order(order_id)
    return order.status  # Just "shipped"

# GOOD: Rich return
async def get_order_status(order_id: str) -> str:
    order = await db.get_order(order_id)
    return json.dumps({
        "order_id": order.id,
        "status": order.status,
        "status_detail": "Package picked up by carrier",
        "tracking_number": order.tracking_number,
        "estimated_delivery": order.estimated_delivery.isoformat(),
        "carrier": order.carrier,
        "items": [{"name": i.name, "quantity": i.qty} for i in order.items],
    })
```
Error Handling in Tool Use
Tools fail. APIs time out, databases go down, and inputs may be invalid. How you report errors to the model determines whether the agent recovers gracefully or enters a failure loop.
```python
async def execute_tool_safely(name: str, args: dict) -> dict:
    """Execute a tool with comprehensive error handling"""
    # The caller smuggles the tool_use_id in with the arguments; remove it
    # before dispatching so the real tool never sees it.
    tool_use_id = args.pop("_tool_use_id")
    try:
        result = await execute_tool(name, args)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(result),
        }
    except ValidationError as e:
        # Input validation error -- model can fix this
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Input validation error: {e}. Please fix the "
                       f"arguments and try again.",
            "is_error": True,
        }
    except NotFoundError as e:
        # Resource not found -- model should tell the user
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Not found: {e}. The requested resource does not exist.",
            "is_error": True,
        }
    except RateLimitError:
        # Transient error -- model should wait or use an alternative
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": "Rate limit reached. This tool is temporarily unavailable. "
                       "Please try a different approach or inform the user of a brief delay.",
            "is_error": True,
        }
    except Exception as e:
        # Unexpected error -- log the details, return a generic message
        logger.error("tool_execution_failed", tool=name, error=str(e))
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": "An internal error occurred. Please try an alternative "
                       "approach or let the user know you encountered a technical issue.",
            "is_error": True,
        }
```
Parallel Tool Execution
When the model makes multiple tool calls in a single response, execute them in parallel for lower latency:
```python
import asyncio

async def process_tool_calls(response) -> list[dict]:
    """Execute all tool calls in parallel"""
    tool_blocks = [b for b in response.content if b.type == "tool_use"]
    if not tool_blocks:
        return []
    # Execute all tools concurrently
    tasks = [
        execute_tool_safely(block.name, {**block.input, "_tool_use_id": block.id})
        for block in tool_blocks
    ]
    results = await asyncio.gather(*tasks)
    return list(results)
```
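Note that `asyncio.gather` launches everything at once. If the model emits many tool calls, you may want to cap concurrency so you don't overwhelm downstream APIs. A sketch using a semaphore (the limit of 5 is an arbitrary illustration; tune it to your backends):

```python
import asyncio

async def gather_bounded(coros, limit: int = 5):
    """Run coroutines concurrently, but at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(coro):
        async with semaphore:
            return await coro

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(c) for c in coros))
```

Swapping this in for the bare `asyncio.gather` call keeps results in the same order as the tool_use blocks, which matters because each result must be matched back to its `tool_use_id`.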
Tool Use Security
Input Validation
Never trust LLM-generated tool arguments. Validate everything:
```python
from pydantic import BaseModel, field_validator

class DatabaseQueryInput(BaseModel):
    table: str
    filters: dict
    limit: int = 10

    @field_validator("table")
    @classmethod
    def validate_table(cls, v):
        allowed_tables = ["products", "orders", "customers", "faq"]
        if v not in allowed_tables:
            raise ValueError(f"Table '{v}' not allowed. Allowed: {allowed_tables}")
        return v

    @field_validator("limit")
    @classmethod
    def validate_limit(cls, v):
        if v < 1 or v > 100:
            raise ValueError("Limit must be between 1 and 100")
        return v

    @field_validator("filters")
    @classmethod
    def validate_filters(cls, v):
        # Reject obvious SQL injection patterns in filter values
        # (checked against the upper-cased value, so "drop" is caught too)
        for key, value in v.items():
            if isinstance(value, str) and any(
                token in value.upper() for token in [";", "--", "DROP", "DELETE", "UPDATE"]
            ):
                raise ValueError(f"Suspicious characters in filter value: {key}")
        return v
```
Permission Boundaries
Implement tool-level permissions based on the user's role:
```python
class ToolPermissionManager:
    PERMISSIONS = {
        "customer": ["search_products", "get_order_status", "get_faq"],
        "agent": ["search_products", "get_order_status", "get_faq",
                  "create_ticket", "update_ticket"],
        "admin": ["*"],  # All tools
    }

    def get_allowed_tools(self, user_role: str, all_tools: list) -> list:
        allowed = self.PERMISSIONS.get(user_role, [])
        if "*" in allowed:
            return all_tools
        return [t for t in all_tools if t["name"] in allowed]
```
Advanced Patterns
Dynamic Tool Registration
Add or remove tools based on conversation context:
```python
class DynamicToolAgent:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.base_tools = [search_tool, faq_tool]
        self.conditional_tools = {
            "authenticated": [order_tool, account_tool],
            "admin": [admin_tool, report_tool],
        }

    def get_tools_for_context(self, user_context: dict) -> list:
        tools = list(self.base_tools)
        if user_context.get("authenticated"):
            tools.extend(self.conditional_tools["authenticated"])
        if user_context.get("role") == "admin":
            tools.extend(self.conditional_tools["admin"])
        return tools
```
Tool Result Caching
Cache tool results to avoid redundant external calls:
```python
import json
import time

class CachedToolExecutor:
    def __init__(self, cache_ttl: int = 300):
        self.cache = {}
        self.ttl = cache_ttl

    async def execute(self, name: str, args: dict) -> str:
        cache_key = f"{name}:{json.dumps(args, sort_keys=True)}"
        if cache_key in self.cache:
            result, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.ttl:
                return result
        result = await execute_tool(name, args)
        self.cache[cache_key] = (result, time.time())
        return result
```
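One subtlety worth calling out: `json.dumps(args, sort_keys=True)` is what makes the cache key insensitive to argument order, so two calls with the same arguments in a different order share one entry. A quick self-contained check of that property:

```python
import json

def cache_key(name: str, args: dict) -> str:
    """Same key for semantically identical calls, regardless of dict order."""
    return f"{name}:{json.dumps(args, sort_keys=True)}"

k1 = cache_key("search_products", {"query": "mug", "limit": 5})
k2 = cache_key("search_products", {"limit": 5, "query": "mug"})
# k1 == k2, so the second call would be a cache hit.
```

Only cache tools whose results are safe to reuse: weather and catalog searches cache well, while anything with side effects (creating a ticket, placing an order) must never be cached.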
Key Takeaways
Tool use is what transforms LLMs from conversational interfaces into capable agents. The key principles: write detailed tool descriptions that tell the model exactly when and how to use each tool; constrain input schemas to prevent invalid arguments; report errors in ways that help the model recover; run independent tool calls concurrently to reduce latency; validate all inputs as if they were untrusted user data; and enforce permission boundaries that match your security model. These patterns form the foundation for building reliable tool-using AI agents in production.