File System Tools: Building AI Agents That Read, Write, and Manage Files
Build secure file system tools that let AI agents read, write, list, and manage files within a sandboxed directory. Covers path validation, sandboxing strategies, permissions, and safety considerations for production deployment.
When Agents Need File Access
Code generation agents need to write files. Data analysis agents need to read CSVs. Report generation agents need to save output. File system tools are essential for agents that produce artifacts or work with local data. But write access to your filesystem is one of the most dangerous capabilities you can hand an LLM. The entire design must center on sandboxing and validation.
The Sandbox: Containment First
Every file operation must be confined to a sandbox directory. The agent should have no way to escape it:
from pathlib import Path


class FileSystemSandbox:
    def __init__(self, sandbox_dir: str):
        self.sandbox = Path(sandbox_dir).resolve()
        self.sandbox.mkdir(parents=True, exist_ok=True)

    def resolve_path(self, relative_path: str) -> Path:
        """Resolve a path and ensure it is inside the sandbox."""
        # Normalize the path and resolve any symlinks
        target = (self.sandbox / relative_path).resolve()
        # Critical security check: ensure the resolved path is within the sandbox
        if not target.is_relative_to(self.sandbox):
            raise PermissionError(
                f"Path traversal detected: {relative_path} resolves outside sandbox"
            )
        return target
The resolve() call is critical. It collapses path traversal sequences like ../../etc/passwd and follows symlinks to their actual filesystem location. The is_relative_to() check (Python 3.9+) then confirms the resolved path is still inside the sandbox. Avoid a plain string-prefix comparison such as str(target).startswith(str(self.sandbox)) here: it compares characters rather than path components, so it would wrongly accept a sibling directory like /tmp/sandbox-evil when the sandbox is /tmp/sandbox. Without both steps, an LLM could trivially escape the sandbox using relative paths.
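To make the sibling-directory pitfall concrete, here is a small self-contained sketch (the paths are hypothetical) comparing a naive string-prefix test against pathlib's component-wise check:

```python
from pathlib import Path

sandbox = Path("/tmp/sandbox")

def naive_check(target: Path) -> bool:
    # Unsafe: compares the string forms character by character
    return str(target).startswith(str(sandbox))

def strict_check(target: Path) -> bool:
    # Safe: compares whole path components (Python 3.9+)
    return target.is_relative_to(sandbox)

inside = Path("/tmp/sandbox/notes.txt")
sibling = Path("/tmp/sandbox-evil/payload.txt")  # outside the sandbox!

print(naive_check(sibling))   # True  -- the prefix test is fooled
print(strict_check(sibling))  # False -- the component test is not
```

Both checks agree on paths genuinely inside the sandbox; they only diverge on the adversarial sibling case, which is exactly the case that matters.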
Tool 1: Read File
class FileTools:
    def __init__(self, sandbox_dir: str, max_file_size: int = 1_000_000):
        self.sandbox = FileSystemSandbox(sandbox_dir)
        self.max_file_size = max_file_size

    def read_file(self, path: str) -> str:
        try:
            target = self.sandbox.resolve_path(path)
            if not target.exists():
                return f"Error: File not found: {path}"
            if not target.is_file():
                return f"Error: {path} is not a file"
            file_size = target.stat().st_size
            if file_size > self.max_file_size:
                return f"Error: File is {file_size} bytes, exceeds limit of {self.max_file_size}"
            content = target.read_text(encoding="utf-8")
            if len(content) > 10000:
                content = content[:10000] + f"\n\n[Truncated: showing first 10000 of {len(content)} characters]"
            return content
        except PermissionError as e:
            return f"Error: {str(e)}"
        except UnicodeDecodeError:
            return f"Error: {path} is not a text file"
The size check prevents the agent from reading massive files that would blow up the context window. The truncation provides a reasonable preview while indicating there is more content.
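The truncation step can be factored into a small standalone helper, which also makes it easy to unit test (the function name and default limit here are illustrative):

```python
def truncate_for_context(content: str, limit: int = 10000) -> str:
    """Return content unchanged if short, else a preview plus a marker."""
    if len(content) <= limit:
        return content
    # The marker tells the model there is more, so it can ask differently
    return content[:limit] + f"\n\n[Truncated: showing first {limit} of {len(content)} characters]"

preview = truncate_for_context("x" * 25000)
print(len(preview) < 25000)  # True: only a bounded preview reaches the model
```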
Tool 2: Write File
    def write_file(self, path: str, content: str) -> str:
        BLOCKED_EXTENSIONS = {".exe", ".sh", ".bat", ".cmd", ".ps1", ".dll", ".so"}
        try:
            target = self.sandbox.resolve_path(path)
            if target.suffix.lower() in BLOCKED_EXTENSIONS:
                return f"Error: Cannot write executable files ({target.suffix})"
            target.parent.mkdir(parents=True, exist_ok=True)
            # Re-check after mkdir: the parent must still resolve inside the sandbox
            if not target.parent.resolve().is_relative_to(self.sandbox.sandbox):
                return "Error: Cannot create directories outside sandbox"
            target.write_text(content, encoding="utf-8")
            return f"Successfully wrote {len(content)} characters to {path}"
        except PermissionError as e:
            return f"Error: {str(e)}"
Blocking executable extensions prevents the agent from creating scripts that could be accidentally run. Re-resolving the parent directory after mkdir closes a time-of-check/time-of-use gap: a symlink swapped in between the initial sandbox check and the write could otherwise redirect the file outside the sandbox.
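One subtlety worth knowing: Path.suffix only looks at the final extension, so a blocklist check like the one in write_file behaves as follows (the filenames are illustrative):

```python
from pathlib import Path

BLOCKED = {".exe", ".sh", ".bat", ".cmd", ".ps1", ".dll", ".so"}

def is_blocked(name: str) -> bool:
    # Mirrors the write_file check: lowercase the last suffix, then test membership
    return Path(name).suffix.lower() in BLOCKED

print(is_blocked("deploy.sh"))        # True
print(is_blocked("archive.tar.EXE"))  # True: ".EXE" is lowercased before the test
print(is_blocked("notes.sh.txt"))     # False: only the last suffix counts
```

Since only the last suffix is inspected, a file like notes.sh.txt passes, which is usually acceptable because the OS would not treat it as a script either. If you want stricter behavior, Path.suffixes returns every extension.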
Tool 3: List Directory
    def list_directory(self, path: str = ".") -> str:
        import json
        try:
            target = self.sandbox.resolve_path(path)
            if not target.is_dir():
                return f"Error: {path} is not a directory"
            entries = []
            for item in sorted(target.iterdir()):
                relative = item.relative_to(self.sandbox.sandbox)
                entry = {
                    "name": item.name,
                    "path": str(relative),
                    "type": "directory" if item.is_dir() else "file",
                }
                if item.is_file():
                    entry["size"] = item.stat().st_size
                entries.append(entry)
            if not entries:
                return "Directory is empty."
            return json.dumps(entries, indent=2)
        except PermissionError as e:
            return f"Error: {str(e)}"
Tool Schemas
file_tools_schemas = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a text file. Returns the file content as a string. Files larger than 10000 characters are truncated.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Relative path to the file within the workspace"
                    }
                },
                "required": ["path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "write_file",
            "description": "Write content to a file. Creates the file if it does not exist, overwrites if it does. Parent directories are created automatically.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Relative path for the file within the workspace"
                    },
                    "content": {
                        "type": "string",
                        "description": "The full content to write to the file"
                    }
                },
                "required": ["path", "content"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "list_directory",
            "description": "List files and subdirectories in a directory. Returns JSON with names, types, and sizes.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Relative path to the directory. Use '.' for the workspace root.",
                        "default": "."
                    }
                },
                "required": []
            }
        }
    }
]
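The schemas still need to be wired to the Python methods at runtime. A minimal dispatcher might look like the sketch below (the tool-call shape assumes an OpenAI-style name plus JSON-encoded arguments; StubTools is a placeholder standing in for a real FileTools instance):

```python
import json

def dispatch_tool_call(tools, name: str, arguments_json: str) -> str:
    """Route a model's tool call to the matching method and return its string result."""
    handlers = {
        "read_file": tools.read_file,
        "write_file": tools.write_file,
        "list_directory": tools.list_directory,
    }
    if name not in handlers:
        return f"Error: Unknown tool: {name}"
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return "Error: Tool arguments were not valid JSON"
    return handlers[name](**args)

# A tiny stand-in for FileTools, just to show the call path:
class StubTools:
    def read_file(self, path): return f"read {path}"
    def write_file(self, path, content): return f"Successfully wrote {len(content)} characters to {path}"
    def list_directory(self, path="."): return "[]"

result = dispatch_tool_call(StubTools(), "write_file", '{"path": "out.txt", "content": "hi"}')
print(result)  # Successfully wrote 2 characters to out.txt
```

Note that every branch, including the error branches, returns a string: the model handles "Error: Unknown tool" far more gracefully than an unhandled exception that kills the agent loop.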
Additional Safety: Quotas and Audit Logging
In production, add write quotas and logging:
import time


class AuditedFileTools(FileTools):
    def __init__(self, sandbox_dir: str, max_writes: int = 50):
        super().__init__(sandbox_dir)
        self.write_count = 0
        self.max_writes = max_writes
        self.audit_log = []

    def write_file(self, path: str, content: str) -> str:
        if self.write_count >= self.max_writes:
            return f"Error: Write limit reached ({self.max_writes} writes per session)"
        result = super().write_file(path, content)
        if result.startswith("Successfully"):
            self.write_count += 1
            self.audit_log.append({
                "action": "write",
                "path": path,
                "size": len(content),
                "timestamp": time.time(),
            })
        return result
This prevents runaway agents from creating thousands of files and gives you a record of everything the agent wrote.
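In a long-running service you will usually want the audit trail persisted rather than held in memory. One lightweight option is appending JSON lines to a log file, which is easy to tail and grep (a sketch; the file name and record shape are illustrative):

```python
import json
import time
import tempfile
from pathlib import Path

def append_audit_record(log_path: Path, action: str, path: str, size: int) -> None:
    """Append one audit record as a single JSON line."""
    record = {"action": action, "path": path, "size": size, "timestamp": time.time()}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log = Path(tempfile.mkdtemp()) / "audit.jsonl"
append_audit_record(log, "write", "report.md", 512)
append_audit_record(log, "write", "data/out.csv", 2048)
print(len(log.read_text().splitlines()))  # 2
```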
FAQ
Should I use Docker containers instead of path sandboxing?
For production deployments, yes. Docker containers provide OS-level isolation that path validation alone cannot match. Use path sandboxing as an inner defense layer and Docker as the outer layer. For development and prototyping, path sandboxing is sufficient.
How do I handle binary files like images or PDFs?
Add separate tools with different read/write implementations. For images, return metadata (dimensions, format) rather than content. For PDFs, extract text using a library like PyMuPDF and return the text. Never try to pass raw binary content back to the LLM as a tool result.
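A quick way to route a file to the right handler is a null-byte sniff on the first block: binary formats almost always contain \x00 bytes early, while text files rarely do. A stdlib-only sketch of the idea (the function name and metadata fields are illustrative):

```python
import json
import tempfile
from pathlib import Path

def describe_file(path: Path) -> str:
    """Return text content for text files, JSON metadata for binary ones."""
    head = path.read_bytes()[:8192]
    if b"\x00" in head:  # heuristic: early null bytes usually mean binary
        meta = {"name": path.name, "size": path.stat().st_size, "kind": "binary"}
        return json.dumps(meta)
    return path.read_text(encoding="utf-8", errors="replace")

workdir = Path(tempfile.mkdtemp())
(workdir / "notes.txt").write_text("hello")
(workdir / "blob.bin").write_bytes(b"\x89PNG\x00\x00data")

print(describe_file(workdir / "notes.txt"))              # hello
print("binary" in describe_file(workdir / "blob.bin"))   # True
```

The heuristic is not perfect (UTF-16 text contains null bytes, for example), but it is a cheap first gate before reaching for format-specific libraries.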
Can I let the agent delete files?
You can, but add a soft-delete mechanism instead of permanent deletion. Move deleted files to a trash directory within the sandbox. This lets you recover from agent mistakes and provides an audit trail of what was removed.
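A soft delete can be as simple as a rename into a .trash directory inside the sandbox, with a timestamp prefix to avoid name collisions (a sketch; the directory name is a convention, not a requirement):

```python
import time
import tempfile
from pathlib import Path

def soft_delete(sandbox: Path, relative_path: str) -> str:
    """Move a file into sandbox/.trash instead of deleting it permanently."""
    target = (sandbox / relative_path).resolve()
    if not target.is_relative_to(sandbox) or not target.is_file():
        return f"Error: Cannot delete {relative_path}"
    trash = sandbox / ".trash"
    trash.mkdir(exist_ok=True)
    # Prefix with a timestamp so repeated deletes of the same name do not collide
    destination = trash / f"{int(time.time() * 1000)}_{target.name}"
    target.rename(destination)
    return f"Moved {relative_path} to {destination.relative_to(sandbox)}"

sandbox = Path(tempfile.mkdtemp()).resolve()
(sandbox / "draft.txt").write_text("old version")
print(soft_delete(sandbox, "draft.txt"))  # Moved draft.txt to .trash/<timestamp>_draft.txt
```

The same containment check used for reads and writes applies here, so the agent cannot "delete" its way outside the sandbox either.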
#FileSystem #Security #ToolDesign #AIAgents #Python #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.