Security and Sandboxing for Claude Computer Use Agents: Safe Browser Automation

The Security Challenge

Claude Computer Use agents can see your screen and control your keyboard and mouse. This is powerful, and dangerous. An agent with unrestricted computer access could accidentally navigate to a sensitive page, type credentials into the wrong field, execute destructive commands, or click a "Delete All" button when it meant to click "Download All."

Anthropic explicitly recommends running computer use in sandboxed environments with limited privileges. This article covers the practical architecture for doing that safely, from VM isolation to audit logging.

Principle of Least Privilege

The foundational security principle for computer use agents is to give them access to the minimum set of capabilities needed for their task. This applies at every level:

Network: Only allow access to the specific domains the agent needs
Actions: Restrict which mouse/keyboard actions are permitted
Screen area: Limit which parts of the screen the agent can see and interact with
Time: Set maximum execution time to prevent runaway sessions
Data: Never expose credentials or sensitive data on screen
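
As one illustration of the time limit, the sketch below tracks a per-session budget of wall-clock seconds and steps. The names `SessionBudget`, `BudgetExceeded`, `max_seconds`, and `max_steps` are hypothetical and not part of any Anthropic API; the agent loop is assumed to call `check()` once before each step.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when a session runs past its time or step budget."""


@dataclass
class SessionBudget:
    max_seconds: float = 300.0
    max_steps: int = 30
    # The clock is injectable so the budget can be tested without sleeping.
    clock: Callable[[], float] = time.monotonic
    steps: int = 0
    started_at: float = field(init=False)

    def __post_init__(self) -> None:
        # Record the session start time when the budget is created.
        self.started_at = self.clock()

    def check(self) -> None:
        """Count one step and raise if either limit is exceeded."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} reached")
        elapsed = self.clock() - self.started_at
        if elapsed > self.max_seconds:
            raise BudgetExceeded(f"time limit of {self.max_seconds}s reached")
```

A monotonic clock is used rather than wall-clock time so that system clock adjustments cannot extend or shorten a session.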

VM Isolation

Run computer use agents in isolated virtual machines or containers. Anthropic provides a reference Docker container, and you should build on this pattern:

# Launch a sandboxed computer use container via `docker run`.
# The image name and the TASK/MAX_STEPS/TIMEOUT_SECONDS variables are placeholders;
# substitute the image and entrypoint contract you actually build.

import subprocess

def launch_sandbox(task: str, allowed_urls: list[str]) -> str:
    """Launch a sandboxed container for a computer use task."""
    # Build network restriction rules. They are not applied here: with
    # "--network none" the container has no network at all. To permit
    # allowed_urls, attach a restricted network and apply these rules to it.
    iptables_rules = generate_network_rules(allowed_urls)

    result = subprocess.run(
        ["docker", "run", "--rm",
         "--network", "none",          # No network by default
         "--read-only",                # Immutable root filesystem
         "--tmpfs", "/tmp:size=100M",  # Small writable scratch space
         "--memory", "2g",
         "--cpus", "1.0",
         "--security-opt", "no-new-privileges",
         "-e", "ANTHROPIC_API_KEY",    # Passed through from the host environment
         "-e", f"TASK={task}",
         "-e", "MAX_STEPS=30",
         "-e", "TIMEOUT_SECONDS=300",
         "anthropic/computer-use:latest"],
        capture_output=True,
        text=True,
        timeout=600,  # Hard stop, above the in-container timeout
    )
    return result.stdout

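import re
from urllib.parse import urlparse

def generate_network_rules(allowed_urls: list[str]) -> str:
    """Generate iptables rules to restrict outbound network access."""
    rules = [
        "iptables -P OUTPUT DROP",  # Default deny outbound
        "iptables -A OUTPUT -d 127.0.0.0/8 -j ACCEPT",  # Allow localhost
    ]
    for url in allowed_urls:
        domain = urlparse(url).hostname
        # Reject anything that is not a plain hostname before it reaches a shell
        if not domain or not re.fullmatch(r"[a-z0-9.-]+", domain):
            raise ValueError(f"Cannot derive a safe hostname from {url!r}")
        # iptables resolves a hostname once, when the rule is added, and creates
        # one rule per returned address. Re-apply the rules if the IPs change.
        rules.append(f"iptables -A OUTPUT -d {domain} -j ACCEPT")

    # Always allow Anthropic API
    rules.append("iptables -A OUTPUT -d api.anthropic.com -j ACCEPT")
    # Note: DNS lookups from inside the sandbox also need an allowed resolver.
    return "\n".join(rules)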

Action Allowlists

Not every task needs every action type. An agent that reads data from a dashboard should not need to type or press keys. Implement an action filter that sits between Claude's response and the execution layer:

from dataclasses import dataclass, field

@dataclass
class ActionPolicy:
    allow_click: bool = True
    allow_type: bool = False
    allow_key: bool = False
    allow_scroll: bool = True
    allowed_key_combos: list[str] = field(default_factory=list)
    blocked_regions: list[dict] = field(default_factory=list)
    max_actions_per_minute: int = 30

class ActionFilter:
    def __init__(self, policy: ActionPolicy):
        self.policy = policy
        self.action_timestamps: list[float] = []

    def validate(self, action: dict) -> tuple[bool, str]:
        """Validate an action against the security policy."""
        import time

        action_type = action.get("action", action.get("type"))

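        # Check action type is allowed. Anthropic's computer tool reports clicks
        # as left_click, right_click, middle_click, and double_click, so those
        # variants map to the click permission as well.
        type_checks = {
            "click": self.policy.allow_click,
            "left_click": self.policy.allow_click,
            "right_click": self.policy.allow_click,
            "middle_click": self.policy.allow_click,
            "double_click": self.policy.allow_click,
            "type": self.policy.allow_type,
            "key": self.policy.allow_key,
            "scroll": self.policy.allow_scroll,
        }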

        if not type_checks.get(action_type, False):
            return False, f"Action type '{action_type}' is not permitted by policy"

        # Check key combinations against the allowlist
        # (an empty allowlist permits any key when allow_key is True)
        if action_type == "key" and self.policy.allowed_key_combos:
            if action.get("text") not in self.policy.allowed_key_combos:
                return False, f"Key combo '{action.get('text')}' is not in the allowlist"

        # Check blocked regions
        if "coordinate" in action:
            x, y = action["coordinate"]
            for region in self.policy.blocked_regions:
                if (region["x1"] <= x <= region["x2"] and
                    region["y1"] <= y <= region["y2"]):
                    return False, f"Coordinate ({x}, {y}) is in a blocked region"

        # Rate limiting
        now = time.time()
        self.action_timestamps = [
            t for t in self.action_timestamps if now - t < 60
        ]
        if len(self.action_timestamps) >= self.policy.max_actions_per_minute:
            return False, "Rate limit exceeded"

        self.action_timestamps.append(now)
        return True, "Action permitted"

# Usage: read-only agent that can only click and scroll
readonly_policy = ActionPolicy(
    allow_click=True,
    allow_type=False,
    allow_key=False,
    allow_scroll=True,
    max_actions_per_minute=20,
)

action_filter = ActionFilter(readonly_policy)

Credential Handling

Never let credentials appear on screen where Claude can see them. Use these patterns instead:

class SecureCredentialHandler:
    """Handle credentials without exposing them to the vision model."""

    def __init__(self, browser):
        self.browser = browser

    async def fill_credentials(self, username: str, password: str):
        """Fill login fields using Playwright direct DOM access, not vision."""
        # Playwright's fill writes to the DOM directly, so the values never
        # pass through Claude's messages or tool calls
        await self.browser.page.fill(
            "input[type='text'], input[type='email'], input[name='username']",
            username,
        )
        await self.browser.page.fill(
            "input[type='password']",
            password,
        )
        await self.browser.page.click(
            "button[type='submit'], input[type='submit']"
        )

    async def inject_auth_cookie(self, cookie: dict):
        """Set authentication cookies directly without logging in visually."""
        await self.browser.page.context.add_cookies([cookie])

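    async def inject_auth_header(self, token: str):
        """Set an Authorization header on every request the page makes.

        This includes requests to third-party hosts, so pair it with the
        network allowlist.
        """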
        await self.browser.page.set_extra_http_headers({
            "Authorization": f"Bearer {token}"
        })

The key principle is to handle authentication through the browser's programmatic API (Playwright), never through the vision-based computer use loop. Claude should never see a password field with a password in it.

Audit Logging

Every action the agent takes should be logged with full context for security review:

import json
from datetime import datetime, timezone

class AuditLogger:
    def __init__(self, log_path: str, task_id: str):
        self.log_path = log_path
        self.task_id = task_id
        self.step_count = 0

    def log_action(self, action: dict, screenshot_path: str,
                   claude_response: dict, permitted: bool, reason: str = ""):
        """Log every action for security audit."""
        self.step_count += 1

        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "task_id": self.task_id,
            "step": self.step_count,
            "action": action,
            "screenshot_saved": screenshot_path,
            "model_response_id": claude_response.get("id"),
            "tokens_used": {
                "input": claude_response.get("usage", {}).get("input_tokens"),
                "output": claude_response.get("usage", {}).get("output_tokens"),
            },
            "permitted": permitted,
            "denial_reason": reason if not permitted else None,
        }

        with open(self.log_path, "a") as f:
            f.write(json.dumps(entry) + "\n")

    def save_screenshot(self, screenshot_b64: str, step: int) -> str:
        """Save screenshot to disk for audit trail."""
        import base64
        import os
        screenshot_dir = "/var/log/computer-use/screenshots"
        os.makedirs(screenshot_dir, exist_ok=True)  # Create the directory on first use
        path = os.path.join(screenshot_dir, f"{self.task_id}_step_{step}.png")
        with open(path, "wb") as f:
            f.write(base64.standard_b64decode(screenshot_b64))
        return path

Putting It All Together

Here is the secure agent loop that combines all safety layers:

import anthropic

class SecureBrowserAgent:
    def __init__(self, browser, policy: ActionPolicy, task_id: str):
        self.browser = browser
        # Async client, so API calls do not block the event loop
        self.client = anthropic.AsyncAnthropic()
        self.filter = ActionFilter(policy)
        self.logger = AuditLogger("/var/log/computer-use/audit.jsonl", task_id)
        self.credentials = SecureCredentialHandler(browser)

    async def run(self, task: str, max_steps: int = 30):
        messages = [{"role": "user", "content": task}]

        for step in range(max_steps):
            screenshot_b64 = await self.browser.screenshot()
            screenshot_path = self.logger.save_screenshot(screenshot_b64, step)

            # Send the current screen to Claude
            messages.append({"role": "user", "content": [{
                "type": "image",
                "source": {"type": "base64", "media_type": "image/png", "data": screenshot_b64},
            }]})

            response = await self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                tools=[self.browser.get_tool_config()],
                messages=messages,
            )

            if response.stop_reason == "end_turn":
                return "Task complete"

            # Record the assistant turn once, then answer every tool_use block
            # in a single user turn.
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []

            for block in response.content:
                if block.type != "tool_use":
                    continue

                # Security gate: validate before executing
                permitted, reason = self.filter.validate(block.input)
                self.logger.log_action(
                    block.input, screenshot_path,
                    {"id": response.id, "usage": response.usage.model_dump()},
                    permitted, reason,
                )

                if not permitted:
                    # Deny the action and inform Claude
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"ACTION DENIED: {reason}. Choose a different approach.",
                        "is_error": True,
                    })
                    continue

                await self._execute(block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": "Action executed",
                })

            if tool_results:
                messages.append({"role": "user", "content": tool_results})

        return "Stopped: maximum number of steps reached"

    async def _execute(self, action: dict):
        """Dispatch a validated action to the browser (implementation-specific)."""
        raise NotImplementedError

FAQ

What if Claude tries to access a URL not in the allowlist?

With network-level restrictions (iptables or Docker network policies), the request will simply fail. The browser will show a connection error, and Claude will see this in the next screenshot. Combine network restrictions with the action filter to also deny clicks on links that would navigate to unauthorized domains.
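
The click-level check described here could start from a hostname allowlist. This is a minimal sketch: `is_url_allowed` is a hypothetical helper, and working out which URL a click would navigate to (for example by reading the link's `href` through the browser's DOM API) is assumed to happen elsewhere.

```python
from urllib.parse import urlparse


def is_url_allowed(url: str, allowed_domains: set[str]) -> bool:
    """Return True if url is http(s) and its host is an allowed domain
    or a subdomain of one."""
    parsed = urlparse(url)
    # Reject javascript:, data:, file:, and other non-web schemes outright.
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower().rstrip(".")
    if not host:
        return False
    # Match on a full label boundary so "evilexample.com" does not pass
    # as "example.com".
    return any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )
```

Accepting subdomains is a design choice in this sketch; a stricter policy would compare hosts for exact equality only.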

How do I handle sensitive data that appears on screen during automation?

Implement screenshot redaction for known sensitive areas. Before sending screenshots to Claude, use image processing to black out regions where sensitive data might appear (e.g., the account balance section on a banking page). Alternatively, use CSS injection via Playwright to hide sensitive elements before taking the screenshot.
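
A rough sketch of the CSS injection approach follows. The helper names `build_redaction_css` and `hide_sensitive_elements` are hypothetical, the selectors are examples you would replace with your own, and `page` is assumed to be a Playwright `Page`, whose `add_style_tag` method accepts raw CSS through its `content` argument.

```python
def build_redaction_css(selectors: list[str]) -> str:
    """Build a stylesheet that hides the given elements.

    visibility: hidden keeps each element's box in the layout, so the
    coordinates of everything else on the page do not shift.
    """
    return "\n".join(
        f"{selector} {{ visibility: hidden !important; }}"
        for selector in selectors
    )


async def hide_sensitive_elements(page, selectors: list[str]) -> None:
    """Inject the redaction stylesheet into the current document."""
    await page.add_style_tag(content=build_redaction_css(selectors))
```

The style tag belongs to the current document, so under this approach it would need to be injected again after each navigation and before the next screenshot is taken.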

Should I use human-in-the-loop approval for all actions?

For high-risk tasks (financial transactions, data deletion, account modifications), yes. Implement a confirmation step where the agent pauses and presents its intended action to a human operator for approval before executing. For low-risk tasks (reading data, navigating pages), automated policies are sufficient.
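
One way to sketch that confirmation step is a small gate with a pluggable approver. `ApprovalGate`, `console_approver`, and the `high_risk` flag are hypothetical names; a real deployment would more likely route the request to a chat channel or ticketing tool than to a terminal prompt.

```python
from typing import Callable

# An approver receives the pending action and returns True to allow it.
Approver = Callable[[dict], bool]


def console_approver(action: dict) -> bool:
    """Ask an operator on the terminal to approve a pending action."""
    answer = input(f"Agent wants to run {action!r}. Approve? [y/N] ")
    return answer.strip().lower() == "y"


class ApprovalGate:
    def __init__(self, high_risk: bool, approver: Approver = console_approver):
        self.high_risk = high_risk
        self.approver = approver

    def check(self, action: dict) -> tuple[bool, str]:
        """Return (permitted, reason); low-risk tasks pass without a prompt."""
        if not self.high_risk:
            return True, "Low-risk task: approved by policy"
        if self.approver(action):
            return True, "Approved by operator"
        return False, "Rejected by operator"
```

Because `check` returns the same `(permitted, reason)` pair as the action filter's `validate`, it could run as a second gate right after the filter, with rejections logged and reported back to Claude in the same way.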
