NVIDIA OpenShell: Secure Runtime for Autonomous AI Agents in Production
Deep dive into NVIDIA OpenShell's policy-based security model for autonomous AI agents — network guardrails, filesystem isolation, privacy controls, and production deployment patterns.
Why AI Agents Need a New Security Model
Traditional application security rests on a simple assumption: code is written by developers and behaves deterministically. Firewalls, access control lists, and network policies are all designed around it. AI agents break that assumption. An autonomous agent generates its own actions at runtime: it decides which tools to call, what parameters to pass, what code to execute, and what data to access. Those actions are non-deterministic and can differ from one interaction to the next.
This means the security model for AI agents cannot rely solely on pre-deployment code review or static network policies. It must enforce policies dynamically, at runtime, on actions that were not known at development time. This is exactly the problem NVIDIA OpenShell was built to solve.
OpenShell is an open-source secure runtime environment for AI agents, announced at GTC 2026 as part of NVIDIA's Agent Toolkit. It provides sandboxed execution with policy-based guardrails for network access, filesystem operations, code execution, and data handling. The goal is to make autonomous agents safe enough to deploy in production without requiring human approval for every action.
The OpenShell Security Architecture
OpenShell's architecture has four layers: the execution sandbox, the network guardian, the filesystem controller, and the policy engine. Each layer operates independently, and all four must approve an action before it executes. This defense-in-depth approach means that a failure in one layer does not compromise the entire system.
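The "all four must approve" rule can be illustrated with a small, self-contained sketch. This is not OpenShell's implementation or API: `Action`, `authorize`, and the four toy checks are hypothetical names chosen for illustration, and the fail-closed handling of a crashing layer is an assumption about how such a chain would sensibly behave.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Action:
    """Hypothetical description of something the agent wants to do."""
    tool: str
    kind: str    # e.g. "network", "file_read", "exec"
    target: str  # host name or file path


# A layer is a name plus a check that returns True to approve the action.
Layer = Tuple[str, Callable[[Action], bool]]


def authorize(action: Action, layers: List[Layer]) -> Tuple[bool, str]:
    """Approve only if every layer approves; return the first layer that denies."""
    for name, check in layers:
        try:
            approved = check(action)
        except Exception:
            approved = False  # assumption: a crashing layer denies (fail closed)
        if not approved:
            return False, name
    return True, ""


# Toy stand-ins for the four layers described above.
layers: List[Layer] = [
    ("sandbox", lambda a: a.kind in {"network", "file_read", "file_write"}),
    ("network", lambda a: a.kind != "network"
        or a.target == "search.internal.company.com"),
    ("filesystem", lambda a: not a.kind.startswith("file")
        or a.target.startswith("/agent/workspace/")),
    ("policy", lambda a: a.tool != "email_send"),
]

if __name__ == "__main__":
    print(authorize(Action("knowledge_search", "network", "evil.example.com"), layers))
```

Because each check is evaluated on its own, loosening one of them (for example the sandbox check) still leaves the remaining three in the path of every action.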
Layer 1: Execution Sandbox
The execution sandbox isolates each agent session in its own runtime environment. Under the hood, OpenShell uses gVisor, Google's open-source container sandbox, which handles the agent's system calls in a user-space kernel instead of passing them straight to the host kernel. This gives stronger isolation than a plain container without the overhead of a full virtual machine. Each sandbox has its own process namespace, memory space, and resource limits.
# Configuring the execution sandbox
from openshell import SandboxConfig

sandbox = SandboxConfig(
    isolation="gvisor",  # Options: gvisor, firecracker, container
    max_memory_mb=2048,
    max_cpu_cores=2,
    max_execution_time_seconds=300,
    max_processes=50,
    max_open_files=100,
    allow_network=True,  # Controlled by network guardian
    allow_filesystem=True,  # Controlled by filesystem controller
    environment={
        "LANG": "en_US.UTF-8",
        "TZ": "UTC",
    },
    # Resource cleanup after session ends
    cleanup_policy="destroy",  # Options: destroy, snapshot, preserve
)
The sandbox supports three isolation modes. The "gvisor" mode provides strong isolation with moderate overhead — suitable for most production deployments. The "firecracker" mode uses lightweight VMs for maximum isolation, suitable for untrusted agent code or multi-tenant environments. The "container" mode provides basic Docker container isolation, suitable for development and trusted internal agents.
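As a rough aid, the guidance above can be written down as a decision function. The function name and its three boolean inputs are hypothetical; only the three mode strings come from the configuration shown earlier, and the precedence (untrusted or multi-tenant first, then production) is one reasonable reading of the text rather than a rule OpenShell enforces.

```python
def select_isolation(untrusted_code: bool, multi_tenant: bool, production: bool) -> str:
    """Map deployment characteristics to one of the three isolation modes."""
    if untrusted_code or multi_tenant:
        # Strongest isolation: lightweight VMs
        return "firecracker"
    if production:
        # Strong isolation with moderate overhead
        return "gvisor"
    # Development or trusted internal agents
    return "container"


if __name__ == "__main__":
    print(select_isolation(untrusted_code=False, multi_tenant=False, production=True))
```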
Layer 2: Network Guardian
The network guardian controls all egress traffic from agent sandboxes. Unlike a traditional firewall that operates on IP addresses and ports, the network guardian understands the semantic context of agent requests — it knows which tool is making the request, why, and what data is being sent.
# Network guardian configuration
from openshell import NetworkGuardian, EgressRule

guardian = NetworkGuardian(
    default_policy="deny-all",
    rules=[
        # Allow the search tool to reach the Google and Bing search APIs
        EgressRule(
            tool="web_search",
            allowed_hosts=["www.googleapis.com", "api.bing.com"],
            allowed_ports=[443],
            protocol="https",
            max_request_size_kb=100,
            max_response_size_mb=10,
        ),
        # Allow the database tool to reach internal postgres
        EgressRule(
            tool="database_query",
            allowed_hosts=["db.internal.company.com"],
            allowed_ports=[5432],
            protocol="tcp",
            tls_required=True,
        ),
        # Allow the email tool to reach the SMTP server
        EgressRule(
            tool="email_send",
            allowed_hosts=["smtp.company.com"],
            allowed_ports=[587],
            protocol="smtp",
            tls_required=True,
            rate_limit="5/minute",
        ),
    ],
    # Block all access to private IP ranges by default
    block_private_ranges=True,
    # DNS filtering to prevent exfiltration via DNS
    dns_filtering=True,
    allowed_dns_servers=["10.0.0.53"],
)
The key innovation is tool-scoped network rules. Instead of giving the entire agent process access to a list of hosts, each tool has its own network permissions. The web search tool can reach search APIs but not the database. The database tool can reach the internal database but not external APIs. This minimizes the blast radius of any compromised or misbehaving tool.
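To make the idea of tool-scoped rules concrete, here is a minimal deny-by-default lookup in plain Python. It is a sketch of the concept only: `ToolEgress`, `RULES`, and `egress_allowed` are hypothetical names, not part of the OpenShell SDK, and a real guardian would also enforce protocol, TLS, size limits, and rate limits.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class ToolEgress:
    """Hosts and ports a single tool may reach (hypothetical structure)."""
    hosts: FrozenSet[str]
    ports: FrozenSet[int]


# Permissions are keyed by tool, not by agent process.
RULES: Dict[str, ToolEgress] = {
    "web_search": ToolEgress(
        frozenset({"www.googleapis.com", "api.bing.com"}), frozenset({443})
    ),
    "database_query": ToolEgress(
        frozenset({"db.internal.company.com"}), frozenset({5432})
    ),
}


def egress_allowed(tool: str, host: str, port: int) -> bool:
    """Deny by default; allow only what the calling tool's own rule lists."""
    rule = RULES.get(tool)
    if rule is None:
        return False  # unknown tools get no network access
    return host.lower() in rule.hosts and port in rule.ports


if __name__ == "__main__":
    # The search tool cannot reach the database, even though another tool can.
    print(egress_allowed("web_search", "db.internal.company.com", 5432))
```

Keying the table by tool is what limits the blast radius: a compromised search tool inherits only the search tool's row.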
Layer 3: Filesystem Controller
The filesystem controller manages what files an agent can read, create, modify, and delete within its sandbox. It supports fine-grained permissions based on file paths, extensions, and sizes.
# Filesystem controller configuration
from openshell import FilesystemController, AccessRule

fs_controller = FilesystemController(
    workspace="/agent/workspace",
    rules=[
        # Read-only access to the knowledge base
        AccessRule(
            path="/data/knowledge-base",
            permissions="read",
            allowed_extensions=[".md", ".txt", ".json", ".pdf"],
        ),
        # Read-write access to the workspace
        AccessRule(
            path="/agent/workspace",
            permissions="read-write",
            allowed_extensions=[".py", ".json", ".csv", ".txt", ".md"],
            max_file_size_mb=50,
            max_total_size_mb=500,
            block_symlinks=True,
        ),
        # Write-only access to the output directory
        AccessRule(
            path="/agent/output",
            permissions="write",
            allowed_extensions=[".json", ".csv", ".pdf"],
            max_file_size_mb=100,
        ),
    ],
    # Prevent path traversal attacks
    strict_path_validation=True,
    # Log all file operations for audit
    audit_all_operations=True,
)
The filesystem controller also prevents common attack patterns like path traversal (attempts to read ../../etc/passwd), symlink attacks (creating symbolic links to bypass access controls), and zip bombs (uploading compressed files that expand to fill disk).
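The path traversal and symlink checks mentioned above typically come down to resolving the requested path and confirming it still sits under the allowed root. The sketch below shows that general technique with the standard library. `resolve_inside` is a hypothetical helper, not an OpenShell function, and the article does not say how OpenShell performs this check internally.

```python
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Return the resolved path if it stays under root, else raise PermissionError."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so a link that
    # points outside root fails the same containment check as "../" does.
    candidate = (root_path / requested).resolve()
    if candidate != root_path and root_path not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes {root}")
    return candidate


if __name__ == "__main__":
    print(resolve_inside("/agent/workspace", "reports/out.csv"))
    try:
        resolve_inside("/agent/workspace", "../../etc/passwd")
    except PermissionError as exc:
        print("denied:", exc)
```

Comparing against `candidate.parents` rather than using a string prefix test avoids a classic mistake, where `/agent/workspace-evil` would pass a naive `startswith("/agent/workspace")` check.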
Layer 4: Policy Engine
The policy engine is the highest-level security layer. It evaluates every agent action against a set of configurable policies before the action executes. Policies can be based on the action type, the data involved, the current session state, or external conditions.
# Policy engine configuration
from openshell import PolicyEngine, Policy, PolicyAction

policy_engine = PolicyEngine(
    policies=[
        # PII detection and redaction
        Policy(
            name="pii-protection",
            trigger="data_output",
            condition="contains_pii(output)",
            action=PolicyAction.REDACT,
            pii_types=["ssn", "credit_card", "email", "phone"],
            log_level="warning",
        ),
        # Cost control
        Policy(
            name="cost-limit",
            trigger="tool_call",
            condition="session.total_cost > 5.0",
            action=PolicyAction.BLOCK,
            message="Session cost limit exceeded. Requesting human approval.",
            escalation="human_queue",
        ),
        # Rate limiting
        Policy(
            name="tool-rate-limit",
            trigger="tool_call",
            condition="tool.calls_in_last_minute > 20",
            action=PolicyAction.THROTTLE,
            delay_seconds=10,
        ),
        # Content safety
        Policy(
            name="content-safety",
            trigger="agent_response",
            condition="safety_score(response) < 0.8",
            action=PolicyAction.BLOCK,
            message="Response blocked by content safety policy.",
            log_level="critical",
        ),
        # Data residency
        Policy(
            name="data-residency",
            trigger="network_egress",
            condition="destination_region not in ['us-east-1', 'us-west-2']",
            action=PolicyAction.BLOCK,
            message="Data residency violation: destination outside approved regions.",
        ),
    ],
)
Policies are evaluated in order, and the first matching policy determines the outcome. BLOCK stops the action entirely. REDACT strips sensitive data from the output before it leaves the sandbox. THROTTLE adds a delay to prevent abuse. ESCALATE pauses the agent and routes the action to human review.
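First-match evaluation is easy to get subtly wrong, so a small sketch may help. Everything here is hypothetical and independent of the OpenShell SDK: `Rule`, `Decision`, and `evaluate` are illustrative names, conditions are plain Python callables rather than the string expressions used in the configuration, and returning ALLOW when nothing matches is an assumption, since the article does not state the default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List, Optional, Tuple


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REDACT = "redact"
    THROTTLE = "throttle"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Rule:
    name: str
    trigger: str
    condition: Callable[[Dict[str, Any]], bool]
    decision: Decision


def evaluate(
    rules: List[Rule], trigger: str, ctx: Dict[str, Any]
) -> Tuple[Decision, Optional[str]]:
    """Walk the rules in order and return the first one that matches."""
    for rule in rules:
        if rule.trigger == trigger and rule.condition(ctx):
            return rule.decision, rule.name
    return Decision.ALLOW, None  # assumption: no match means allow


rules = [
    Rule("cost-limit", "tool_call",
         lambda c: c["total_cost"] > 5.0, Decision.BLOCK),
    Rule("tool-rate-limit", "tool_call",
         lambda c: c["calls_in_last_minute"] > 20, Decision.THROTTLE),
]

if __name__ == "__main__":
    # Both rules match, but the earlier one wins.
    print(evaluate(rules, "tool_call", {"total_cost": 6.0, "calls_in_last_minute": 30}))
```

The practical consequence of first-match semantics is that ordering is part of the policy: a broad, permissive rule placed early can shadow a stricter rule placed after it.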
Putting It All Together: A Production Deployment
Here is a complete example of configuring OpenShell for a production agent that handles customer support inquiries. The agent can search a knowledge base, create and update support tickets, and send email responses — all within strict security guardrails.
from openshell import (
    OpenShellRuntime,
    SandboxConfig,
    NetworkGuardian,
    FilesystemController,
    PolicyEngine,
    EgressRule,
    AccessRule,
    Policy,
    PolicyAction,
)

runtime = OpenShellRuntime(
    sandbox=SandboxConfig(
        isolation="gvisor",
        max_memory_mb=2048,
        max_execution_time_seconds=300,
        cleanup_policy="snapshot",
    ),
    network=NetworkGuardian(
        default_policy="deny-all",
        rules=[
            EgressRule(
                tool="knowledge_search",
                allowed_hosts=["search.internal.company.com"],
                allowed_ports=[443],
                protocol="https",
            ),
            EgressRule(
                tool="ticket_api",
                allowed_hosts=["jira.company.com"],
                allowed_ports=[443],
                protocol="https",
            ),
            EgressRule(
                tool="email_send",
                allowed_hosts=["smtp.company.com"],
                allowed_ports=[587],
                # Protocol and TLS made explicit, matching the earlier email rule
                protocol="smtp",
                tls_required=True,
                rate_limit="3/minute",
            ),
        ],
    ),
    filesystem=FilesystemController(
        workspace="/agent/workspace",
        rules=[
            AccessRule(path="/data/kb", permissions="read"),
            AccessRule(
                path="/agent/workspace",
                permissions="read-write",
                max_total_size_mb=100,
            ),
        ],
    ),
    policies=PolicyEngine(
        policies=[
            Policy(
                name="pii-redact",
                trigger="data_output",
                condition="contains_pii(output)",
                action=PolicyAction.REDACT,
            ),
            Policy(
                name="email-approval",
                trigger="tool_call",
                condition="tool.name == 'email_send'",
                action=PolicyAction.ESCALATE,
                message="Email requires human approval before sending.",
            ),
        ],
    ),
)
Monitoring and Incident Response
OpenShell generates detailed audit logs for every action taken within a sandbox. These logs are structured for integration with SIEM systems and include the agent session ID, timestamp, action type, tool name, parameters (with PII redacted), policy evaluation results, and outcome.
# Querying OpenShell audit logs
import asyncio

from openshell.audit import AuditClient


async def main() -> None:
    audit = AuditClient(endpoint="https://openshell-audit.internal.com")

    # Find all policy violations in the last hour
    violations = await audit.query(
        time_range="1h",
        event_type="policy_violation",
        severity=["warning", "critical"],
    )
    for v in violations:
        print(
            f"[{v.timestamp}] Session {v.session_id}: "
            f"{v.policy_name} - {v.action_taken} - {v.details}"
        )

    # Find all network egress attempts (approved and blocked)
    egress = await audit.query(
        time_range="24h",
        event_type="network_egress",
        fields=["session_id", "tool", "destination", "approved", "bytes_sent"],
    )


# audit.query is awaited, so the calls have to run inside an event loop
asyncio.run(main())
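For readers building their own pipeline, the audit fields listed at the start of this section could be modeled as a JSON record with PII scrubbed from the parameters before the record is written. The sketch below is an illustration only: the field names follow the prose, but `redact`, `audit_record`, and the two regular expressions are hypothetical and far simpler than a production PII detector. OpenShell's actual log schema is not shown in this article.

```python
import json
import re
from datetime import datetime, timezone
from typing import Any, Dict

# Deliberately simple patterns for illustration; real PII detection needs more.
_PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace each recognised PII value with a labelled placeholder."""
    for label, pattern in _PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


def audit_record(
    session_id: str,
    action_type: str,
    tool: str,
    parameters: Dict[str, Any],
    policy_result: str,
    outcome: str,
) -> str:
    """Build one structured log line, redacting parameter values first."""
    record = {
        "session_id": session_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action_type": action_type,
        "tool": tool,
        "parameters": {k: redact(str(v)) for k, v in parameters.items()},
        "policy_result": policy_result,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(audit_record("sess-1", "tool_call", "email_send",
                       {"to": "jane.doe@example.com"}, "escalated", "pending"))
```

Redacting before serialization, rather than when logs are read, means the raw values never reach the SIEM in the first place.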
For incident response, OpenShell supports session replay — you can replay the entire sequence of actions an agent took during a session, including the model's reasoning, tool calls, results, and policy evaluations. This is invaluable for understanding what went wrong when an agent produces an unexpected outcome.
FAQ
How does OpenShell compare to running agents in Docker containers?
Docker containers provide process isolation but lack the agent-specific security layers that OpenShell provides. Docker does not understand tool-scoped network permissions, PII detection, cost limits, or human approval workflows. You could build these on top of Docker, but OpenShell provides them out of the box. Additionally, OpenShell's gVisor and Firecracker isolation modes provide stronger security guarantees than standard Docker containers for untrusted code execution.
What is the performance overhead of OpenShell?
In NVIDIA's benchmarks, OpenShell adds approximately 15-30ms of latency per tool call for policy evaluation, and the gVisor sandbox adds approximately 5-10% overhead on compute-intensive operations compared to bare metal. For most agent workloads where the dominant latency is model inference (hundreds of milliseconds to seconds), the OpenShell overhead is negligible. The Firecracker isolation mode has higher overhead (approximately 100ms per sandbox creation) but provides stronger isolation.
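Whether the per-call overhead is negligible depends on how long inference takes, and the arithmetic is quick to check. In the sketch below the 15 ms and 30 ms figures are the ones quoted above, while the 300 ms and 2000 ms inference latencies are assumed values chosen to span "hundreds of milliseconds to seconds".

```python
def overhead_fraction(policy_ms: float, inference_ms: float) -> float:
    """Share of total per-call latency spent on policy evaluation."""
    return policy_ms / (policy_ms + inference_ms)


if __name__ == "__main__":
    for policy_ms in (15, 30):            # range quoted for policy evaluation
        for inference_ms in (300, 2000):  # assumed model inference latencies
            share = overhead_fraction(policy_ms, inference_ms)
            print(f"policy {policy_ms} ms, inference {inference_ms} ms: {share:.1%}")
```

Under these assumptions the overhead is under 1% of the call when inference takes two seconds, but roughly 9% in the worst case of a 30 ms policy check on a 300 ms inference, so fast models with many tool calls are where it is worth measuring.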
Can I use OpenShell without the rest of the NVIDIA Agent Toolkit?
Yes. OpenShell is a standalone open-source project (Apache 2.0) that can be used with any agent framework. It provides a Python SDK and a REST API for managing sandboxes. If you are using LangChain, CrewAI, AutoGen, or a custom framework, you can wrap your tool execution calls in OpenShell sandboxes to get the security benefits without adopting the full NVIDIA toolkit.
How does OpenShell handle agents that need to learn and persist state?
OpenShell sandboxes are ephemeral by default — they are destroyed after each session. For agents that need persistent state (memory, learned preferences, accumulated knowledge), OpenShell provides a state management API that stores session state in an external database, encrypted and access-controlled. The snapshot cleanup policy captures the sandbox state at session end, which can be loaded into a new sandbox for the next session.
Written by
CallSphere Team