Orchestrating Multiple MCP Servers: Building a Tool Ecosystem for Complex Agents
Design and implement multi-server MCP architectures where agents connect to multiple tool providers simultaneously, with namespace management, conflict resolution, and performance optimization.
The Multi-Server Reality
Production AI agents rarely connect to a single MCP server. A customer support agent might need a CRM server for customer data, a knowledge base server for documentation, a ticketing server for issue management, and an analytics server for usage metrics. Each server is maintained by a different team, deployed independently, and versioned on its own schedule.
Orchestrating these servers into a cohesive tool ecosystem is an architectural challenge that touches on naming conflicts, connection management, error isolation, and performance.
Connecting Multiple Servers
The OpenAI Agents SDK supports multiple MCP servers natively. Each server is an independent connection:
from agents import Agent
from agents.mcp import MCPServerStdio, MCPServerStreamableHttp

# Local server for file operations
filesystem_server = MCPServerStdio(
    name="Filesystem",
    params={
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"],
    },
    cache_tools_list=True,
)

# Remote server for database queries
database_server = MCPServerStreamableHttp(
    name="Database",
    params={"url": "http://db-mcp:8001/mcp"},
    cache_tools_list=True,
)

# Remote server for monitoring
monitoring_server = MCPServerStreamableHttp(
    name="Monitoring",
    params={"url": "http://monitoring-mcp:8002/mcp"},
    cache_tools_list=True,
)

agent = Agent(
    name="Operations Assistant",
    instructions="""You help with operational tasks. You have access to:
    - Filesystem tools for reading and writing config files
    - Database tools for querying application data
    - Monitoring tools for checking system health and metrics
    Always check system health before making database changes.""",
    mcp_servers=[filesystem_server, database_server, monitoring_server],
)
When the agent starts, it calls tools/list on each server and presents the combined tool set to the LLM. The LLM sees a flat list of tools from all servers and can call any of them.
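To see why this matters, here is a minimal sketch (plain Python, independent of any SDK) of what that aggregation amounts to: tool names from every server land in one flat namespace, so duplicates silently shadow each other unless you detect them at merge time.

```python
def merge_tool_lists(server_tools: dict[str, list[str]]) -> tuple[dict, list]:
    """Merge per-server tool name lists into one flat namespace.

    Returns the merged {tool_name: server_name} mapping plus a list of
    (tool_name, first_server, second_server) collisions.
    """
    merged: dict[str, str] = {}
    collisions: list[tuple[str, str, str]] = []
    for server_name, tools in server_tools.items():
        for tool in tools:
            if tool in merged:
                # A later server tried to register a name already taken
                collisions.append((tool, merged[tool], server_name))
            else:
                merged[tool] = server_name
    return merged, collisions
```

A non-empty collision list is exactly the failure mode the next section is about.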
Namespace Management
The biggest risk with multiple servers is tool name collisions. If the database server and the monitoring server both expose a tool called query, the agent cannot distinguish between them. Solve this at the server level by using descriptive, namespaced tool names:
# Bad: generic names that will collide
@mcp_server.tool()
async def query(sql: str) -> str: ...

@mcp_server.tool()
async def search(term: str) -> str: ...

# Good: namespaced names that are unambiguous
@mcp_server.tool()
async def db_query(sql: str) -> str:
    """Execute a SQL query against the application database."""
    ...

@mcp_server.tool()
async def monitoring_search_alerts(term: str) -> str:
    """Search monitoring alerts by keyword."""
    ...
A naming convention like {domain}_{action} or {domain}_{action}_{target} prevents collisions and helps the LLM understand which server a tool belongs to.
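A convention only helps if it is enforced. As a sketch, a registration-time check can reject generic names before they ever reach the LLM; the pattern and helper below are illustrative, not part of any SDK.

```python
import re

# Accepts domain_action or domain_action_target, lowercase words only
TOOL_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+){1,2}$")

def is_namespaced(tool_name: str) -> bool:
    """Return True if the name follows the {domain}_{action}[_{target}] convention."""
    return bool(TOOL_NAME_PATTERN.match(tool_name))
```

Running this in CI against each server's tool list catches a bare query or search before it collides in production.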
Connection Lifecycle Management
Each MCP server connection has a lifecycle — initialization, active use, and cleanup. With multiple servers, manage these lifecycles carefully to avoid resource leaks:
from agents import Agent, Runner
from agents.mcp import MCPServerStdio, MCPServerStreamableHttp

async def run_with_multiple_servers(user_message: str):
    """Run an agent with proper multi-server lifecycle management."""
    servers = [
        MCPServerStdio(
            name="Filesystem",
            params={
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"],
            },
            cache_tools_list=True,
        ),
        MCPServerStreamableHttp(
            name="Database",
            params={"url": "http://db-mcp:8001/mcp"},
            cache_tools_list=True,
        ),
    ]

    # Use async context managers to ensure cleanup
    async with servers[0] as fs_server, servers[1] as db_server:
        agent = Agent(
            name="Assistant",
            instructions="Help with file and database operations.",
            mcp_servers=[fs_server, db_server],
        )
        result = await Runner.run(agent, user_message)
        return result.final_output
The async with pattern ensures that every server connection is properly closed, even if an error occurs. For stdio servers, this means the subprocess is terminated. For HTTP servers, this means the session is closed.
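The two-server async with line does not scale to a variable number of servers. A sketch using contextlib.AsyncExitStack gives the same cleanup guarantee for any list of connections; run_agent here is a hypothetical callback, and any server objects that support async with will work.

```python
import asyncio
from contextlib import AsyncExitStack

async def run_with_servers(servers, run_agent):
    """Enter every server's async context on one shared stack.

    All successfully opened connections are closed in reverse order when
    the block exits, even if a later server fails to connect mid-setup.
    """
    async with AsyncExitStack() as stack:
        connected = [await stack.enter_async_context(s) for s in servers]
        return await run_agent(connected)
```

Because the stack unwinds in reverse order, a failure while opening the third server still closes the first two.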
Error Isolation
When one server fails, the agent should continue functioning with the remaining servers. Implement error isolation so that a single server outage does not crash the entire agent:
import asyncio

async def resilient_tool_discovery(servers: list) -> dict:
    """Discover tools from multiple servers with error isolation."""
    all_tools = {}

    async def discover_single(server):
        try:
            tools = await asyncio.wait_for(
                server.list_tools(),
                timeout=5.0,
            )
            return server.name, tools
        except asyncio.TimeoutError:
            print(f"Warning: {server.name} timed out during discovery")
            return server.name, []
        except Exception as e:
            print(f"Warning: {server.name} failed: {e}")
            return server.name, []

    results = await asyncio.gather(
        *[discover_single(s) for s in servers]
    )
    for name, tools in results:
        if tools:
            all_tools[name] = tools
            print(f"Discovered {len(tools)} tools from {name}")
        else:
            print(f"No tools available from {name}")
    return all_tools
Performance Optimization
With multiple servers, tool discovery latency adds up quickly. Apply these optimizations to keep startup fast.
First, enable cache_tools_list=True on every server so discovery happens only once.
Second, initialize servers in parallel rather than sequentially:
import asyncio

async def parallel_server_init(servers: list):
    """Initialize all MCP servers concurrently."""
    # connect() opens each server's session; call cleanup() on the
    # healthy servers when you are done with them.
    init_tasks = [server.connect() for server in servers]
    results = await asyncio.gather(*init_tasks, return_exceptions=True)
    healthy_servers = []
    for server, result in zip(servers, results):
        if isinstance(result, Exception):
            print(f"Failed to initialize {server.name}: {result}")
        else:
            healthy_servers.append(server)
    return healthy_servers
Third, if certain servers are only needed for specific tasks, connect to them lazily — only when the agent first attempts to use a tool from that server.
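One way to sketch that lazy pattern is a thin wrapper that defers connection until the first call. The connect() and call_tool() method names are assumed to match the server interface used above; treat this as an illustration rather than a drop-in component.

```python
import asyncio

class LazyServer:
    """Defer an MCP server's connection until its first tool call (sketch)."""

    def __init__(self, server):
        self._server = server
        self._connected = False
        self._lock = asyncio.Lock()  # guard against concurrent first calls

    async def call_tool(self, tool_name: str, arguments):
        if not self._connected:
            async with self._lock:
                if not self._connected:  # re-check after acquiring the lock
                    await self._server.connect()  # assumed SDK method
                    self._connected = True
        return await self._server.call_tool(tool_name, arguments)
```

The lock matters: without it, two concurrent first calls would both see _connected as False and open two sessions.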
Routing Strategies
For complex agents with many servers, consider implementing a routing layer that directs tool calls to the appropriate server based on the tool name prefix:
class ServerRouter:
    """Route tool calls to the correct MCP server by namespace."""

    def __init__(self):
        self._routes: dict[str, object] = {}

    def register(self, prefix: str, server):
        """Register a server for a tool name prefix."""
        self._routes[prefix] = server

    def resolve(self, tool_name: str):
        """Find the server responsible for a tool."""
        for prefix, server in self._routes.items():
            if tool_name.startswith(prefix):
                return server
        return None

router = ServerRouter()
router.register("db_", database_server)
router.register("fs_", filesystem_server)
router.register("monitor_", monitoring_server)
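With routes registered, a dispatch helper can resolve each call and bound it with a timeout, so one slow server cannot stall the whole agent loop. This sketch assumes each server exposes an async call_tool(name, arguments) method and works with any router object offering resolve().

```python
import asyncio

async def dispatch(router, tool_name: str, arguments, timeout: float = 10.0):
    """Resolve a tool call through the router and bound it with a timeout."""
    server = router.resolve(tool_name)
    if server is None:
        # Unroutable names become errors the LLM can read and react to
        return f"Error: no server registered for tool '{tool_name}'"
    try:
        return await asyncio.wait_for(
            server.call_tool(tool_name, arguments), timeout
        )
    except asyncio.TimeoutError:
        return f"Error: server for '{tool_name}' timed out after {timeout}s"
```

Returning error strings instead of raising keeps a single server's failure visible to the model without aborting the run.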
FAQ
Is there a limit to how many MCP servers an agent can connect to?
There is no protocol-level limit, but practical limits exist. Each server adds tools to the LLM's context window. With 10 servers exposing 15 tools each, you have 150 tools — which consumes significant context space and can degrade the LLM's ability to choose the right tool. Keep the active tool set under 30-40 tools by connecting only the servers relevant to the current task.
How do I handle a server that becomes unresponsive mid-conversation?
Implement timeouts on every tool call. If a call to a specific server times out, return an error message to the agent that identifies which server is unavailable. The LLM can then adjust its plan — either retrying later, using an alternative approach, or informing the user that a specific capability is temporarily unavailable.
Can different agents share the same MCP server connections?
For HTTP transport servers, yes — multiple agents can connect to the same server concurrently since each request is independent. For stdio servers, each agent typically needs its own subprocess because stdio is a single-client transport. If you need multiple agents to share a local tool, wrap it in an HTTP server instead.
#MCP #MultiServer #Architecture #AIAgents #Orchestration #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.