
MRKL Architecture: Modular Reasoning, Knowledge, and Language for Expert Systems

Understand the MRKL (Modular Reasoning, Knowledge, and Language) architecture that combines LLMs with specialized expert modules, intelligent routing, and structured knowledge retrieval for building powerful AI systems.

What Is MRKL?

MRKL — pronounced "miracle" — stands for Modular Reasoning, Knowledge, and Language. Introduced by Karpas et al. (2022), the MRKL architecture recognizes that no single neural model excels at everything. Instead, it pairs a large language model as a central router with a collection of specialized expert modules — calculators, databases, APIs, symbolic reasoners — each handling the tasks it does best.

Think of it like a hospital: the triage nurse (the LLM) evaluates your symptoms and routes you to the right specialist (an expert module). The nurse does not perform surgery, and the surgeon does not do triage.

Core Components

A MRKL system has three layers:

  1. Router — the LLM that interprets user queries and decides which expert to invoke
  2. Expert Modules — specialized tools or models (calculator, SQL engine, search API, etc.)
  3. Reasoning Chain — the logic that combines expert outputs into a coherent final answer

A minimal router might look like this:

from dataclasses import dataclass
from typing import Callable, Any
from openai import OpenAI

client = OpenAI()

@dataclass
class ExpertModule:
    name: str
    description: str
    execute: Callable[[str], str]

class MRKLSystem:
    def __init__(self, experts: list[ExpertModule]):
        self.experts = {e.name: e for e in experts}

    def route(self, query: str) -> tuple[str, str]:
        """Use LLM to select the right expert and extract the sub-query."""
        expert_descriptions = "\n".join(
            f"- {e.name}: {e.description}" for e in self.experts.values()
        )

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": (
                    "You are a routing agent. Given a user query, select "
                    "the best expert module and extract the sub-query "
                    "for that expert.\n\n"
                    f"Available experts:\n{expert_descriptions}\n\n"
                    "Return JSON: {expert, sub_query}"
                )},
                {"role": "user", "content": query},
            ],
            response_format={"type": "json_object"},
        )
        import json
        data = json.loads(response.choices[0].message.content)
        return data["expert"], data["sub_query"]

Building Expert Modules

Each module handles a narrow domain. Here are some practical examples:

import math

def calculator_expert(expression: str) -> str:
    """Safely evaluate mathematical expressions."""
    allowed = set("0123456789+-*/().^ ")
    cleaned = expression.replace("^", "**")
    if not all(c in allowed for c in cleaned):
        return "Error: invalid characters in expression"
    try:
        result = eval(cleaned, {"__builtins__": {}}, {"math": math})
        return str(result)
    except Exception as e:
        return f"Calculation error: {e}"

def database_expert(sql_description: str) -> str:
    """Convert natural language to SQL and execute."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Convert the description to a PostgreSQL query. "
                "Only SELECT queries are allowed."
            )},
            {"role": "user", "content": sql_description},
        ],
    )
    sql = response.choices[0].message.content
    # Execute against actual DB connection in production
    return f"Generated SQL: {sql}"

experts = [
    ExpertModule("calculator", "Performs math calculations", calculator_expert),
    ExpertModule("database", "Queries structured data", database_expert),
]

The Reasoning Chain

After routing and execution, the system synthesizes the expert output into a final response. The `answer` method below belongs on `MRKLSystem` alongside `route`:


def answer(self, query: str) -> str:
    expert_name, sub_query = self.route(query)
    expert = self.experts.get(expert_name)

    if not expert:
        return "No suitable expert found for this query."

    expert_output = expert.execute(sub_query)

    # Synthesize final answer
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Combine the expert's output with the original "
                "question to provide a clear, complete answer."
            )},
            {"role": "user", "content": (
                f"Question: {query}\n"
                f"Expert ({expert_name}) output: {expert_output}"
            )},
        ],
    )
    return response.choices[0].message.content

Multi-Expert Chaining

Complex queries often require multiple experts in sequence. For example, "What percentage of our revenue comes from customers in California?" needs the database expert first (to query revenue by state), then the calculator expert (to compute the percentage). The router must recognize this and chain calls accordingly.
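One way to sketch chaining is a plan of (expert, sub-query) steps where each sub-query can reference the previous step's output through a `{prev}` placeholder. The plan and both toy experts below are hardcoded for illustration; in a real system the router LLM would produce the plan and the database expert would hit an actual store:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpertModule:
    name: str
    description: str
    execute: Callable[[str], str]

def run_chain(plan: list[tuple[str, str]],
              experts: dict[str, ExpertModule]) -> str:
    """Execute (expert_name, sub_query) steps, feeding each output forward."""
    prev = ""
    for expert_name, sub_query in plan:
        expert = experts[expert_name]
        prev = expert.execute(sub_query.format(prev=prev))
    return prev

# Toy experts: a canned "database" result and a percentage calculator.
db = ExpertModule("database", "Queries structured data",
                  lambda q: "california=42000 total=120000")

def pct(q: str) -> str:
    # Expects "california=<n> total=<n>" from the previous step.
    parts = dict(p.split("=") for p in q.split())
    return f"{100 * int(parts['california']) / int(parts['total']):.1f}%"

calc = ExpertModule("calculator", "Performs math calculations", pct)

experts = {e.name: e for e in (db, calc)}
plan = [("database", "revenue by state"), ("calculator", "{prev}")]
print(run_chain(plan, experts))  # "35.0%"
```

The key design point is that each step only sees the previous step's output as text, which keeps the expert interface uniform regardless of chain length.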

MRKL vs Tool-Use Agents

Modern tool-use agents (like those built with OpenAI function calling) are essentially MRKL systems with a standardized interface. The MRKL paper laid the conceptual foundation — tools as expert modules, the LLM as the router. Understanding the MRKL framing helps you design better tool interfaces and routing logic.

FAQ

How is MRKL different from RAG?

RAG (Retrieval-Augmented Generation) is a specific pattern where the expert module is a document retriever. MRKL is a broader architecture — RAG is one possible expert within a MRKL system, alongside calculators, APIs, databases, and other specialists.
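To make the relationship concrete, a retriever slots into the same `ExpertModule` interface as any other expert. The keyword matcher and tiny document list below are stand-ins for a real vector store:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpertModule:
    name: str
    description: str
    execute: Callable[[str], str]

DOCS = [
    "MRKL routes queries to expert modules.",
    "RAG retrieves documents to ground generation.",
    "Calculators handle arithmetic exactly.",
]

def retriever_expert(query: str) -> str:
    """Return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    hits = [d for d in DOCS if words & set(d.lower().rstrip(".").split())]
    return "\n".join(hits) if hits else "No documents found."

rag = ExpertModule("retriever", "Finds relevant documents", retriever_expert)
print(rag.execute("how does RAG ground generation?"))
```

From the router's point of view, retrieval is just one more named expert to choose among.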

How do you handle routing errors?

Implement a fallback chain. If the selected expert returns an error or low-confidence result, route to the next most likely expert. You can also ask the LLM to select its top 3 experts ranked by relevance, then try them in order.
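A minimal fallback loop might try a ranked list of experts until one returns a usable result. Error detection here is just a string-prefix check matching the error strings the earlier modules return, which is an illustrative simplification:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpertModule:
    name: str
    description: str
    execute: Callable[[str], str]

def answer_with_fallback(ranked: list[ExpertModule], sub_query: str) -> str:
    """Try experts in ranked order; skip any that report an error."""
    for expert in ranked:
        result = expert.execute(sub_query)
        if not result.startswith(("Error", "Calculation error")):
            return result
    return "All experts failed for this query."

flaky = ExpertModule("flaky", "Always errors", lambda q: "Error: backend down")
solid = ExpertModule("solid", "Echoes the query", lambda q: f"answer for {q!r}")

print(answer_with_fallback([flaky, solid], "2 + 2"))  # "answer for '2 + 2'"
```

In production you would likely replace the prefix check with structured error returns or exceptions, but the control flow stays the same.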

Can you use different LLMs for routing vs synthesis?

Absolutely. A smaller, faster model (GPT-4o-mini) can handle routing since the task is classification-like. Reserve the larger model for the synthesis step where nuanced reasoning matters most.


#MRKL #ModularAI #ExpertSystems #AIArchitecture #AgenticAI #KnowledgeRetrieval #PythonAI #ToolUse


CallSphere Team

Expert insights on AI voice agents and customer communication automation.
