Configuration-as-Code for AI Agents: YAML, TOML, and Python Config Patterns

Why Configuration-as-Code

Storing agent configuration in version-controlled files, rather than in database rows or UI settings, brings standard software engineering practice to agent management. You get git history showing who changed what, pull request reviews for configuration changes, automated validation in CI, and deterministic deployments where the same commit always produces the same agent configuration.

The question is which format to use. YAML, TOML, and Python each have distinct tradeoffs for agent configuration.

YAML Configuration

YAML is the most common format in the cloud-native ecosystem. Its strengths are readability and support for complex nested structures.

# agent_config.yaml loaded by the application
YAML_EXAMPLE = """
agent:
  name: support-agent
  model: gpt-4o
  temperature: 0.7
  max_tokens: 2048
  system_prompt: |
    You are a customer support agent for Acme Corp.
    Always be polite and professional.
    If you cannot resolve an issue, escalate to a human agent.

  tools:
    - name: search_docs
      description: Search the knowledge base
      enabled: true
    - name: create_ticket
      description: Create a support ticket
      enabled: true
    - name: refund_order
      description: Process a refund
      enabled: false
      requires_approval: true

  guardrails:
    max_tool_calls_per_turn: 3
    block_pii_in_responses: true
    escalation_keywords:
      - "speak to a human"
      - "supervisor"
      - "complaint"
"""

import yaml


def load_yaml_config(path: str) -> dict:
    with open(path, "r") as f:
        config = yaml.safe_load(f)
    return config

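The critical detail here is yaml.safe_load. Never parse untrusted input with yaml.load and an unsafe loader such as yaml.UnsafeLoader: it can construct arbitrary Python objects, which can lead to arbitrary code execution. safe_load restricts parsing to basic data types such as strings, numbers, booleans, lists, and dicts.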
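
To make the difference concrete, here is a minimal sketch, assuming PyYAML is installed. The payload string and the helper try_safe_load are made up for this illustration. The payload uses PyYAML's python/object/apply tag, which safe_load refuses to construct.

```python
import yaml

# Hypothetical malicious document: it asks the loader to call os.getcwd().
# With an unsafe loader, tags like this can invoke arbitrary callables.
PAYLOAD = "exploit: !!python/object/apply:os.getcwd []"


def try_safe_load(text: str) -> str:
    """Return 'loaded' if safe_load accepts the text, else 'rejected'."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError:
        # The ConstructorError raised here is a subclass of yaml.YAMLError.
        return "rejected"
    return "loaded"


print(try_safe_load(PAYLOAD))                # rejected
print(try_safe_load("agent: {name: demo}"))  # loaded
```

Ordinary configuration data loads as usual; only documents that request Python-specific object construction are refused.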

TOML Configuration

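TOML is more explicit than YAML and avoids its indentation pitfalls. It is the standard for Python packaging (pyproject.toml), and Python 3.11 and later can read it with the standard-library tomllib module. On older versions, the third-party tomli package provides the same API, which is why the import below falls back to it.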

TOML_EXAMPLE = """
[agent]
name = "support-agent"
model = "gpt-4o"
temperature = 0.7
max_tokens = 2048

system_prompt = '''
You are a customer support agent for Acme Corp.
Always be polite and professional.
If you cannot resolve an issue, escalate to a human agent.
'''

[guardrails]
max_tool_calls_per_turn = 3
block_pii_in_responses = true
escalation_keywords = ["speak to a human", "supervisor", "complaint"]

[[tools]]
name = "search_docs"
description = "Search the knowledge base"
enabled = true

[[tools]]
name = "create_ticket"
description = "Create a support ticket"
enabled = true
"""

try:
    import tomllib
except ImportError:
    import tomli as tomllib


def load_toml_config(path: str) -> dict:
    with open(path, "rb") as f:
        return tomllib.load(f)

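TOML's advantage is unambiguous typing. Under YAML 1.1, which PyYAML implements, unquoted yes, on, and true all parse as boolean true (and no, off, and false as boolean false). In TOML, only lowercase true and false are booleans. This eliminates an entire class of subtle configuration bugs.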

Python Configuration

Python config files offer maximum flexibility. You get type checking, computed values, and validation built into the config definition itself.

from pydantic import BaseModel, field_validator


class ToolConfig(BaseModel):
    name: str
    description: str
    enabled: bool = True
    requires_approval: bool = False


class GuardrailConfig(BaseModel):
    max_tool_calls_per_turn: int = 3
    block_pii_in_responses: bool = True
    escalation_keywords: list[str] = []

    @field_validator("max_tool_calls_per_turn")
    @classmethod
    def validate_max_calls(cls, v: int) -> int:
        if not 1 <= v <= 20:
            raise ValueError("max_tool_calls_per_turn must be 1-20")
        return v


class AgentConfig(BaseModel):
    name: str
    model: str = "gpt-4o"
    temperature: float = 0.7
    max_tokens: int = 2048
    system_prompt: str
    tools: list[ToolConfig] = []
    guardrails: GuardrailConfig = GuardrailConfig()

    @field_validator("temperature")
    @classmethod
    def validate_temp(cls, v: float) -> float:
        if not 0.0 <= v <= 2.0:
            raise ValueError("Temperature must be 0.0-2.0")
        return v
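
To see this kind of validation in action, here is a small sketch assuming Pydantic v2. MiniAgentConfig is a hypothetical, trimmed-down stand-in for AgentConfig so that the example runs on its own, and the is_deterministic property is an invented illustration of a computed value.

```python
from pydantic import BaseModel, ValidationError, field_validator


class MiniAgentConfig(BaseModel):
    """Hypothetical, trimmed-down stand-in for AgentConfig."""

    name: str
    temperature: float = 0.7
    max_tokens: int = 2048

    @field_validator("temperature")
    @classmethod
    def validate_temp(cls, v: float) -> float:
        if not 0.0 <= v <= 2.0:
            raise ValueError("Temperature must be 0.0-2.0")
        return v

    @property
    def is_deterministic(self) -> bool:
        # Computed value derived from another field (illustrative only).
        return self.temperature == 0.0


def describe(raw: dict) -> str:
    """Validate a raw dict and summarize the outcome."""
    try:
        cfg = MiniAgentConfig(**raw)
    except ValidationError as exc:
        return f"invalid: {len(exc.errors())} error(s)"
    return f"ok: {cfg.name} (deterministic={cfg.is_deterministic})"


print(describe({"name": "support-agent", "temperature": 0.0}))
# ok: support-agent (deterministic=True)
print(describe({"name": "support-agent", "temperature": 3.5}))
# invalid: 1 error(s)
print(describe({"temperature": 0.5}))  # required field `name` is missing
# invalid: 1 error(s)
```

Invalid values and missing fields are both reported as a ValidationError at construction time, so a bad config never becomes a live object.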

Default Merging

A common pattern is merging user-provided config with defaults. The user only specifies what they want to change.

from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    result = deepcopy(base)
    for key, value in override.items():
        if (
            key in result
            and isinstance(result[key], dict)
            and isinstance(value, dict)
        ):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


DEFAULTS = {
    "agent": {
        "model": "gpt-4o-mini",
        "temperature": 0.7,
        "max_tokens": 1024,
    },
    "guardrails": {
        "max_tool_calls_per_turn": 3,
        "block_pii_in_responses": True,
    },
}


def load_with_defaults(config_path: str) -> dict:
    user_config = load_toml_config(config_path)
    return deep_merge(DEFAULTS, user_config)
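
A quick usage sketch of the merge behavior follows. It repeats a compact deep_merge so that it runs on its own, and the defaults and user dictionaries are invented sample data.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


defaults = {
    "agent": {"model": "gpt-4o-mini", "temperature": 0.7, "max_tokens": 1024},
    "guardrails": {"escalation_keywords": ["supervisor"]},
}
user = {
    "agent": {"model": "gpt-4o"},
    "guardrails": {"escalation_keywords": ["complaint"]},
}

merged = deep_merge(defaults, user)
print(merged["agent"])
# {'model': 'gpt-4o', 'temperature': 0.7, 'max_tokens': 1024}
print(merged["guardrails"]["escalation_keywords"])
# ['complaint']  (lists are replaced wholesale, not concatenated)
print(defaults["agent"]["model"])
# gpt-4o-mini  (the defaults dict is left untouched)
```

Only nested dicts are merged key by key. Any other value, including a list, overrides the default entirely, which is worth knowing before putting list-valued settings such as escalation_keywords into the defaults.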

Unified Config Loader

In practice, you want a single loader that handles any format and validates the result.

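import json
from pathlib import Path


def load_json_config(path: str) -> dict:
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


class ConfigLoader:
    # Each loader opens and closes its own file handle.
    LOADERS = {
        ".yaml": load_yaml_config,
        ".yml": load_yaml_config,
        ".toml": load_toml_config,
        ".json": load_json_config,
    }

    @classmethod
    def load(cls, path: str) -> AgentConfig:
        p = Path(path)
        loader = cls.LOADERS.get(p.suffix.lower())
        if not loader:
            raise ValueError(f"Unsupported config format: {p.suffix}")

        # An empty YAML file parses to None, so fall back to an empty dict.
        raw = loader(path) or {}
        merged = deep_merge(DEFAULTS, raw)

        agent_data = dict(merged.get("agent", {}))
        # guardrails and tools may sit at the top level (as in the TOML
        # example) or under "agent" (as in the YAML example). Nested values
        # take precedence here; that precedence is a design choice.
        agent_data["guardrails"] = deep_merge(
            merged.get("guardrails", {}), agent_data.get("guardrails", {})
        )
        agent_data["tools"] = agent_data.get("tools") or merged.get("tools", [])

        return AgentConfig(**agent_data)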

Format Comparison

Use YAML when your team is already in the Kubernetes ecosystem and familiar with its conventions. Use TOML when you want strict, unambiguous typing and your config is relatively flat. Use Python configs when you need computed values, complex validation, or type safety throughout. For most AI agent projects, TOML combined with Pydantic validation offers the best balance of readability and safety.

FAQ

How do I handle multi-line system prompts in TOML?

TOML supports multi-line strings with triple quotes. Use single-quoted triple quotes (''') for literal strings where backslashes are not interpreted as escapes. This is ideal for system prompts that may contain special characters.

Should I validate config files in CI?

Absolutely. Add a CI step that loads every config file through your validation layer. This catches typos, invalid values, and missing required fields before they reach any environment. For a handful of small files the step typically adds negligible time to the pipeline, and it prevents entire classes of deployment failures.

When should I avoid configuration-as-code?

When configurations change frequently (multiple times per day) and are managed by non-technical users. In that case, a database-backed config with an admin UI is more appropriate. Configuration-as-code works best for settings that change with releases and are managed by the engineering team.
