Building Production AI Agents with Claude Code CLI: From Setup to Deployment

Claude Code: The Agent That Builds Agents

Claude Code is Anthropic's agentic coding tool — a CLI application that operates directly in your terminal, understands your codebase, and can read files, write code, execute commands, search the web, and manage complex multi-step tasks autonomously. Unlike chat-based AI assistants, Claude Code operates as a genuine agent: it plans, executes, evaluates, and iterates.

But Claude Code is not just a tool for writing code faster. It is a platform for building AI agent systems. Through its extensibility mechanisms — hooks, MCP servers, custom commands, and the Claude Code SDK — you can use Claude Code as the foundation for production agent architectures that go far beyond interactive coding assistance.

This guide covers the practical patterns for using Claude Code to build, test, and deploy production AI agents.

Setup and Configuration

Getting started with Claude Code requires Node.js (the CLI is distributed through npm), an Anthropic API key, and a terminal. The CLI runs in any Unix-like environment.

# Install Claude Code
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version

# Start an interactive session
claude

# Or run a single command
claude -p "Explain the architecture of this project"

For production use, configure Claude Code through the settings file and project-level configuration.

# Project-level configuration: .claude/settings.json
# Create the directory first; the redirect fails if .claude/ does not exist
mkdir -p .claude
cat > .claude/settings.json << 'SETTINGS'
{
  "model": "claude-opus-4-6-20260301",
  "maxTurns": 30,
  "systemPrompt": "You are a senior engineer working on this project. Follow existing patterns and conventions. Write production-quality code with error handling and tests.",
  "allowedTools": [
    "Read",
    "Write",
    "Edit",
    "Bash",
    "Grep",
    "Glob"
  ],
  "permissions": {
    "allow": ["Read", "Grep", "Glob"],
    "deny": []
  }
}
SETTINGS

The permissions system controls which tools Claude Code can use without asking for confirmation. For automated (non-interactive) agent pipelines, you will typically allow all tools and rely on hooks for safety guardrails.
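
To make the trade-off concrete, here is a minimal sketch of how a pipeline might assemble a non-interactive invocation with an explicit tool allowlist instead of allowing everything. The helper name `build_headless_argv` is hypothetical, and the comma-separated form of `--allowedTools` is an assumption worth checking against `claude --help` for your installed version.

```python
from typing import Sequence


def build_headless_argv(
    prompt: str,
    allowed_tools: Sequence[str],
    max_turns: int = 10,
) -> list[str]:
    """Build an argv list for a non-interactive Claude Code run.

    Hypothetical helper: it only assembles arguments and does not run anything.
    """
    if not allowed_tools:
        raise ValueError("refusing to build a command with no allowed tools")
    return [
        "claude",
        "-p", prompt,
        "--output-format", "json",
        "--max-turns", str(max_turns),
        # Assumption: tools are passed as one comma-separated value
        "--allowedTools", ",".join(allowed_tools),
    ]


if __name__ == "__main__":
    print(build_headless_argv("Summarize open TODOs", ["Read", "Grep", "Glob"]))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the prompt contains spaces or quotes.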

Hooks: Intercepting Agent Actions

Hooks are the most powerful extensibility mechanism in Claude Code. They let you run custom code before or after specific agent actions — tool calls, model responses, notifications, and session lifecycle events. Hooks are defined in your project's settings and execute as subprocesses.

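# .claude/settings.json with hooks
# Note: cat > overwrites the file. In a real project, merge the "hooks" key
# into your existing settings instead.
mkdir -p .claude
cat > .claude/settings.json << 'HOOKS'
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/validate-bash-command.py" }
        ]
      },
      {
        "matcher": "Write",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/validate-file-write.sh" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/log-command-execution.sh" }
        ]
      }
    ],
    "Notification": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/hooks/send-slack-notification.sh" }
        ]
      }
    ]
  }
}
HOOKS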

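Each hook receives a JSON payload on stdin describing the action and can print a JSON response on stdout to approve or block it. Exiting with status 2 and writing the reason to stderr also blocks the call. Hook scripts must be executable (chmod +x).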

#!/usr/bin/env python3
# .claude/hooks/validate-bash-command.py
# PreToolUse hook that blocks dangerous commands

import json
import sys

def main():
    payload = json.loads(sys.stdin.read())
    tool_name = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})

    if tool_name != "Bash":
        # Not a bash command — allow
        print(json.dumps({"decision": "approve"}))
        return

    command = tool_input.get("command", "")

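    # Block dangerous patterns (case-insensitive substring match)
    blocked_patterns = [
        "rm -rf /",
        "rm -rf ~",
        "DROP DATABASE",
        "DROP TABLE",
        "> /dev/sda",
        "mkfs",
        "dd if=",
        ":(){ :|:& };:",
        "chmod -R 777 /",
    ]

    lowered = command.lower()
    for pattern in blocked_patterns:
        if pattern.lower() in lowered:
            # "block" denies the tool call; the reason is passed back to Claude
            print(json.dumps({
                "decision": "block",
                "reason": f"Blocked dangerous command pattern: {pattern}",
            }))
            return

    # Block piping a download straight into a shell. A literal "curl | bash"
    # substring never matches a real command, which has a URL before the pipe.
    stages = [stage.split() for stage in lowered.split("|")]
    for upstream, downstream in zip(stages, stages[1:]):
        if (upstream and downstream
                and upstream[0] in ("curl", "wget")
                and downstream[0] in ("bash", "sh", "zsh")):
            print(json.dumps({
                "decision": "block",
                "reason": "Blocked piping a downloaded script into a shell",
            }))
            return

    # Block commands that modify production
    if "kubectl" in command and any(
        kw in command for kw in ["delete", "apply", "scale"]
    ):
        if "--namespace=production" in command or "-n production" in command:
            print(json.dumps({
                "decision": "block",
                "reason": "Production namespace modifications require "
                          "manual approval. Run this command yourself.",
            }))
            return

    print(json.dumps({"decision": "approve"}))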

if __name__ == "__main__":
    main()
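
The PostToolUse entry in the configuration above points at a logging script that the article does not show. A minimal sketch of such a hook, written in Python for consistency with the validator, might look like the following. The log path, the file name `log-command-execution.py`, and the `session_id` payload field are assumptions for illustration; check the payload your version actually sends before relying on specific fields.

```python
#!/usr/bin/env python3
# .claude/hooks/log-command-execution.py (hypothetical file name)
# PostToolUse hook sketch: append one JSON line per executed Bash command

import json
import sys
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical log location; adjust for your project
LOG_PATH = Path(".claude/logs/bash-commands.jsonl")


def format_log_entry(payload: dict) -> str:
    """Turn a hook payload into one JSON line for an append-only log."""
    tool_input = payload.get("tool_input", {})
    entry = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "session_id": payload.get("session_id"),  # assumed field name
        "tool_name": payload.get("tool_name"),
        "command": tool_input.get("command"),
    }
    return json.dumps(entry)


def main() -> None:
    raw = sys.stdin.read()
    if not raw.strip():
        return  # nothing to log
    payload = json.loads(raw)
    LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
    with LOG_PATH.open("a", encoding="utf-8") as log_file:
        log_file.write(format_log_entry(payload) + "\n")


if __name__ == "__main__":
    main()
```

Keeping the formatting logic in its own function makes the hook easy to unit test without piping data through stdin.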

Hooks enable you to build safety guardrails that are enforced at the tool level, not just the prompt level. A prompt-level instruction ("don't delete production databases") can be overridden by sufficiently persuasive user input. A hook-level guardrail cannot — it operates outside the model's control.
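
A hook is only as strong as the check it runs, and plain substring matching is easy to sidestep with extra whitespace or reordered flags (`rm  -fr /` does not contain the string `rm -rf /`). The sketch below tokenizes the command with `shlex` before inspecting it. The function names are hypothetical, and this is still a partial check under simple assumptions: it does not see through variables, aliases, or commands hidden inside scripts.

```python
import shlex

# Characters that separate simple commands in a shell line
SEPARATORS = set(";|&()")
DANGEROUS_TARGETS = {"/", "/*", "~", "~/"}


def split_simple_commands(command: str) -> list[list[str]]:
    """Split a shell line into token lists, one per simple command."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    segments: list[list[str]] = [[]]
    for token in lexer:
        if token and all(ch in SEPARATORS for ch in token):
            segments.append([])  # operator such as && or ; starts a new command
        else:
            segments[-1].append(token)
    return [segment for segment in segments if segment]


def is_destructive_rm(command: str) -> bool:
    """Return True if any simple command is a recursive, forced rm of / or ~."""
    try:
        segments = split_simple_commands(command)
    except ValueError:
        return True  # unbalanced quotes: treat unparseable input as suspicious
    for tokens in segments:
        if tokens[0] == "sudo":
            tokens = tokens[1:]
        if not tokens or tokens[0] != "rm":
            continue
        args = tokens[1:]
        short = "".join(
            a[1:] for a in args if a.startswith("-") and not a.startswith("--")
        )
        recursive = "r" in short or "R" in short or "--recursive" in args
        force = "f" in short or "--force" in args
        targets = [a for a in args if not a.startswith("-")]
        if recursive and force and any(t in DANGEROUS_TARGETS for t in targets):
            return True
    return False
```

A check like this could replace the two `rm` entries in the blocklist while the remaining substring patterns stay as a coarse first filter.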

MCP Servers: Extending Claude Code's Capabilities

Claude Code natively supports MCP servers, which means you can give it access to any external system through the MCP protocol. This is how you connect Claude Code to your databases, APIs, monitoring systems, and internal tools.

# Project-scoped MCP servers: .mcp.json at the repository root
cat > .mcp.json << 'MCP_CONFIG'
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "your-token-here"
      }
    },
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://user:pass@localhost/mydb"
      ]
    },
    "internal-tools": {
      "command": "node",
      "args": [".claude/mcp-servers/internal-tools.js"]
    }
  }
}
MCP_CONFIG

With MCP servers configured, Claude Code can discover and use the tools they expose. For example, with the GitHub MCP server, Claude Code can search repositories, read files, create pull requests, and review code — all through the standardized MCP interface.

Building a custom MCP server for your internal tools is straightforward.

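// .claude/mcp-servers/internal-tools.ts
// Compile to internal-tools.js (for example with tsc); the MCP configuration runs the compiled file.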
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "internal-tools",
  version: "1.0.0",
});

// Deploy to staging environment
server.tool(
  "deploy_staging",
  "Deploy the current branch to the staging environment",
  {
    service: z.string().describe("Service name to deploy"),
    tag: z.string().describe("Docker image tag to deploy"),
  },
  async ({ service, tag }) => {
    // Call internal deployment API
    const response = await fetch(
      "https://deploy.internal.company.com/api/deploy",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${process.env.DEPLOY_TOKEN}`,
        },
        body: JSON.stringify({
          service,
          tag,
          environment: "staging",  // Hardcoded — never allow prod
        }),
      }
    );

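    const result = await response.json();

    return {
      // Mark non-2xx responses as errors so the agent does not read them as a successful deploy
      isError: !response.ok,
      content: [{
        type: "text" as const,
        text: JSON.stringify(result, null, 2),
      }],
    };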
  }
);

// Query application logs
server.tool(
  "search_logs",
  "Search application logs in Elasticsearch",
  {
    query: z.string().describe("Log search query"),
    service: z.string().describe("Service name"),
    time_range: z.string().default("1h").describe("Time range (1h, 24h, 7d)"),
    level: z.enum(["error", "warn", "info", "debug"]).optional(),
    limit: z.number().max(100).default(20),
  },
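  async ({ query, service, time_range, level, limit }) => {
    // buildElasticsearchQuery is a project-specific helper (not shown here)
    // that turns these arguments into an Elasticsearch query body
    const esQuery = buildElasticsearchQuery(
      query, service, time_range, level, limit
    );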

    const response = await fetch(
      `${process.env.ES_URL}/logs-*/_search`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(esQuery),
      }
    );

    const result = await response.json();
    const logs = result.hits.hits.map((hit: any) => ({
      timestamp: hit._source["@timestamp"],
      level: hit._source.level,
      message: hit._source.message,
      service: hit._source.service,
    }));

    return {
      content: [{
        type: "text" as const,
        text: JSON.stringify(logs, null, 2),
      }],
    };
  }
);

async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

main().catch(console.error);

The Claude Code SDK: Programmatic Agent Control

The Claude Code SDK allows you to use Claude Code programmatically from your own applications. This is the foundation for building custom agent systems that leverage Claude Code's capabilities (file editing, code execution, codebase understanding) without requiring interactive terminal sessions.

// Using the Claude Code SDK for automated code review
// The SDK's entry point is the async-iterable query() function. Option names
// below follow the TypeScript SDK documentation; verify them against the
// version you have installed.
import { query } from "@anthropic-ai/claude-code";

interface ReviewIssue {
  file: string;
  line: number;
  severity: string;
  message: string;
}

interface Review {
  summary: string;
  issues: ReviewIssue[];
  approved: boolean;
}

const REVIEW_INSTRUCTIONS = `You are a senior code reviewer. Analyze the provided
diff and identify:
1. Security vulnerabilities
2. Performance issues
3. Logic errors
4. Missing error handling
5. Style inconsistencies with the existing codebase

Be specific about file names and line numbers. Only flag real
issues; do not nitpick style preferences.`;

async function automatedCodeReview(prDiff: string): Promise<Review> {
  let output = "";

  for await (const message of query({
    prompt: `${REVIEW_INSTRUCTIONS}

Review this pull request diff:

${prDiff}

After reviewing, output only JSON with this structure:
{
  "summary": "one paragraph summary",
  "issues": [{"file": "...", "line": N, "severity": "critical|high|medium|low", "message": "..."}],
  "approved": true/false
}`,
    options: {
      model: "claude-sonnet-4-6-20260301",
      maxTurns: 10,
      allowedTools: ["Read", "Grep", "Glob"],  // Allow reading existing code
    },
  })) {
    // The last message of a run has type "result"; on success it carries the text output
    if (message.type === "result" && message.subtype === "success") {
      output = message.result;
    }
  }

  // JSON.parse throws if the run failed or the model wrapped the JSON in prose
  return JSON.parse(output);
}

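// Integrate into CI/CD pipeline
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

async function runInCI() {
  // Raise maxBuffer because a PR diff can exceed the 1 MiB default
  const { stdout: diff } = await execFileAsync(
    "git", ["diff", "origin/main...HEAD"], { maxBuffer: 16 * 1024 * 1024 }
  );
  const review = await automatedCodeReview(diff);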

  console.log(`Review summary: ${review.summary}`);
  console.log(`Issues found: ${review.issues.length}`);

  if (review.issues.some((i) => i.severity === "critical")) {
    console.error("Critical issues found — blocking merge");
    process.exit(1);
  }

  if (review.approved) {
    console.log("Code review passed");
  } else {
    console.warn("Code review flagged issues — human review recommended");
  }
}
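
Parsing the model's reply directly as JSON is fragile, because replies sometimes arrive wrapped in a code fence or a sentence of explanation. The following is a minimal sketch of a more tolerant extraction step, shown in Python (the same idea carries over to the TypeScript code above). The function name `extract_json_object` is hypothetical.

```python
import json
from typing import Any


def extract_json_object(text: str) -> Any:
    """Return the first parseable JSON object embedded in `text`.

    Scans for each "{" and tries to decode a complete JSON value from there,
    so surrounding prose or code fences are ignored.
    Raises ValueError if no object can be decoded.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```

A caller would still validate the decoded object (required keys, allowed severity values) before acting on it, and treat a `ValueError` as a failed review rather than a pass.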

Parallel Agents: Scaling with Multiple Claude Code Instances

For tasks that can be parallelized (reviewing multiple files, generating tests for multiple modules, analyzing different subsystems), you can run multiple Claude Code instances in parallel. The example below launches the CLI in print mode as a separate subprocess for each task.

# Parallel agent execution with Claude Code CLI subprocesses
import asyncio
import json

async def run_claude_code_task(task: dict) -> dict:
    """Run a single Claude Code task as a subprocess."""
    proc = await asyncio.create_subprocess_exec(
        "claude", "-p", task["prompt"],
        "--output-format", "json",
        "--max-turns", str(task.get("max_turns", 10)),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        cwd=task.get("cwd", "."),
    )
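    stdout, stderr = await proc.communicate()

    # Use the exit code to detect failure; stderr alone may only hold warnings
    output = None
    error = None
    if proc.returncode != 0:
        error = stderr.decode().strip() or f"exit code {proc.returncode}"
    else:
        try:
            output = json.loads(stdout)
        except json.JSONDecodeError as exc:
            error = f"could not parse output as JSON: {exc}"

    return {
        "task_id": task["id"],
        "output": output,
        "error": error,
    }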

async def parallel_test_generation(modules: list[str]):
    """Generate tests for multiple modules in parallel."""
    tasks = [
        {
            "id": f"test-{module}",
            "prompt": (
                f"Read the module at {module} and generate a comprehensive "
                f"test suite. Write the tests to {module.replace('.py', '_test.py')}. "
                f"Include edge cases and error scenarios."
            ),
            "max_turns": 15,
        }
        for module in modules
    ]

    # Run up to 5 agents in parallel
    semaphore = asyncio.Semaphore(5)

    async def bounded_task(task):
        async with semaphore:
            return await run_claude_code_task(task)

    results = await asyncio.gather(
        *[bounded_task(t) for t in tasks]
    )

    successful = sum(1 for r in results if r["error"] is None)
    print(f"Generated tests for {successful}/{len(modules)} modules")
    return results

# Usage
modules = [
    "src/auth/middleware.py",
    "src/billing/processor.py",
    "src/notifications/email.py",
    "src/api/routes.py",
    "src/database/queries.py",
]
asyncio.run(parallel_test_generation(modules))
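
One gap in the pattern above is that a single hung agent holds its semaphore slot indefinitely. The sketch below shows the same bounded-concurrency idea with a per-job timeout and per-job error capture. It is deliberately generic: `run_bounded` is a hypothetical helper that accepts any zero-argument coroutine function, so it can be exercised without calling the CLI.

```python
import asyncio
from typing import Awaitable, Callable

Job = Callable[[], Awaitable[str]]


async def run_bounded(
    jobs: dict[str, Job],
    limit: int = 5,
    timeout_s: float = 600.0,
) -> dict[str, dict]:
    """Run jobs with at most `limit` in flight and a timeout on each one."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(job_id: str, job: Job) -> tuple[str, dict]:
        async with semaphore:
            try:
                output = await asyncio.wait_for(job(), timeout=timeout_s)
                return job_id, {"ok": True, "output": output}
            except asyncio.TimeoutError:
                return job_id, {"ok": False, "error": f"timed out after {timeout_s}s"}
            except Exception as exc:
                # Record the failure instead of letting it cancel sibling jobs
                return job_id, {"ok": False, "error": repr(exc)}

    pairs = await asyncio.gather(
        *(run_one(job_id, job) for job_id, job in jobs.items())
    )
    return dict(pairs)
```

In the test-generation example, each job would be a small wrapper such as `lambda: run_claude_code_task(task)`, with the return type adjusted to match that function's result dictionary.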

Production Deployment Patterns

For deploying Claude Code-powered agents in production, several patterns have proven effective.

CI/CD Integration

The most common production use is integrating Claude Code into CI/CD pipelines for automated code review, test generation, documentation updates, and migration assistance.

#!/bin/bash
# .github/workflows/ai-review.yml equivalent in bash
# Run Claude Code as part of the CI pipeline

set -euo pipefail

# Get the PR diff
DIFF=$(git diff origin/main...HEAD)

# Run automated review
REVIEW=$(claude -p "Review this diff for security and correctness issues. Output JSON with {issues: [{file, line, severity, message}], pass: boolean}:

$DIFF" --output-format json --max-turns 5)

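# Parse results. With --output-format json the CLI wraps the model's reply in
# an envelope; the reply text is in the .result field. If the reply is not
# valid JSON, jq fails and set -e stops the script (fail closed).
FINDINGS=$(echo "$REVIEW" | jq -r '.result')
PASS=$(echo "$FINDINGS" | jq -r '.pass')
ISSUE_COUNT=$(echo "$FINDINGS" | jq '.issues | length')

echo "Issues found: $ISSUE_COUNT"

if [ "$PASS" = "false" ]; then
    echo "AI review failed. Issues:"
    echo "$FINDINGS" | jq -r '.issues[] | "- [\(.severity)] \(.file):\(.line): \(.message)"'
    exit 1
fi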

echo "AI review passed"

Scheduled Tasks

Claude Code can run scheduled tasks: daily codebase health checks, weekly dependency audits, automated changelog generation.

#!/bin/bash
# /opt/agents/daily-security-scan.sh
# Cron entry for a daily security scan at 06:00:
# 0 6 * * * /opt/agents/daily-security-scan.sh
set -euo pipefail

cd /opt/app

REPORT=$(claude -p "Scan this codebase for security vulnerabilities. Check for:
1. Hardcoded secrets or API keys
2. SQL injection vulnerabilities
3. XSS vulnerabilities in templates
4. Insecure dependency versions
5. Missing authentication checks on API routes

Output a JSON report with {findings: [{severity, file, description}], critical_count: N}" \
  --output-format json --max-turns 15)

# The CLI's JSON envelope carries the model's reply in the .result field
FINDINGS=$(echo "$REPORT" | jq -r '.result')
CRITICAL=$(echo "$FINDINGS" | jq '.critical_count')

if [ "$CRITICAL" -gt 0 ]; then
    # Send alert. jq builds the payload so quotes inside the report are escaped.
    jq -n --arg text "Security scan found $CRITICAL critical issues. Review: $FINDINGS" '{text: $text}' |
      curl -X POST "$SLACK_WEBHOOK" \
        -H "Content-Type: application/json" \
        -d @-
fi

# Archive report
echo "$FINDINGS" > "/opt/reports/security-$(date +%Y-%m-%d).json"

CLAUDE.md: Project Knowledge

The CLAUDE.md file at the root of your project is Claude Code's project knowledge base. It is automatically loaded into context at the start of every session. Use it to document project conventions, architectural decisions, and operational guidelines that every agent session should know.

# Example CLAUDE.md for a production project
cat > CLAUDE.md << 'CLAUDEMD'
# Project: Order Management Service

## Architecture
- FastAPI backend with SQLAlchemy ORM
- PostgreSQL database with Alembic migrations
- Redis for caching and session storage
- Deployed on Kubernetes (k3s) with hostPath volumes

## Conventions
- Use snake_case for Python, camelCase for TypeScript
- All API endpoints require authentication via JWT
- Database queries use SQLAlchemy ORM (no raw SQL)
- Tests use pytest with async fixtures

## Critical Rules
- NEVER modify migration files that have been applied to production
- NEVER expose internal error details in API responses
- ALWAYS use parameterized queries (no string formatting in SQL)
- ALWAYS add database indexes for new foreign key columns

## Deployment
- Code changes auto-reload (uvicorn --reload + hostPath volumes)
- Only restart pods for: new dependencies, env var changes, build config
- Run `alembic upgrade head` after adding migrations
CLAUDEMD

FAQ

Can Claude Code run in headless mode for production pipelines?

Yes. The -p flag runs Claude Code in non-interactive (print) mode, which is suitable for CI/CD pipelines and automated tasks. Combined with --output-format json, it produces structured output that downstream automation can parse. For long-running tasks, use --max-turns to cap agent iterations, and enforce a wall-clock limit from the calling side, for example with the coreutils timeout command or your CI system's job timeout.
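
As an illustration, a calling script can bound wall-clock time itself, independent of any CLI option. The sketch below wraps a print-mode run in `subprocess.run` with a timeout. The function name `run_headless` and the `executable` parameter are hypothetical; the parameter exists only so the wrapper can be tested against a stand-in command.

```python
import json
import subprocess
from typing import Sequence


def run_headless(
    prompt: str,
    max_turns: int = 10,
    timeout_s: float = 300.0,
    executable: Sequence[str] = ("claude",),
) -> dict:
    """Run one print-mode session with a hard wall-clock limit."""
    argv = [
        *executable,
        "-p", prompt,
        "--output-format", "json",
        "--max-turns", str(max_turns),
    ]
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising
        return {"ok": False, "error": f"timed out after {timeout_s}s"}
    if completed.returncode != 0:
        message = completed.stderr.strip() or f"exit code {completed.returncode}"
        return {"ok": False, "error": message}
    try:
        return {"ok": True, "output": json.loads(completed.stdout)}
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"unparseable output: {exc}"}
```

Returning a uniform `{"ok": ...}` dictionary lets the caller handle timeouts, non-zero exits, and malformed output in one place.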

How do I manage costs when running multiple Claude Code agents?

Track costs through the Anthropic API dashboard and set budget limits. Each Claude Code session is a series of API calls, so monitor token usage per session. Use Sonnet 4.6 for routine tasks (test generation, code formatting, documentation) and reserve Opus 4.6 for complex tasks (architecture decisions, security reviews). Because hooks intercept tool calls rather than model selection, choose the model when you launch each run, for example with the --model flag in the script that dispatches the task.
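
For per-session tracking, a pipeline can total the cost reported by each run. The sketch below assumes each parsed JSON result carries a numeric `total_cost_usd` field; that field name is an assumption to confirm against the output of your CLI version, and `summarize_costs` is a hypothetical helper.

```python
def summarize_costs(results: list[dict], budget_usd: float) -> dict:
    """Sum reported session costs and compare the total against a budget."""
    total = 0.0
    missing = 0
    for result in results:
        cost = result.get("total_cost_usd")  # assumed field name
        if isinstance(cost, (int, float)):
            total += float(cost)
        else:
            missing += 1  # session reported no usable cost figure
    return {
        "sessions": len(results),
        "total_usd": round(total, 4),
        "missing_cost": missing,
        "over_budget": total > budget_usd,
    }
```

Counting sessions with no cost figure separately keeps a missing field from silently reading as zero spend.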

Is Claude Code suitable for production agent systems, or is it just a developer tool?

Claude Code started as a developer tool but the SDK and hooks system make it suitable for production agent pipelines. The key consideration is that Claude Code runs as a subprocess — for high-throughput production systems (thousands of concurrent agents), you may want the Anthropic API directly with your own orchestration layer. Claude Code is ideal for medium-throughput use cases: CI/CD pipelines, scheduled tasks, internal tools, and developer-facing agents.

How do hooks compare to model-level guardrails?

Hooks operate at the infrastructure level: they intercept tool calls before execution, and the model cannot talk its way past them. They are only as strong as the checks they implement, so simple pattern matching should be treated as one layer rather than a complete defense. Model-level guardrails (system prompt instructions) operate at the prompt level and can theoretically be overridden through prompt injection. For security-critical constraints (never delete production data, never deploy without tests), use hooks. For quality guidelines (follow code conventions, write comprehensive docstrings), system prompt instructions are sufficient.
