What Is Extended Thinking?

Extended thinking is a mode where Claude allocates additional computation to reasoning before it starts producing output or taking actions. In standard mode, Claude begins generating a response immediately. In extended thinking mode, Claude first produces an internal chain of thought — analyzing the problem, considering alternatives, planning its approach — before committing to a course of action.

In Claude Code, extended thinking is particularly valuable because the stakes of each action are higher. A poorly reasoned Edit or Bash command can break your codebase. Extended thinking reduces the chance of false starts and wrong turns.

How Extended Thinking Works in Claude Code

When extended thinking is active, Claude Code's behavior changes:

Before the first tool call, Claude produces a thinking block (visible in verbose mode) in which it analyzes the request, considers the codebase structure, and plans its approach.
Between tool calls, Claude may think through the implications of what it has observed before deciding on the next step.
The thinking is visible, so you can follow Claude's reasoning and verify its approach.

Enabling Extended Thinking

Whether Claude Code uses extended thinking depends on the model you select and on how complex the prompt is. With Opus models, Claude Code applies extended thinking automatically to complex tasks. You can also request it explicitly in the prompt:

Think carefully about this before making changes: [your complex request]

Or in headless mode:

claude -p "Think step by step about how to refactor the payment module to support multiple payment providers" --model opus
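For scripted use, the headless invocation above can be assembled programmatically. The following is a minimal sketch: the helper name `build_headless_command` is hypothetical, and it uses only the `-p` and `--model` flags already shown in the command above. It builds the argument list without running anything.

```python
import shlex


def build_headless_command(task: str, model: str = "opus") -> list[str]:
    """Return the argv for a headless run that asks for step-by-step reasoning.

    Hypothetical helper: it only mirrors the flags used in the command above.
    """
    prompt = f"Think step by step about {task}"
    # Passing a list (rather than one shell string) avoids quoting problems
    # when the task text contains quotes or other shell metacharacters.
    return ["claude", "-p", prompt, "--model", model]


argv = build_headless_command(
    "how to refactor the payment module to support multiple payment providers"
)
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv)` if you want to execute it from a script.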

When Extended Thinking Shines

1. Architecture Decisions

Standard mode might jump straight to implementing. Extended thinking evaluates tradeoffs first.

Think carefully about the best approach: We need to add real-time notifications
to our app. Options include WebSockets, Server-Sent Events, and polling.
Our stack is Next.js frontend, FastAPI backend, deployed on Kubernetes.
Consider scalability, complexity, and our existing infrastructure.

With extended thinking, Claude Code reasons through:

WebSocket implications for Kubernetes (sticky sessions, horizontal scaling)
SSE simplicity but unidirectional limitation
Polling's simplicity but resource waste
How each option integrates with FastAPI and Next.js
Infrastructure changes required for each approach

This produces a recommendation with clear reasoning, not just an implementation of the first approach that comes to mind.

2. Complex Debugging

When a bug involves multiple interacting systems, extended thinking helps Claude Code trace the full causality chain:

Think carefully about this bug: Users report that after changing their email,
they cannot log in for about 5 minutes. After 5 minutes, login works again.
Our auth system uses JWT tokens with email in the payload, and we cache
user sessions in Redis with a 5-minute TTL.

Extended thinking traces:

Email change updates the database immediately
JWT tokens in flight still contain the old email
The Redis session cache stores the old email
Login verification checks the JWT email against the database
The 5-minute window matches the Redis TTL

This leads to the correct diagnosis: the session cache must be invalidated as soon as the email changes, rather than being left to expire on its own.
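The fix can be illustrated with a simplified model. This sketch assumes an in-memory class standing in for Redis and a plain dictionary standing in for the database; `SessionCache`, `change_email`, and `login_email` are hypothetical names, not part of any real auth system.

```python
import time


class SessionCache:
    """In-memory stand-in for a Redis cache with a fixed TTL per entry."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def delete(self, key):
        self._entries.pop(key, None)


def login_email(db: dict, cache: SessionCache, user_id: str) -> str:
    """Return the email used for login checks, reading through the cache."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    email = db[user_id]
    cache.set(user_id, email)
    return email


def change_email(db: dict, cache: SessionCache, user_id: str, new_email: str) -> None:
    """Update the database and invalidate the cached session immediately."""
    db[user_id] = new_email
    cache.delete(user_id)  # without this line, the old email lingers until TTL expiry
```

Removing the `cache.delete` call reproduces the reported symptom in this model: the stale email is served for up to five minutes after the change.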

3. Multi-File Refactoring Planning

Before touching any files, extended thinking plans the entire refactoring:

Think carefully about the refactoring plan: Convert our Express.js API from
callbacks to async/await. The codebase has 45 route files, 12 middleware
files, and 8 service files. Plan the migration order and identify dependencies.

Extended thinking produces:

Dependency graph of modules
Correct migration order (bottom-up: services first, then middleware, then routes)
Risk assessment for each category
Testing strategy at each phase
Rollback plan if issues arise
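The bottom-up ordering in the plan above is a topological sort of the dependency graph. Here is a minimal sketch using Python's standard `graphlib` module; the module names are hypothetical and stand in for the real route, middleware, and service files.

```python
from graphlib import TopologicalSorter

# Hypothetical modules: each key maps to the set of modules it depends on.
dependencies = {
    "routes/orders": {"middleware/auth", "services/orders"},
    "routes/users": {"middleware/auth", "services/users"},
    "middleware/auth": {"services/users"},
    "services/orders": {"services/users"},
    "services/users": set(),
}


def migration_order(graph: dict) -> list:
    """Return modules so that every module appears after all of its dependencies."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is itself a useful finding before starting a migration.
    return list(TopologicalSorter(graph).static_order())


order = migration_order(dependencies)
for position, module in enumerate(order, start=1):
    print(position, module)
```

Migrating in this order means each file is converted only after everything it calls already uses async/await.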

4. Security Analysis

Security requires thinking about all possible attack vectors:

Think carefully about the security implications: Review our authentication
flow for vulnerabilities. The flow is: login form -> POST /auth/login ->
JWT issued -> stored in httpOnly cookie -> sent with every request ->
validated by middleware -> refresh via POST /auth/refresh.

Extended thinking methodically checks:

Token storage security (httpOnly cookie: good)
CSRF protection (cookie-based auth needs CSRF tokens or an equivalent defense such as SameSite cookies)
Token expiration and refresh token rotation
Logout invalidation (are tokens blacklisted?)
Brute force protection on login endpoint
Token payload contents (sensitive data exposure?)
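Several items on this checklist come down to how the session cookie is set. The sketch below uses Python's standard `http.cookies` module to build a cookie value with the relevant attributes. The function name, the cookie name `session`, and the 15-minute lifetime are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str, max_age_seconds: int = 900) -> str:
    """Return a Set-Cookie header value with hardened attributes.

    Hypothetical helper: the cookie name and lifetime are illustrative.
    """
    cookie = SimpleCookie()
    cookie["session"] = token
    morsel = cookie["session"]
    morsel["httponly"] = True       # not readable from JavaScript
    morsel["secure"] = True         # only sent over HTTPS
    morsel["samesite"] = "Strict"   # not sent on cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds  # short lifetime limits exposure
    return morsel.OutputString()


header_value = build_session_cookie("abc123")
print("Set-Cookie:", header_value)
```

Cookie attributes cover only part of the list. Logout invalidation, refresh token rotation, and brute force protection still need server-side logic.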

Extended Thinking vs. Standard Mode: When to Use Each

Scenario	Recommended Mode	Why
Simple bug fix	Standard	The fix is usually obvious once the bug is found
Adding a CRUD endpoint	Standard	Well-defined, pattern-following task
Architecture decision	Extended	Needs tradeoff analysis
Complex debugging	Extended	Needs causal chain tracing
Security review	Extended	Needs systematic threat analysis
Large refactoring plan	Extended	Needs dependency analysis and ordering
Writing tests	Standard	Tests follow predictable patterns
Code review	Extended	Needs thorough examination of edge cases
Simple file edits	Standard	Minimal reasoning needed
Multi-service changes	Extended	Needs understanding of service interactions

Reading the Thinking Output

When verbose mode is enabled (claude --verbose), you can see the thinking blocks. This is valuable for:

Verifying the approach — Is Claude Code reasoning about the right things?
Catching wrong assumptions — If the thinking mentions a wrong assumption about your codebase, you can correct it
Learning — Claude Code's reasoning often reveals insights about your codebase that you might not have considered

Example thinking output:

[Thinking]
The user wants to add caching to the product listing endpoint. Let me consider:

1. Current endpoint reads from PostgreSQL on every request
2. Product data changes infrequently (maybe a few times per day)
3. The CLAUDE.md mentions Redis is available at redis://cache:6379

Approach options:
a) Redis cache with TTL — simple, effective for this use case
b) HTTP cache headers — good for CDN but doesn't reduce DB load for authenticated requests
c) In-memory cache — simple but doesn't share across pods in K8s

Given that they run on Kubernetes (mentioned in CLAUDE.md), option (a) is best
because it shares the cache across all pods. I'll use a 5-minute TTL and
invalidate on product updates.

Let me check the existing caching patterns in the codebase first...

Prompting Strategies for Extended Thinking

Be Explicit About Wanting Analysis

Before implementing anything, analyze the current codebase and propose
an approach. Explain the tradeoffs of different solutions.

Ask for a Plan First

Create a detailed plan for migrating from REST to GraphQL.
Do not make any code changes yet — just produce the plan.

Request Risk Assessment

What could go wrong with this approach? What edge cases might we miss?
What are the failure modes?

Chain Thinking Into Action

Phase 1: Analyze the codebase and create a migration plan (think carefully)
Phase 2: Implement the plan step by step (execute)
Phase 3: Review what you implemented for issues (think carefully again)

Cost Considerations

Extended thinking uses more tokens because the thinking blocks are billed as output tokens. Approximate costs for Claude Opus 4.6:

Standard task (10 tool calls): ~$0.15-0.30
Same task with extended thinking: ~$0.25-0.50

The additional cost is usually worth it for complex tasks where a wrong start wastes more time and tokens than the thinking overhead.
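One way to make this tradeoff concrete is a break-even estimate. The sketch below assumes a deliberately simple model: a wrong start in standard mode adds a fixed rework cost, and extended thinking avoids that wrong start entirely. The $1.00 rework cost is a hypothetical figure; only the per-task costs come from the estimates above.

```python
def break_even_wrong_start_rate(
    standard_cost: float, extended_cost: float, rework_cost: float
) -> float:
    """Return the wrong-start probability above which extended thinking is cheaper.

    Model (an assumption): expected standard cost = standard_cost + p * rework_cost,
    while extended thinking always costs extended_cost and avoids the rework.
    """
    if rework_cost <= 0:
        raise ValueError("rework_cost must be positive")
    return (extended_cost - standard_cost) / rework_cost


# Low and high ends of the cost ranges quoted above, with a hypothetical
# rework cost of $1.00 per wrong start.
low = break_even_wrong_start_rate(0.15, 0.25, rework_cost=1.00)
high = break_even_wrong_start_rate(0.30, 0.50, rework_cost=1.00)
print(f"break-even wrong-start rate: {low:.0%} to {high:.0%}")
```

Under these assumptions, extended thinking pays for itself once wrong starts happen more often than the break-even rate. A higher rework cost lowers that threshold.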

Conclusion

Extended thinking transforms Claude Code from a fast-but-sometimes-impulsive coder into a deliberate, analytical problem solver. Use it for architecture decisions, complex debugging, security reviews, and refactoring plans — tasks where thinking before acting prevents costly mistakes. For routine coding tasks, standard mode remains faster and more cost-effective. The key is matching the thinking depth to the task complexity.

Getting the Most from Claude Code's Extended Thinking Mode