Getting the Most from Claude Code's Extended Thinking Mode

How Claude Code's extended thinking mode works, when to use it, how it improves complex reasoning, and practical tips for architecture, debugging, and refactoring tasks.

What Is Extended Thinking?

Extended thinking is a mode where Claude allocates additional computation to reasoning before it starts producing output or taking actions. In standard mode, Claude begins generating a response immediately. In extended thinking mode, Claude first produces an internal chain of thought — analyzing the problem, considering alternatives, planning its approach — before committing to a course of action.

In Claude Code, extended thinking is particularly valuable because each action has real side effects. A poorly reasoned Edit or Bash command can break your codebase. Extended thinking reduces the chance of false starts and wrong turns.

How Extended Thinking Works in Claude Code

When extended thinking is active, Claude Code's behavior changes:

  1. Before the first tool call, Claude produces a thinking block (visible in verbose mode) where it analyzes the request, considers the codebase structure, and plans its approach
  2. Between tool calls, Claude may think through the implications of what it has observed before deciding the next step
  3. The thinking can be inspected: in verbose mode you can read Claude's reasoning, which helps you understand and verify its approach

Enabling Extended Thinking

Extended thinking is controlled by the model selection and prompt complexity. Claude Code with Opus models uses extended thinking automatically for complex tasks. You can also nudge it through the wording of your prompt:

Think carefully about this before making changes: [your complex request]

Or in headless mode:

claude -p "Think step by step about how to refactor the payment module to support multiple payment providers" --model opus
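
If you launch headless runs from a script, the command above can be assembled programmatically. The following is a minimal sketch, not an official wrapper: the helper name `build_headless_command` and the `THINK_PREFIX` constant are hypothetical, and only the `-p` and `--model` flags shown above are used.

```python
import shlex
from typing import List

# Wording borrowed from the headless example above; the constant name is hypothetical.
THINK_PREFIX = "Think step by step about "


def build_headless_command(task: str, model: str = "opus") -> List[str]:
    """Return the argv list for a headless run that asks for careful reasoning."""
    prompt = THINK_PREFIX + task
    return ["claude", "-p", prompt, "--model", model]


command = build_headless_command(
    "how to refactor the payment module to support multiple payment providers"
)
# shlex.join quotes the prompt so the printed line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.run` would execute it, assuming the `claude` CLI is installed and on your PATH. Building an argv list rather than a single string avoids shell-quoting problems when the prompt contains quotes.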

When Extended Thinking Shines

1. Architecture Decisions

Standard mode may jump straight to implementation. Extended thinking evaluates the tradeoffs first.

Think carefully about the best approach: We need to add real-time notifications
to our app. Options include WebSockets, Server-Sent Events, and polling.
Our stack is Next.js frontend, FastAPI backend, deployed on Kubernetes.
Consider scalability, complexity, and our existing infrastructure.

With extended thinking, Claude Code reasons through:

  • WebSocket implications for Kubernetes (sticky sessions, horizontal scaling)
  • SSE simplicity but unidirectional limitation
  • Polling's simplicity but resource waste
  • How each option integrates with FastAPI and Next.js
  • Infrastructure changes required for each approach

This produces a recommendation with clear reasoning, not just an implementation of the first approach that comes to mind.
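
To make the "SSE simplicity but unidirectional limitation" point concrete, here is a minimal sketch of how a single Server-Sent Events message is encoded. The function name `format_sse` and the sample payload are hypothetical, and this is not tied to any particular FastAPI setup: the server simply writes text frames like this over one long-lived HTTP response, and there is no channel for the client to send messages back.

```python
import json
from typing import Optional


def format_sse(payload: dict, event: Optional[str] = None) -> str:
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data field is enough.
    lines.append(f"data: {json.dumps(payload)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


message = format_sse({"id": 1, "text": "Order shipped"}, event="notification")
print(message, end="")
```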

2. Complex Debugging

When a bug involves multiple interacting systems, extended thinking helps Claude Code trace the full causal chain:

Think carefully about this bug: Users report that after changing their email,
they cannot log in for about 5 minutes. After 5 minutes, login works again.
Our auth system uses JWT tokens with email in the payload, and we cache
user sessions in Redis with a 5-minute TTL.

Extended thinking traces:

  • Email change updates the database immediately
  • JWT tokens in flight still contain the old email
  • The Redis session cache stores the old email
  • Login verification checks the JWT email against the database
  • The 5-minute window matches the Redis TTL

This leads to the correct diagnosis: the session cache needs to be invalidated when the email changes, not just when it expires.
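
The diagnosis can be illustrated with a small sketch. Everything here is an assumption for illustration: plain dictionaries stand in for the database and the Redis session cache, and the function names are hypothetical. The point is only that the write path must drop the cached session instead of waiting for the TTL.

```python
# In-memory stand-ins for the database and the Redis session cache (both hypothetical).
users_db = {42: {"email": "old@example.com"}}
session_cache = {}


def get_session(user_id):
    """Read-through lookup: serve the cached session if present, else load from the DB."""
    if user_id not in session_cache:
        # Store a copy so later database writes do not silently update the cache.
        session_cache[user_id] = dict(users_db[user_id])
    return session_cache[user_id]


def change_email(user_id, new_email, invalidate=True):
    """Update the email and, by default, drop the cached session in the same operation."""
    users_db[user_id]["email"] = new_email
    if invalidate:
        session_cache.pop(user_id, None)


get_session(42)                                          # warms the cache
change_email(42, "new@example.com", invalidate=False)    # reproduces the bug
stale_email = get_session(42)["email"]                   # still the old address
change_email(42, "newer@example.com")                    # the fix: invalidate on write
fresh_email = get_session(42)["email"]
```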

3. Multi-File Refactoring Planning

Before touching any files, extended thinking plans the entire refactoring:

Think carefully about the refactoring plan: Convert our Express.js API from
callbacks to async/await. The codebase has 45 route files, 12 middleware
files, and 8 service files. Plan the migration order and identify dependencies.

Extended thinking produces:

  • Dependency graph of modules
  • Correct migration order (bottom-up: services first, then middleware, then routes)
  • Risk assessment for each category
  • Testing strategy at each phase
  • Rollback plan if issues arise
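
The bottom-up ordering is a topological sort of the dependency graph. A minimal sketch using Python's standard `graphlib` module follows; the file names and the graph itself are hypothetical, standing in for whatever the real codebase contains.

```python
from graphlib import TopologicalSorter

# Hypothetical module graph: each key depends on the modules in its set.
dependencies = {
    "routes/orders.js": {"middleware/auth.js", "services/orders.js"},
    "middleware/auth.js": {"services/users.js"},
    "services/orders.js": set(),
    "services/users.js": set(),
}


def migration_order(deps):
    """Return modules so that every module appears after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = migration_order(dependencies)
for step, module in enumerate(order, start=1):
    print(step, module)
```

Services come out first, then middleware, then routes. The relative order of modules that do not depend on each other is not guaranteed, and `TopologicalSorter` raises `CycleError` if the graph contains a circular dependency, which is itself useful to know before starting a migration.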

4. Security Analysis

Security review means working through attack vectors systematically rather than checking a single path:

Think carefully about the security implications: Review our authentication
flow for vulnerabilities. The flow is: login form -> POST /auth/login ->
JWT issued -> stored in httpOnly cookie -> sent with every request ->
validated by middleware -> refresh via POST /auth/refresh.

Extended thinking methodically checks:

  • Token storage security (httpOnly cookie: good)
  • CSRF protection (cookie-based auth needs CSRF tokens)
  • Token expiration and refresh token rotation
  • Logout invalidation (are tokens blacklisted?)
  • Brute force protection on login endpoint
  • Token payload contents (sensitive data exposure?)

Extended Thinking vs. Standard Mode: When to Use Each

Scenario               | Recommended Mode | Why
-----------------------|------------------|------------------------------------------------
Simple bug fix         | Standard         | The fix is usually obvious once the bug is found
Adding a CRUD endpoint | Standard         | Well-defined, pattern-following task
Architecture decision  | Extended         | Needs tradeoff analysis
Complex debugging      | Extended         | Needs causal chain tracing
Security review        | Extended         | Needs systematic threat analysis
Large refactoring plan | Extended         | Needs dependency analysis and ordering
Writing tests          | Standard         | Tests follow predictable patterns
Code review            | Extended         | Needs thorough examination of edge cases
Simple file edits      | Standard         | Minimal reasoning needed
Multi-service changes  | Extended         | Needs understanding of service interactions

Reading the Thinking Output

When verbose mode is enabled (claude --verbose; note that -v prints the version instead), you can see the thinking blocks. This is valuable for:

  1. Verifying the approach — Is Claude Code reasoning about the right things?
  2. Catching wrong assumptions — If the thinking mentions a wrong assumption about your codebase, you can correct it
  3. Learning — Claude Code's reasoning often reveals insights about your codebase that you might not have considered

Example thinking output:

[Thinking]
The user wants to add caching to the product listing endpoint. Let me consider:

1. Current endpoint reads from PostgreSQL on every request
2. Product data changes infrequently (maybe a few times per day)
3. The CLAUDE.md mentions Redis is available at redis://cache:6379

Approach options:
a) Redis cache with TTL — simple, effective for this use case
b) HTTP cache headers — good for CDN but doesn't reduce DB load for authenticated requests
c) In-memory cache — simple but doesn't share across pods in K8s

Given that they run on Kubernetes (mentioned in CLAUDE.md), option (a) is best
because it shares the cache across all pods. I'll use a 5-minute TTL and
invalidate on product updates.

Let me check the existing caching patterns in the codebase first...
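
Option (a) in the sample reasoning above is the cache-aside pattern with a TTL. A rough sketch of what that could look like follows, with an in-memory class standing in for Redis so it runs without a server. The class name `TTLCache`, the `list_products` helper, and the injectable clock are all hypothetical.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def delete(self, key):
        self._store.pop(key, None)


def list_products(cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    products = cache.get("products")
    if products is None:
        products = load_from_db()
        cache.set("products", products)
    return products


cache = TTLCache(ttl_seconds=300)  # the 5-minute TTL from the reasoning above
print(list_products(cache, lambda: ["widget", "gadget"]))
```

Calling `cache.delete("products")` from the product update path corresponds to the "invalidate on product updates" step, so readers do not have to wait for the TTL to see a change.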

Prompting Strategies for Extended Thinking

Be Explicit About Wanting Analysis

Before implementing anything, analyze the current codebase and propose
an approach. Explain the tradeoffs of different solutions.

Ask for a Plan First

Create a detailed plan for migrating from REST to GraphQL.
Do not make any code changes yet — just produce the plan.

Request Risk Assessment

What could go wrong with this approach? What edge cases might we miss?
What are the failure modes?

Chain Thinking Into Action

Phase 1: Analyze the codebase and create a migration plan (think carefully)
Phase 2: Implement the plan step by step (execute)
Phase 3: Review what you implemented for issues (think carefully again)

Cost Considerations

Extended thinking uses more tokens because the thinking blocks count as output tokens. For Claude Opus 4.6:

  • Standard task (10 tool calls): ~$0.15-0.30
  • Same task with extended thinking: ~$0.25-0.50

The additional cost is usually worth it for complex tasks where a wrong start wastes more time and tokens than the thinking overhead.
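
A back-of-the-envelope way to check this tradeoff is to compare expected cost when a false start forces a full retry. The sketch below uses the midpoints of the ranges quoted above; the false-start rates are hypothetical numbers chosen only to illustrate the comparison, and the retry model (repeat until success, each attempt independent) is a simplifying assumption.

```python
def expected_cost(cost_per_attempt: float, false_start_rate: float) -> float:
    """Expected spend when every false start forces a full retry."""
    if not 0 <= false_start_rate < 1:
        raise ValueError("false_start_rate must be in [0, 1)")
    # With independent attempts, the expected number of tries is 1 / (1 - rate).
    return cost_per_attempt / (1 - false_start_rate)


# Midpoints of the ranges quoted above.
STANDARD_COST = (0.15 + 0.30) / 2
EXTENDED_COST = (0.25 + 0.50) / 2

# Hypothetical false-start rates for a complex task.
standard = expected_cost(STANDARD_COST, false_start_rate=0.5)
extended = expected_cost(EXTENDED_COST, false_start_rate=0.1)

print(f"standard: ${standard:.3f}  extended: ${extended:.3f}")
```

Under these assumed rates the extended run is cheaper in expectation. If both modes had the same false-start rate, as is plausible for routine tasks, standard mode would always come out ahead.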

Conclusion

Extended thinking transforms Claude Code from a fast-but-sometimes-impulsive coder into a deliberate, analytical problem solver. Use it for architecture decisions, complex debugging, security reviews, and refactoring plans — tasks where thinking before acting prevents costly mistakes. For routine coding tasks, standard mode remains faster and more cost-effective. The key is matching the thinking depth to the task complexity.
