
How to Use Claude Extended Thinking for Complex Tasks

Leverage Claude thinking mode for multi-step reasoning and analysis.

Jay Banlasan

The AI Systems Guy

Claude's extended thinking gives you access to the model's internal reasoning process before it produces an answer. When extended thinking is enabled, Claude works through the problem in a scratchpad before writing its response. For tasks that require multi-step logic, trade-off analysis, or reasoning under uncertainty, the difference in output quality is significant. I use it for deal analysis, technical architecture decisions, and any prompt where the stakes of a wrong answer are high.

The tradeoff is cost and latency. Extended thinking uses more tokens because the thinking process itself counts against your usage. Use it selectively: it is the right tool for hard problems and overkill for simple ones.
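Since thinking tokens are billed like output tokens, you can sanity-check the worst-case cost before committing to a large budget. A back-of-envelope sketch; the per-million-token prices here are placeholders, not current Anthropic rates:

```python
def estimate_worst_case_cost(
    input_tokens: int,
    budget_tokens: int,
    answer_tokens: int,
    in_price_per_mtok: float = 3.00,    # placeholder input price ($/M tokens)
    out_price_per_mtok: float = 15.00,  # placeholder output price ($/M tokens)
) -> float:
    """Upper-bound cost in dollars if the full thinking budget is consumed."""
    # Thinking tokens count toward output, so the budget stacks on top of
    # the visible answer.
    billable_output = budget_tokens + answer_tokens
    return (input_tokens * in_price_per_mtok
            + billable_output * out_price_per_mtok) / 1_000_000
```

For a 1,000-token prompt with a 10,000-token thinking budget and a 1,000-token answer, the worst case at these placeholder rates is about $0.17 per call, which compounds quickly in batch pipelines.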

What You Need Before Starting

An Anthropic API key (set as the ANTHROPIC_API_KEY environment variable) and the anthropic Python SDK installed: pip install anthropic.
Step 1: Enable Extended Thinking

Extended thinking is activated via the thinking parameter in the API call.

import anthropic

client = anthropic.Anthropic()

def think_through(
    problem: str,
    budget_tokens: int = 10000,
    model: str = "claude-sonnet-4-5"
) -> dict:
    response = client.messages.create(
        model=model,
        max_tokens=16000,  # Must be larger than budget_tokens
        thinking={
            "type": "enabled",
            "budget_tokens": budget_tokens
        },
        messages=[{"role": "user", "content": problem}]
    )

    result = {
        "thinking": None,
        "response": None
    }

    for block in response.content:
        if block.type == "thinking":
            result["thinking"] = block.thinking
        elif block.type == "text":
            result["response"] = block.text

    return result

# Test on a complex business question
result = think_through("""
A SaaS company has three pricing options they're considering:
A: Freemium with $49/month pro tier
B: $29/month entry with $99/month pro tier
C: Usage-based at $0.10 per API call

Their current MRR is $45k from 200 paying customers. They target technical teams
at companies with 10-200 employees. Average customer uses the product daily.
CAC is $280. Which pricing model should they adopt and why?
""")

print("=== THINKING ===")
print(result["thinking"][:2000])  # First 2000 chars of thinking
print("\n=== RESPONSE ===")
print(result["response"])

The budget_tokens parameter sets how many tokens Claude can use for thinking. Higher budgets allow deeper reasoning but cost more.
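Because max_tokens must be larger than budget_tokens, a small guard that clamps the budget avoids API errors when callers pass mismatched values. A minimal sketch; the 1,024-token floor reflects what I understand to be the API's minimum thinking budget, so treat it as an assumption to verify:

```python
def clamp_budget(budget_tokens: int, max_tokens: int,
                 min_answer_tokens: int = 2000) -> int:
    """Keep the thinking budget under max_tokens, leaving room for the answer."""
    if budget_tokens + min_answer_tokens > max_tokens:
        budget_tokens = max_tokens - min_answer_tokens
    return max(budget_tokens, 1024)  # assumed API minimum for thinking budgets
```

Calling clamp_budget(20000, 16000) returns 14000, leaving 2,000 tokens for the visible answer.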

Step 2: Choose the Right Token Budget

Different task types need different thinking budgets.

THINKING_BUDGETS = {
    "simple_analysis": 2000,     # Trade-off comparisons with clear data
    "strategy": 8000,            # Multi-factor business decisions
    "technical_arch": 12000,     # Architecture decisions with complex dependencies
    "adversarial": 16000,        # Problems where you need to steelman counterarguments
    "research_synthesis": 20000  # Synthesizing conflicting information across sources
}

def think_by_complexity(problem: str, complexity: str = "strategy") -> dict:
    budget = THINKING_BUDGETS.get(complexity, 8000)

    return think_through(
        problem=problem,
        budget_tokens=budget,
        model="claude-sonnet-4-5"
    )

Start with the strategy budget (8,000 tokens) for most business analysis tasks. If you are not getting the depth you need, increase it; if the analysis is consistently overkill, decrease it.
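If you want to route problems to a tier automatically instead of choosing by hand, a keyword heuristic can feed think_by_complexity. The keyword lists below are hypothetical and would need tuning for your own workload:

```python
# Hypothetical routing hints -- map trigger words to the budget tiers above.
COMPLEXITY_HINTS = {
    "adversarial": ("counterargument", "steelman", "objection"),
    "technical_arch": ("architecture", "database", "migration", "latency"),
    "research_synthesis": ("sources", "conflicting", "synthesize"),
}

def guess_complexity(problem: str) -> str:
    """Pick a budget tier from keywords, defaulting to the strategy tier."""
    text = problem.lower()
    for tier, keywords in COMPLEXITY_HINTS.items():
        if any(keyword in text for keyword in keywords):
            return tier
    return "strategy"  # the sensible default per the guidance above
```

You would then call think_by_complexity(problem, guess_complexity(problem)) and only override the tier when the heuristic gets it wrong.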

Step 3: Build a Decision Framework Using Extended Thinking

Extended thinking shines for structured decision analysis.

def analyze_decision(
    decision: str,
    context: str,
    constraints: list,
    success_criteria: list
) -> dict:
    constraints_str = "\n".join([f"- {c}" for c in constraints])
    criteria_str = "\n".join([f"- {c}" for c in success_criteria])

    problem = f"""Analyze this decision thoroughly.

Decision to make: {decision}

Context: {context}

Constraints:
{constraints_str}

Success criteria:
{criteria_str}

Provide:
1. Analysis of the decision space (what options exist)
2. Key trade-offs between options
3. Your recommendation with reasoning
4. What would change your recommendation
5. Biggest risk of the recommended approach"""

    result = think_through(problem, budget_tokens=10000)

    return {
        "decision": decision,
        "thinking_preview": result["thinking"][:500] if result["thinking"] else None,
        "analysis": result["response"]
    }

# Example: ad campaign decision
analysis = analyze_decision(
    decision="Should we scale our best-performing ad set from $200/day to $500/day?",
    context="Ad set has run 7 days. Spend $1,400. 18 leads at $77 CPL. Target CPL $90. Landing page converting at 4.2%.",
    constraints=[
        "Monthly ad budget cap is $8,000",
        "Cannot change creative during scaling",
        "72-hour wait minimum after budget changes"
    ],
    success_criteria=[
        "Maintain CPL under $90",
        "Maintain lead volume growth",
        "Stay within monthly budget"
    ]
)

print(analysis["analysis"])

Step 4: Use Thinking for Code Review and Debugging

Extended thinking is useful for debugging because the model traces through execution paths.

def debug_with_thinking(
    code: str,
    error_description: str,
    expected_behavior: str
) -> dict:
    problem = f"""Debug this code thoroughly.

Code:
```python
{code}
```

Error/problem: {error_description}

Expected behavior: {expected_behavior}

Trace through the execution carefully. Find the root cause, not just symptoms. Provide the corrected code and explain what was wrong."""

    result = think_through(problem, budget_tokens=8000, model="claude-sonnet-4-5")

    return {
        "thinking_trace": result["thinking"],
        "fix_and_explanation": result["response"]
    }

Step 5: Stream Extended Thinking for Long Problems

For very long thinking budgets, streaming shows progress instead of waiting for the full response.

def stream_thinking(problem: str, budget_tokens: int = 10000):
    print("Thinking...", flush=True)
    thinking_text = ""
    response_text = ""
    current_block_type = None

    with client.messages.stream(
        model="claude-sonnet-4-5",
        max_tokens=16000,
        thinking={"type": "enabled", "budget_tokens": budget_tokens},
        messages=[{"role": "user", "content": problem}]
    ) as stream:
        for event in stream:
            if hasattr(event, 'type'):
                if event.type == "content_block_start":
                    block = event.content_block
                    current_block_type = block.type
                    if block.type == "text":
                        print("\n=== RESPONSE ===")

                elif event.type == "content_block_delta":
                    delta = event.delta
                    if hasattr(delta, 'thinking'):
                        thinking_text += delta.thinking
                        # Show thinking progress indicator
                        if len(thinking_text) % 500 == 0:
                            print(".", end="", flush=True)
                    elif hasattr(delta, 'text'):
                        response_text += delta.text
                        print(delta.text, end="", flush=True)

    print()
    return {"thinking": thinking_text, "response": response_text}

Step 6: Decide When to Show Thinking to Users

Thinking content is for debugging and auditing, not usually for end users.

def get_thinking_summary(thinking: str, max_words: int = 50) -> str:
    """Summarize the thinking for audit logs without exposing full reasoning chain."""
    words = thinking.split()
    if len(words) <= max_words:
        return thinking

    # Get first and last parts for context
    start = " ".join(words[:20])
    end = " ".join(words[-20:])
    return f"{start}... [{len(words) - 40} words of reasoning] ...{end}"

def production_think(problem: str, log_thinking: bool = True) -> str:
    """Production wrapper that handles thinking internally."""
    result = think_through(problem, budget_tokens=8000)

    if log_thinking and result["thinking"]:
        # Log a compact summary for audit; don't surface full reasoning to the user
        import json
        summary = get_thinking_summary(result["thinking"])
        with open("thinking_log.jsonl", "a") as f:
            f.write(json.dumps({
                "problem_preview": problem[:200],
                "thinking_summary": summary,
                "thinking_tokens": len(result["thinking"].split())  # approximate: word count
            }) + "\n")

    return result["response"]
