DeepSeek's thinking mode is its most distinctive capability, and the one most likely to confuse users coming from other models. When enabled, the model returns reasoning_content (its chain-of-thought reasoning) alongside the final content. These reasoning tokens are visible, billed at output rates, and must be deliberately managed across turns. The field layout, effort controls, and multi-turn handling all differ from how Claude and GPT models expose reasoning.

How Thinking Mode Works

1. User sends prompt with thinking: { type: "enabled" }
2. Model generates reasoning_content (visible CoT tokens)
3. Model generates content (final answer)
4. Both are returned in the response
5. reasoning_content is billed as output tokens
6. In multi-turn conversations, reasoning_content may or may not need to be passed back, depending on context (plain follow-up turns and tool-call chains are handled differently)
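Because step 5 bills reasoning tokens at the output rate, a long chain of thought can dominate the cost of a short answer. The sketch below makes that arithmetic concrete. The price and token counts are made-up placeholders, not DeepSeek's actual rates, and `output_cost_usd` is a hypothetical helper name.

```python
def output_cost_usd(reasoning_tokens: int, answer_tokens: int,
                    price_per_million_output: float) -> float:
    """Estimate output cost when reasoning tokens are billed like answer tokens."""
    billable = reasoning_tokens + answer_tokens
    return billable * price_per_million_output / 1_000_000


# Hypothetical numbers for illustration only.
PRICE = 2.00  # assumed USD per 1M output tokens, not a real price
cost = output_cost_usd(reasoning_tokens=3_000, answer_tokens=500,
                       price_per_million_output=PRICE)
reasoning_share = 3_000 / (3_000 + 500)
print(f"cost=${cost:.4f}, reasoning share={reasoning_share:.0%}")
```

With these placeholder numbers, most of the bill comes from reasoning rather than from the answer itself, which is why effort level matters for cost.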

Enabling Thinking Mode

OpenAI-Compatible Format

```python
from openai import OpenAI

client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Solve this complex problem..."}],
    reasoning_effort="high",  # or "max"
    extra_body={"thinking": {"type": "enabled"}}
)

# Access reasoning content
reasoning = response.choices[0].message.reasoning_content
answer = response.choices[0].message.content

print(f"REASONING:\n{reasoning}\n")
print(f"ANSWER:\n{answer}")
```
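If the same code path also handles responses produced without thinking mode, a defensive accessor may be useful. The helper below is a minimal sketch: `split_response` is a hypothetical name, and it assumes that `reasoning_content` can be absent or empty when thinking is off, which is worth confirming against the API you are calling. It runs against a stand-in object, so no API call is needed.

```python
from types import SimpleNamespace


def split_response(response):
    """Return (reasoning, answer) from a chat-completion-style object.

    getattr is used because reasoning_content is an extension field and is
    assumed (not confirmed) to be missing when thinking mode is disabled.
    """
    message = response.choices[0].message
    reasoning = getattr(message, "reasoning_content", None) or ""
    answer = message.content or ""
    return reasoning, answer


# Stand-in object that mimics the response shape used in this article.
fake = SimpleNamespace(choices=[SimpleNamespace(
    message=SimpleNamespace(reasoning_content="2 + 2 = 4", content="4"))])
print(split_response(fake))
```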

Anthropic-Compatible Format

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.deepseek.com/anthropic",
    api_key="<DeepSeek API Key>"
)

response = client.messages.create(
    model="deepseek-v4-pro",
    max_tokens=4096,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Solve this complex problem..."}],
    output_config={"effort": "high"}  # or "max"
)
```
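The Anthropic-format request returns `response.content` as a list of typed blocks instead of a single message object. Assuming DeepSeek's endpoint mirrors Anthropic's `thinking` and `text` block types (an assumption, not something this article confirms), the two could be separated roughly as follows. `split_blocks` is a hypothetical helper, shown here against stand-in blocks.

```python
from types import SimpleNamespace


def split_blocks(content_blocks):
    """Separate Anthropic-style content blocks into (thinking, text)."""
    thinking_parts, text_parts = [], []
    for block in content_blocks:
        if block.type == "thinking":
            thinking_parts.append(block.thinking)
        elif block.type == "text":
            text_parts.append(block.text)
        # Other block types (for example tool_use) are ignored in this sketch.
    return "".join(thinking_parts), "".join(text_parts)


# Stand-in blocks mimicking response.content; real blocks come from the SDK.
blocks = [
    SimpleNamespace(type="thinking", thinking="Try x = 3."),
    SimpleNamespace(type="text", text="x = 3"),
]
print(split_blocks(blocks))
```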

Thinking Mode vs Other Models

| Feature | DeepSeek Thinking | Claude Extended Thinking | GPT Chain-of-Thought (prompted) |
|---|---|---|---|
| Reasoning visibility | Visible (`reasoning_content`) | Returned as separate `thinking` content blocks | Visible in output (clutters response) |
| Reasoning cost | Billed as output tokens | Billed as output tokens, capped by a separate thinking budget | Billed as output tokens |
| Effort control | `high` / `max` | Token budget (`budget_tokens`) | Not applicable |
| Temperature control | Disabled in thinking mode | Restricted (temperature cannot be changed while thinking is enabled) | Unaffected |
| Tool call support | Yes (V3.2+) | Yes | Not natively |
| Multi-turn | Must manage `reasoning_content` | Thinking is per-request, resets | CoT persists in message history |

Key Differences from Claude

Visibility: DeepSeek exposes its reasoning as a plain reasoning_content field on the message. Claude returns its thinking as separate thinking content blocks that you read from the response's content list.
Temperature: DeepSeek disables temperature and top_p in thinking mode. Claude is restricted too: extended thinking does not allow modifying temperature, so neither model gives you full sampling control while reasoning.
Budget control: Claude uses budget_tokens for thinking allocation. DeepSeek uses reasoning_effort levels. Claude gives you fine-grained token control; DeepSeek gives you two coarse levels, high and max.
Multi-turn behavior: DeepSeek returns reasoning_content with every response and leaves it to you to decide what to send back on later turns. Claude's thinking is generated per request, although thinking blocks must be passed back during tool use.
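The budget-control difference shows up directly in the request parameters. The sketch below builds the extra keyword arguments for each style: the DeepSeek shape is the one used in this article's examples, and the Claude shape follows Anthropic's `thinking` parameter with `budget_tokens`. The function names are hypothetical, and nothing here calls either API.

```python
def deepseek_thinking_kwargs(effort: str = "high") -> dict:
    """Extra kwargs for an OpenAI-compatible DeepSeek call, as used in this article."""
    if effort not in ("high", "max"):
        raise ValueError("this article only describes 'high' and 'max'")
    return {
        "reasoning_effort": effort,
        "extra_body": {"thinking": {"type": "enabled"}},
    }


def claude_thinking_kwargs(budget_tokens: int) -> dict:
    """Extra kwargs for a Claude extended-thinking call with a token budget."""
    return {"thinking": {"type": "enabled", "budget_tokens": budget_tokens}}


print(deepseek_thinking_kwargs("max"))
print(claude_thinking_kwargs(8_000))
```

One side takes a named level and the other takes an integer, which is the coarse-versus-fine distinction described above.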

Key Differences from GPT CoT

Clean output: DeepSeek separates reasoning from answer. With prompted chain-of-thought on a GPT model, the reasoning is inline, so the final answer sits under a wall of "Let me think step by step..."
Programmatic access: DeepSeek's reasoning_content is a separate field you can read directly. Prompted CoT on GPT requires string parsing to separate reasoning from answer.
Effort control: DeepSeek lets you dial reasoning up or down with a parameter. The quality of prompted CoT depends entirely on your prompt.
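To illustrate why inline chain-of-thought needs string parsing, here is a minimal sketch of the kind of splitter it forces you to write. The `Final answer:` marker is a hypothetical convention you would have to request in your prompt, and `split_inline_cot` is an illustrative name.

```python
def split_inline_cot(text: str, marker: str = "Final answer:"):
    """Split a prompted chain-of-thought reply into (reasoning, answer).

    Falls back to treating the whole reply as the answer when the marker is
    absent, which is the fragility a separate reasoning field avoids.
    """
    head, sep, tail = text.rpartition(marker)
    if not sep:
        return "", text.strip()
    return head.strip(), tail.strip()


reply = "Let me think step by step. 12 * 3 = 36.\nFinal answer: 36"
print(split_inline_cot(reply))
print(split_inline_cot("36"))
```

If the model forgets the marker or phrases it differently, the split silently degrades. A dedicated field does not have that failure mode.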

Reading `reasoning_content`

The reasoning content is your window into the model's thought process. Use it for:

Debugging Wrong Answers

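```python
# `reasoning` and `answer` come from the response in the enabling example;
# `expected_answer` is a hypothetical reference value from your own tests.
answer_is_wrong = answer.strip() != expected_answer

if answer_is_wrong:
    print(f"MODEL'S REASONING:\n{reasoning}")
    # Look for: faulty assumptions, skipped steps, premature conclusions.
    # The error is often, though not always, visible in the reasoning chain.
```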
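When you debug more than one case, it can help to store the reasoning for every wrong answer and review the failures side by side. The following is a minimal sketch under that assumption. `record_failure` and the sample cases are hypothetical, and an in-memory stream stands in for a log file.

```python
import io
import json


def record_failure(sink, prompt, reasoning, answer, expected):
    """Append one failing case as a JSON line to a writable text stream."""
    sink.write(json.dumps({
        "prompt": prompt,
        "expected": expected,
        "answer": answer,
        "reasoning": reasoning,
    }) + "\n")


log = io.StringIO()  # in practice, an open file
cases = [  # hypothetical (prompt, expected, answer, reasoning) tuples
    ("17 * 3?", "51", "51", "17 * 3 = 51"),
    ("19 * 3?", "57", "54", "19 * 3 is about 18 * 3 = 54"),
]
for prompt, expected, answer, reasoning in cases:
    if answer.strip() != expected:
        record_failure(log, prompt, reasoning, answer, expected)

print(log.getvalue(), end="")
```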

Quality Signal

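```python
# Rough heuristic, not a guarantee:
# long, structured reasoning tends to accompany more reliable answers;
# short, hand-wavy reasoning is a reason to double-check.

if not reasoning or len(reasoning) < 200 or "clearly" in reasoning.lower():
    print("Warning: Model may be overconfident or reasoning insufficiently")
```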
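The same heuristic can be wrapped in a small function that returns named flags, which is easier to log and tune than a single condition. This is a sketch only: the word list, the 200-character threshold, and the `reasoning_flags` name are illustrative assumptions, not calibrated values.

```python
ASSERTIVE_WORDS = ("clearly", "obviously", "trivially")  # illustrative list


def reasoning_flags(reasoning, min_chars=200):
    """Return a list of heuristic warning flags for a reasoning trace."""
    flags = []
    text = (reasoning or "").lower()
    if len(text) < min_chars:
        flags.append("short")
    if any(word in text for word in ASSERTIVE_WORDS):
        flags.append("assertive-wording")
    return flags


print(reasoning_flags("Clearly 4."))
print(reasoning_flags("step " * 50))
```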

Chain Verification

User prompt: "Verify your own reasoning before giving the final answer."

This encourages DeepSeek to include a self-check in reasoning_content, for example:

"Let me verify: assumption A is correct because... assumption B holds given...
Wait, assumption B conflicts with constraint C. Let me reconsider..."
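One way to use this in code is to append the verification instruction to each prompt and then check whether the reasoning actually contains self-check phrasing. The sketch below assumes a few marker phrases taken from the example above. Real traces may word their checks differently, so treat the count as a weak signal. All names are hypothetical.

```python
import re

VERIFY_INSTRUCTION = "Verify your own reasoning before giving the final answer."
# Illustrative markers only; real reasoning may phrase self-checks differently.
SELF_CHECK_PATTERN = re.compile(
    r"\b(let me verify|let me reconsider|wait)\b", re.IGNORECASE)


def with_self_check(prompt: str) -> str:
    """Append the verification instruction to a user prompt."""
    return f"{prompt.rstrip()}\n\n{VERIFY_INSTRUCTION}"


def count_self_checks(reasoning: str) -> int:
    """Count occurrences of the marker phrases in a reasoning trace."""
    return len(SELF_CHECK_PATTERN.findall(reasoning or ""))


sample = ("Let me verify: assumption A is correct. "
          "Wait, assumption B conflicts with constraint C. Let me reconsider.")
print(with_self_check("Schedule three meetings without overlap."))
print(count_self_checks(sample))
```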

Streaming Mode

In streaming mode, reasoning_content and content arrive as separate chunks:

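```python
# Reuses `client` from the OpenAI-compatible example above.
messages = [{"role": "user", "content": "Solve this complex problem..."}]

stream = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
    stream=True,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}}
)

reasoning_buffer = ""
answer_buffer = ""

for chunk in stream:
    if not chunk.choices:  # skip chunks that carry no choices
        continue
    delta = chunk.choices[0].delta
    # reasoning_content is a DeepSeek extension field; getattr avoids an
    # AttributeError if a chunk omits it.
    reasoning_piece = getattr(delta, "reasoning_content", None)
    if reasoning_piece:
        reasoning_buffer += reasoning_piece
    elif delta.content:
        answer_buffer += delta.content
```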
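The accumulation loop can be factored into a function and exercised with stand-in chunks, which makes it testable without a network call. This is a minimal sketch: `collect_stream` and `fake_chunk` are hypothetical helpers, and the chunk layout mirrors the attribute names used above.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate (reasoning, answer) from OpenAI-style streaming chunks."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta
        reasoning_piece = getattr(delta, "reasoning_content", None)
        answer_piece = getattr(delta, "content", None)
        if reasoning_piece:
            reasoning_parts.append(reasoning_piece)
        if answer_piece:
            answer_parts.append(answer_piece)
    return "".join(reasoning_parts), "".join(answer_parts)


def fake_chunk(reasoning=None, content=None):
    """Build a stand-in chunk with the same attribute layout (test helper)."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


chunks = [fake_chunk(reasoning="2 + 2 "), fake_chunk(reasoning="= 4"),
          SimpleNamespace(choices=[]), fake_chunk(content="4")]
print(collect_stream(chunks))
```

Collecting parts in lists and joining once at the end avoids repeated string concatenation on long reasoning traces.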

Common Pitfall: In streaming mode, reasoning_content chunks arrive before content chunks. Don't display reasoning_content to end users unless you intend to show the thinking process, since it can be verbose and confusing.

Pro Move: Use reasoning_content as a lightweight "second opinion" signal. If the reasoning looks sound (logical steps, constraint-aware, error-catching), that is a reasonable sign the output can be trusted. If the reasoning is hand-wavy, re-prompt with reasoning_effort="max" or verify the output independently.
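The "second opinion" idea can be sketched as a small escalation routine: ask at high effort first, and retry once at max only when the reasoning looks weak. Both callables are supplied by the caller. `ask`, `looks_weak`, and the stand-ins below are hypothetical names for illustration, not part of any SDK.

```python
def answer_with_escalation(ask, looks_weak):
    """Try effort "high" first; retry once with "max" if reasoning looks weak.

    ask(effort) -> (reasoning, answer) and looks_weak(reasoning) -> bool are
    caller-supplied callables (hypothetical names for this sketch).
    """
    reasoning, answer = ask("high")
    if not looks_weak(reasoning):
        return "high", answer
    reasoning, answer = ask("max")
    return "max", answer


# Stand-ins so the sketch runs without an API call.
def fake_ask(effort):
    if effort == "high":
        return "Clearly 54.", "54"
    return "19 * 3 = 57. Check: 57 / 3 = 19.", "57"


def fake_looks_weak(reasoning):
    return len(reasoning) < 20 or "clearly" in reasoning.lower()


print(answer_with_escalation(fake_ask, fake_looks_weak))
```

Escalating only on weak reasoning keeps the more expensive effort level for the cases that seem to need it.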

Related Pages

Reasoning Effort Control: When to use high vs max effort, cost tradeoffs, and the diminishing returns curve.
Multi-Turn Reasoning: Managing reasoning_content across conversation turns and tool-call chains.
