DeepSeek OpenAI-Compatible API: SDK Integration & Migration

Configure DeepSeek's OpenAI-compatible API and Anthropic API format. SDK integration patterns, LangChain support, migration from OpenAI/Anthropic, streaming differences, and model mapping behavior.

June 11, 2026
Tags: DeepSeek, API, OpenAI, Anthropic, SDK, LangChain, Migration

DeepSeek's API is dual-format: it exposes both an OpenAI-compatible endpoint and an Anthropic-compatible endpoint. Migrating from either ecosystem is often little more than a base_url and API key change. However, the two formats differ in thinking mode, parameter support, and streaming behavior, and you should understand these differences before migrating production workloads.

OpenAI-Compatible Format

from openai import OpenAI

client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello"}],
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}},
    stream=False
)

// Node.js (JavaScript) equivalent
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.deepseek.com",
    apiKey: "<DeepSeek API Key>",
});

const completion = await client.chat.completions.create({
    model: "deepseek-v4-pro",
    messages: [{ role: "user", content: "Hello" }],
});

Anthropic API Format

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.deepseek.com/anthropic",
    api_key="<DeepSeek API Key>"
)

response = client.messages.create(
    model="deepseek-v4-pro",
    max_tokens=4096,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hello"}]
)

Migration from OpenAI

Step 1: Change base URL

# Before
client = OpenAI(api_key=openai_key)

# After
client = OpenAI(
    api_key=deepseek_key,
    base_url="https://api.deepseek.com"
)

Step 2: Update model name

# Before
model="gpt-4o"

# After
model="deepseek-v4-pro"

Step 3: Enable thinking mode via extra_body (if using reasoning)

# gpt-4o has no thinking-mode switch; chain-of-thought there is prompt-based
# DeepSeek: enable thinking mode explicitly
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}}
)

Migration Checklist

| Check | OpenAI | DeepSeek |
| --- | --- | --- |
| temperature | Supported | Ignored in thinking mode |
| top_p | Supported | Ignored in thinking mode |
| response_format: json_object | Supported | Supported (known empty-content issue) |
| tools | Supported | Supported (+ strict mode beta) |
| stream | Supported | Supported (keep-alive lines in response) |
| max_tokens | Supported | Supported (max 384K output) |
| stop | Supported | Supported |
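
The first two rows of the checklist can be enforced in one place rather than at every call site. The following is a minimal sketch under that assumption: `build_request` is a hypothetical helper (not part of the OpenAI SDK) that drops temperature and top_p when thinking mode is requested, so callers do not assume those values take effect.

```python
# Hypothetical helper; the name and structure are illustrative only.
IGNORED_IN_THINKING = ("temperature", "top_p")

def build_request(model, messages, thinking=False, **params):
    """Return kwargs suitable for client.chat.completions.create()."""
    request = {"model": model, "messages": messages, **params}
    if thinking:
        # Per the checklist, sampling parameters are ignored in thinking mode,
        # so remove them instead of sending values that have no effect.
        for name in IGNORED_IN_THINKING:
            request.pop(name, None)
        request["extra_body"] = {"thinking": {"type": "enabled"}}
    return request

kwargs = build_request(
    "deepseek-v4-pro",
    [{"role": "user", "content": "Hello"}],
    thinking=True,
    temperature=0.2,
)
print(sorted(kwargs))  # ['extra_body', 'messages', 'model']
```

The result would be passed on as `client.chat.completions.create(**kwargs)`.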

Migration from Anthropic/Claude

Step 1: Change base URL

# Before
client = anthropic.Anthropic(api_key=anthropic_key)

# After
client = anthropic.Anthropic(
    base_url="https://api.deepseek.com/anthropic",
    api_key=deepseek_key
)

Step 2: Auto Model Mapping

DeepSeek automatically maps Claude model names, so you do not need to change the model parameter:

  • claude-opus-* → deepseek-v4-pro
  • claude-sonnet-* or claude-haiku-* → deepseek-v4-flash
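
For logging or cost estimation it can help to know on the client side which DeepSeek model a request will land on. The sketch below mirrors the two mapping rules above; it is not DeepSeek's implementation, `resolve_model` is a hypothetical name, and the pass-through for unmatched names is an assumption.

```python
from fnmatch import fnmatchcase

# Mapping rules as listed above, checked in order.
MODEL_RULES = [
    ("claude-opus-*", "deepseek-v4-pro"),
    ("claude-sonnet-*", "deepseek-v4-flash"),
    ("claude-haiku-*", "deepseek-v4-flash"),
]

def resolve_model(name):
    """Return the DeepSeek model a Claude model name is expected to map to."""
    for pattern, target in MODEL_RULES:
        if fnmatchcase(name, pattern):
            return target
    # Assumption: names that match no rule are sent through unchanged.
    return name

print(resolve_model("claude-opus-example"))  # deepseek-v4-pro
```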

Migration Checklist

| Claude Feature | DeepSeek Support |
| --- | --- |
| thinking with budget_tokens | budget_tokens ignored; use output_config.effort |
| temperature | Supported (0.0-2.0) in non-thinking mode |
| tools | Fully supported |
| system prompt | Fully supported |
| stop_sequences | Fully supported |
| cache_control | Ignored; use DeepSeek's automatic context caching |
| Image input | Not supported |
| Document input | Not supported |
| MCP servers | Not supported through Anthropic format |
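
Because several of these features are ignored rather than rejected, a pre-flight check can surface them before a request is sent. This is a rough sketch, assuming requests are built as plain dicts in the Anthropic Messages format; `audit_anthropic_request` is a hypothetical helper, and the `mcp_servers` key is assumed to be how MCP servers appear in the request.

```python
UNSUPPORTED_BLOCK_TYPES = {"image", "document"}

def audit_anthropic_request(request):
    """Return warnings for features the checklist marks as ignored or unsupported."""
    warnings = []
    thinking = request.get("thinking") or {}
    if "budget_tokens" in thinking:
        warnings.append("thinking.budget_tokens is ignored; use output_config.effort")
    if "mcp_servers" in request:  # assumed key name for MCP configuration
        warnings.append("MCP servers are not supported through the Anthropic format")
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain-text content has no typed blocks to inspect
        for block in content or []:
            if block.get("type") in UNSUPPORTED_BLOCK_TYPES:
                warnings.append(f"{block['type']} input is not supported")
            if "cache_control" in block:
                warnings.append("cache_control is ignored; caching is automatic")
    return warnings

request = {
    "model": "claude-opus-example",
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "Summarize this.",
             "cache_control": {"type": "ephemeral"}},
            {"type": "image",
             "source": {"type": "url", "url": "https://example.com/a.png"}},
        ]},
    ],
}
for warning in audit_anthropic_request(request):
    print(warning)
```

Running the check in a test suite or at startup keeps silent behavior changes from reaching production unnoticed.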

LangChain Integration

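# Requires the langchain-openai package (pip install langchain-openai);
# the older langchain.chat_models import path is deprecated.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="deepseek-v4-pro",
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com"
)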

Streaming Differences

Non-streaming (stream=false, default): While a request is being processed, DeepSeek sends empty lines to keep the HTTP connection alive. If you parse HTTP responses directly, skip these empty lines before decoding the JSON body.

Streaming (stream=true): DeepSeek sends SSE keep-alive comments (: keep-alive) between chunks. The official OpenAI SDK handles these automatically. If you parse the SSE stream yourself, ignore lines that start with a colon (:).

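# Robust streaming pattern (a generator, so the yields are valid)
def stream_chat(client, messages):
    stream = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=messages,
        stream=True,
        extra_body={"thinking": {"type": "enabled"}}
    )

    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta
        # reasoning_content is a DeepSeek extension field, so read it defensively
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            # Thinking tokens arrive first
            yield {"type": "reasoning", "content": reasoning}
        elif delta.content:
            # Answer tokens arrive after reasoning
            yield {"type": "content", "content": delta.content}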
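
For clients that read the event stream without an SDK, the filtering rule described above can be sketched as follows. This is a minimal illustration operating on already-decoded text lines; `iter_sse_data` is a hypothetical name, and the `data: [DONE]` end marker is assumed to follow the OpenAI streaming convention.

```python
import json

def iter_sse_data(lines):
    """Yield decoded JSON payloads from raw SSE lines, skipping keep-alives."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment such as ": keep-alive"
        if not line.startswith("data:"):
            continue  # ignore other SSE fields (event:, id:, retry:)
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed OpenAI-style end-of-stream sentinel
        yield json.loads(payload)

raw = [
    ": keep-alive",
    "",
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    ": keep-alive",
    "data: [DONE]",
]
events = list(iter_sse_data(raw))
print(events[0]["choices"][0]["delta"]["content"])  # prints "Hi"
```

In a real client, `lines` would come from the HTTP library's line iterator, decoded to text before being passed in.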

Error Handling

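import time

from openai import APIStatusError, RateLimitError

def create_with_retry(client, max_attempts=3, **kwargs):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            # 429: concurrent request limit exceeded (Flash: 2500, Pro: 500)
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # back off before retrying
        except APIStatusError as e:
            # APIStatusError (not the APIError base class) carries status_code
            if e.status_code == 400 and "reasoning_content" in str(e):
                # Missing reasoning_content in a tool-call loop.
                # Fix: always pass the full assistant message object back.
                pass
            elif e.status_code == 402:
                # Insufficient balance: retrying will not help.
                pass
            raise  # neither case is retryable, so surface the error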
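
The 400 case above typically appears when an assistant turn is rebuilt by hand and its reasoning_content is dropped. The sketch below assumes the assistant message is available as a plain dict in the OpenAI chat format; `append_assistant_turn` is a hypothetical helper, and the sample message is made up for illustration.

```python
def append_assistant_turn(history, message):
    """Append an assistant message dict to history without dropping fields."""
    turn = {"role": "assistant", "content": message.get("content")}
    # Carry over the optional fields that a hand-built dict tends to lose.
    for key in ("reasoning_content", "tool_calls"):
        if message.get(key) is not None:
            turn[key] = message[key]
    history.append(turn)
    return history

history = [{"role": "user", "content": "What is the weather in Paris?"}]
assistant = {
    "role": "assistant",
    "content": None,
    "reasoning_content": "I should call the weather tool.",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
}
append_assistant_turn(history, assistant)
print(sorted(history[-1]))  # ['content', 'reasoning_content', 'role', 'tool_calls']
```

With the OpenAI SDK, appending `response.choices[0].message` to the history directly should have the same effect, since the object already carries every field the server returned.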

Pro Move: When migrating from Claude to DeepSeek via the Anthropic API format, keep your existing Claude model names in code; DeepSeek auto-maps them. The only code changes are base_url and api_key, and the rest of your Claude toolchain keeps working with DeepSeek as the backend, as long as it does not rely on features DeepSeek ignores or does not support (such as budget_tokens, cache_control, or image input).

Migration gotcha: If your Claude code uses cache_control for prompt caching, DeepSeek silently ignores it. DeepSeek uses automatic prefix-match caching instead, so you do not need to add cache markers.