DeepSeek V4 Pro Pricing After 75% Discount Ends — Real Costs

The promotional rate on V4 Pro ends in 10 days. We ran the math on what your API bill actually costs when it expires.

DeepSeek’s V4 Pro has been running at a 75% promotional discount since launch — $0.87 per million output tokens instead of the regular $3.48/M. That discount expires at 15:59 UTC on May 31, and we ran the math on what that actually means for three real developer workloads.

The discount that’s about to expire

The official DeepSeek pricing docs confirm the promotional window: 75% off through May 31 at 15:59 UTC. After that date, cache-miss input tokens jump from $0.435/M to $1.74/M, and output tokens climb from $0.87/M to $3.48/M—a 4x multiplier on both sides of your token logs.

One critical note: prompt caching can drop input costs dramatically. Cache-hit input is $0.003625/M during the promo and $0.0145/M post-promo—roughly 120x cheaper than cache-miss rates—so repetitive workloads pay much less than these base-rate calculations suggest.

This is the loudest pricing cliff in the API space right now. The question for anyone running inference on DeepSeek isn’t whether to notice—it’s whether the post-discount rate still pencils out for your workload.

Three workloads, three real bills

We picked three representative patterns we see in the dev community: a conversational chatbot, a code-generation agent, and a batch summarization service.

Workload 1: Conversational Chatbot (e.g., customer support)

Monthly usage: 5M input tokens, 2M output tokens
Current cost (promo rate, cache-miss): (5M × $0.435) + (2M × $0.87) = $3.92/mo
Post-June 1 cost: (5M × $1.74) + (2M × $3.48) = $15.66/mo (4x increase)

Workload 2: Code-Assist Agent (e.g., IDE completion, prompt refinement)

Monthly usage: 12M input tokens, 4M output tokens
Current cost: (12M × $0.435) + (4M × $0.87) = $8.70/mo
Post-June 1 cost: (12M × $1.74) + (4M × $3.48) = $34.80/mo (4x increase)

Workload 3: Batch Summarizer (e.g., document processing, log analysis)

Monthly usage: 30M input tokens, 8M output tokens
Current cost: (30M × $0.435) + (8M × $0.87) = $20.01/mo
Post-June 1 cost: (30M × $1.74) + (8M × $3.48) = $80.04/mo (4x increase)

The pattern is exact: whatever your current bill is, multiply by 4 on June 1. That’s the math we’re looking at.

How it stacks against the alternatives

Post-promo DeepSeek still undercuts most of the API market, but not all of it. Here’s how three workloads price across the field:

Output token cost comparison (per million):

DeepSeek V4 Pro (post-June 1): $3.48/M
Qwen3-32B via Spheron: $0.82/M
Llama 4 Scout via Spheron: $0.87/M
GPT-5.5 standard tier (aipricing.guru): $30.00/M
Claude Opus 4.7: ~$15/M output (varies by tier — see our GPT-5.5 vs Claude vs Gemini cost showdown)

For the chatbot workload above (2M output tokens), that’s the difference between $6.96/mo (DeepSeek post-June 1, output side alone) and $1.64/mo (Qwen3-32B output side alone). Not trivial.

What we’d do on June 1

Chatbot and light-duty inference: If you’re building something conversational with low reasoning demands, we’d test Qwen3-32B or Llama 4 Scout on the same inputs. Both are 4-5x cheaper post-June 1, and the open-weights models often surprise you with quality-per-dollar. DeepSeek’s reasoning edge doesn’t matter if you’re not using it.

Heavy reasoning or code work: If your workload actually benefits from DeepSeek’s r1-level thinking (complex problem-solving, algorithm design, multi-step debugging), the post-discount rate is painful but probably still justified. GPT-5.5 at $30/M is more expensive. Claude Opus is in the ballpark. You’re paying for capability here, and it’s worth testing at scale before June 1 to know if the quality delta justifies the cost.

Batch or non-latency-sensitive work: For summarization, classification, or other fire-and-forget patterns, the post-promo DeepSeek rate is hard to defend against the open alternatives. This is where the discount cliff actually changes behavior—we’d shift batch work off DeepSeek entirely.

The bigger pricing trend

This isn’t an isolated move. The LLM API price war we tracked in May shows the whole industry tightening margins post-launch. DeepSeek’s promotional period was always meant to be aggressive, and how Anthropic’s own price hike landed earlier this year signals that the era of sustained discounts is over. Expect more promotions to close, and expect the survivors to be priced on actual capability, not on launch-window aggression.

If you’ve been coasting on DeepSeek’s promo rate without benchmarking alternatives, now is the time. You have 10 days.

DeepSeek V4 Pro's 75% Discount Dies May 31 — What You'll Pay After

The discount that’s about to expire

Three workloads, three real bills

How it stacks against the alternatives

What we’d do on June 1

The bigger pricing trend