We cry about AI tools so you don't have to.

Pricing Watch

DeepSeek V4 Pro's 75% Discount Dies May 31 — What You'll Pay After

The promotional rate on V4 Pro ends in 10 days. We ran the math on what your API bill actually costs when it expires.

pricingapi-costsdeepseekllm-comparison
Disclosure: Tool Crier may earn a commission if you buy through links in this article. We paid for these tools ourselves or tested via free trials. Commissions don't influence what we write.

DeepSeek’s V4 Pro has been running at a 75% promotional discount since launch — $0.87 per million output tokens instead of the regular $3.48/M. That discount expires at 15:59 UTC on May 31, and we ran the math on what that actually means for three real developer workloads.

The discount that’s about to expire

The official DeepSeek pricing docs confirm the promotional window: 75% off through May 31 at 15:59 UTC. After that date, cache-miss input tokens jump from $0.435/M to $1.74/M, and output tokens climb from $0.87/M to $3.48/M—a 4x multiplier on both sides of your token logs.

One critical note: prompt caching can drop input costs dramatically. Cache-hit input is $0.003625/M during the promo and $0.0145/M post-promo—roughly 120x cheaper than cache-miss rates—so repetitive workloads pay much less than these base-rate calculations suggest.

This is the loudest pricing cliff in the API space right now. The question for anyone running inference on DeepSeek isn’t whether to notice—it’s whether the post-discount rate still pencils out for your workload.

Three workloads, three real bills

We picked three representative patterns we see in the dev community: a conversational chatbot, a code-generation agent, and a batch summarization service.

Workload 1: Conversational Chatbot (e.g., customer support)

  • Monthly usage: 5M input tokens, 2M output tokens
  • Current cost (promo rate, cache-miss): (5M × $0.435) + (2M × $0.87) = $3.92/mo
  • Post-June 1 cost: (5M × $1.74) + (2M × $3.48) = $15.66/mo (4x increase)

Workload 2: Code-Assist Agent (e.g., IDE completion, prompt refinement)

  • Monthly usage: 12M input tokens, 4M output tokens
  • Current cost: (12M × $0.435) + (4M × $0.87) = $8.70/mo
  • Post-June 1 cost: (12M × $1.74) + (4M × $3.48) = $34.80/mo (4x increase)

Workload 3: Batch Summarizer (e.g., document processing, log analysis)

  • Monthly usage: 30M input tokens, 8M output tokens
  • Current cost: (30M × $0.435) + (8M × $0.87) = $20.01/mo
  • Post-June 1 cost: (30M × $1.74) + (8M × $3.48) = $80.04/mo (4x increase)

The pattern is exact: whatever your current bill is, multiply by 4 on June 1. That’s the math we’re looking at.

How it stacks against the alternatives

Post-promo DeepSeek still undercuts most of the API market, but not all of it. Here’s how three workloads price across the field:

Output token cost comparison (per million):

For the chatbot workload above (2M output tokens), that’s the difference between $6.96/mo (DeepSeek post-June 1, output side alone) and $1.64/mo (Qwen3-32B output side alone). Not trivial.

What we’d do on June 1

Chatbot and light-duty inference: If you’re building something conversational with low reasoning demands, we’d test Qwen3-32B or Llama 4 Scout on the same inputs. Both are 4-5x cheaper post-June 1, and the open-weights models often surprise you with quality-per-dollar. DeepSeek’s reasoning edge doesn’t matter if you’re not using it.

Heavy reasoning or code work: If your workload actually benefits from DeepSeek’s r1-level thinking (complex problem-solving, algorithm design, multi-step debugging), the post-discount rate is painful but probably still justified. GPT-5.5 at $30/M is more expensive. Claude Opus is in the ballpark. You’re paying for capability here, and it’s worth testing at scale before June 1 to know if the quality delta justifies the cost.

Batch or non-latency-sensitive work: For summarization, classification, or other fire-and-forget patterns, the post-promo DeepSeek rate is hard to defend against the open alternatives. This is where the discount cliff actually changes behavior—we’d shift batch work off DeepSeek entirely.

The bigger pricing trend

This isn’t an isolated move. The LLM API price war we tracked in May shows the whole industry tightening margins post-launch. DeepSeek’s promotional period was always meant to be aggressive, and how Anthropic’s own price hike landed earlier this year signals that the era of sustained discounts is over. Expect more promotions to close, and expect the survivors to be priced on actual capability, not on launch-window aggression.

If you’ve been coasting on DeepSeek’s promo rate without benchmarking alternatives, now is the time. You have 10 days.

← More Pricing Watchs

What we don't know is documented at the end of this article. We update when we learn more.