Deepseek makes its 75 percent discount permanent, pricing output tokens at least 34x below GPT-5.5
Deepseek's permanent price cut turns China's AI strategy into a blunt price war with Western labs.
Deepseek has made the 75 percent discount on its flagship model, Deepseek V4 Pro, permanent, the company announced on X. The promotion was originally set to expire on May 31, 2026.
Under the permanent discount, one million uncached input tokens cost just $0.435, while one million output tokens cost $0.87. Cache hits push the input price even lower. By comparison, GPT-5.5 charges $5 per million input tokens and $30 per million output tokens, while Opus 4.7 sits at $5 for input and $25 for output.
| Model | Input per 1M tokens | Input cache hit | Output per 1M tokens |
| --- | --- | --- | --- |
| Deepseek-V4-Pro | $0.435 | $0.003625 | $0.87 |
| Deepseek-V4-Flash | $0.14 | $0.0028 | $0.28 |
| GPT-5.5 | $5.00 | $0.50 | $30.00 |
| GPT-5.5 (Long Context, >272K) | $10.00 | $1.00 | $45.00 |
| Opus 4.7 | $5.00 | $0.50 | $25.00 |
That makes Deepseek's flagship about 11.5 times cheaper than GPT-5.5 on standard input pricing. The gap is much wider on output, where Deepseek V4 Pro is about 34.5 times cheaper. Against GPT-5.5's long-context pricing above 272K tokens, Deepseek V4 Pro is about 23 times cheaper on input and about 51.7 times cheaper on output. Deepseek V4 Flash is cheaper still.
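The ratios quoted here follow directly from the listed prices. As a minimal sketch, the arithmetic can be reproduced in a few lines of Python; the dictionary layout and the `price_ratio` helper are illustrative names, not part of any vendor's tooling, and only the dollar figures come from the pricing table.

```python
# Prices in USD per one million tokens, copied from the pricing table.
# The data structure and function name are illustrative assumptions.
PRICES = {
    "Deepseek-V4-Pro": {"input": 0.435, "output": 0.87},
    "Deepseek-V4-Flash": {"input": 0.14, "output": 0.28},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "GPT-5.5 (Long Context, >272K)": {"input": 10.00, "output": 45.00},
    "Opus 4.7": {"input": 5.00, "output": 25.00},
}


def price_ratio(expensive: str, cheap: str, kind: str) -> float:
    """Return how many times more `expensive` charges than `cheap` per token.

    `kind` is either "input" (uncached) or "output".
    """
    return PRICES[expensive][kind] / PRICES[cheap][kind]


if __name__ == "__main__":
    for rival in ("GPT-5.5", "GPT-5.5 (Long Context, >272K)", "Opus 4.7"):
        for kind in ("input", "output"):
            ratio = price_ratio(rival, "Deepseek-V4-Pro", kind)
            print(f"{rival} vs Deepseek-V4-Pro, {kind}: {ratio:.1f}x")
```

Swapping "Deepseek-V4-Pro" for "Deepseek-V4-Flash" in the loop gives the corresponding ratios for the smaller model.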
Both Deepseek models offer a one million token context window and up to 384,000 output tokens. Deepseek also supports both OpenAI and Anthropic API formats, making it easier for developers to switch.
Token prices only tell half the story
Raw per-token pricing is only part of the picture, though. Token consumption per task matters just as much. Think of it like gas prices: a low price per gallon doesn't help if your engine guzzles fuel.
A good example is Google's Gemini Flash 3.5. On paper, it is cheaper than the previous Pro model, 3.1, and performs similarly, but it burns through far more tokens, which can make it pricier in practice. Anthropic's Opus 4.7 also looks cheaper on paper than GPT-5.5, but it uses more tokens than its predecessor. GPT-5.5, on the other hand, consumes fewer tokens than GPT-5.4. Still, both Opus 4.7 and GPT-5.5 ended up 30 to 90 percent more expensive than the models they replaced.
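The interplay between per-token price and token consumption can be made concrete with a rough per-task cost calculation. The sketch below is only an illustration: the token counts and the 3x output multiplier are hypothetical numbers chosen for the example, not measurements of any model, and `task_cost` is an assumed helper name. Only the per-million-token prices come from the article.

```python
def task_cost(input_tokens: int, output_tokens: int,
              input_price: float, output_price: float) -> float:
    """Cost in USD of one task; prices are USD per one million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 5,000 output tokens per task.
INPUT_TOKENS = 20_000
OUTPUT_TOKENS = 5_000

# GPT-5.5 at its standard list prices ($5 input, $30 output).
gpt_cost = task_cost(INPUT_TOKENS, OUTPUT_TOKENS, 5.00, 30.00)

# Deepseek V4 Pro at the discounted prices, same token usage.
deepseek_cost = task_cost(INPUT_TOKENS, OUTPUT_TOKENS, 0.435, 0.87)

# Same model, but assuming (hypothetically) it needs 3x the output tokens
# to finish the same task.
deepseek_verbose_cost = task_cost(INPUT_TOKENS, 3 * OUTPUT_TOKENS, 0.435, 0.87)

if __name__ == "__main__":
    print(f"GPT-5.5:                      ${gpt_cost:.5f} per task")
    print(f"Deepseek V4 Pro:              ${deepseek_cost:.5f} per task")
    print(f"Deepseek V4 Pro (3x output):  ${deepseek_verbose_cost:.5f} per task")
    print(f"Per-task gap at equal usage:  {gpt_cost / deepseek_cost:.1f}x")
    print(f"Per-task gap with 3x output:  {gpt_cost / deepseek_verbose_cost:.1f}x")
```

Under these assumed numbers the per-task gap shrinks as output tokens grow, which is why a per-token price comparison on its own can mislead. Real comparisons would need measured token counts for the workload in question.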