DeepSeek makes its 75% discount permanent, pricing output tokens at least 34 times below GPT-5.5

2026-05-24 01:10 · Matthias Bastian

AI summary

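DeepSeek has turned the 75% discount on its flagship model, V4-Pro, into a permanent price cut. Input tokens now cost $0.435 per million, at least 11.5 times cheaper than GPT-5.5; the gap on output tokens is wider still, at least 34 times. For agentic systems that consume tokens in bulk, pricing this aggressive puts significant price pressure on Western AI providers.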


Deepseek makes its 75 percent discount permanent, pricing output tokens at least 34x below GPT-5.5

Deepseek's permanent price cut turns China's AI strategy into a blunt price war with Western labs.

Deepseek has made the 75 percent discount on its flagship model, Deepseek V4 Pro, permanent, the company announced on X. The promotion was originally set to expire on May 31, 2026.

Under the permanent discount, one million input tokens without cache cost just $0.435, while one million output tokens cost $0.87. Cache hits push the input price even lower. By comparison, GPT-5.5 charges $5 per million input tokens and $30 per million output tokens, while Opus 4.7 sits at $5 for input and $25 for output.

| Model | Input per 1M tokens | Input cache hit | Output per 1M tokens |
|---|---|---|---|
| Deepseek-V4-Pro | $0.435 | $0.003625 | $0.87 |
| Deepseek-V4-Flash | $0.14 | $0.0028 | $0.28 |
| GPT-5.5 | $5.00 | $0.50 | $30.00 |
| GPT-5.5 (Long Context, >272K) | $10.00 | $1.00 | $45.00 |
| Opus 4.7 | $5.00 | $0.50 | $25.00 |
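How much cache hits lower the input bill depends on the share of input tokens served from cache. The sketch below blends the two Deepseek-V4-Pro input prices quoted above. The function name and the 80 percent hit rate are assumptions chosen for illustration, not figures from Deepseek.

```python
def blended_input_price(miss_price, hit_price, hit_rate):
    """Average USD price per 1M input tokens for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


# Deepseek-V4-Pro input prices as quoted above (USD per 1M tokens).
V4_PRO_INPUT = 0.435
V4_PRO_INPUT_CACHE_HIT = 0.003625

# Hypothetical workload: 80 percent of input tokens are cache hits.
price = blended_input_price(V4_PRO_INPUT, V4_PRO_INPUT_CACHE_HIT, 0.8)
print(f"Blended input price: ${price:.4f} per 1M tokens")
```

Under that assumed hit rate, the effective input price is roughly a fifth of the uncached list price.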

That makes Deepseek's flagship about 11.5 times cheaper than GPT-5.5 on standard input pricing. The gap is much wider on output, where Deepseek V4 Pro is about 34.5 times cheaper. Against GPT-5.5 long context pricing above 272K tokens, Deepseek V4 Pro is about 23 times cheaper on input and about 51.7 times cheaper on output. Deepseek V4 Flash is cheaper still.
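These multiples follow directly from the list prices quoted above. As a quick check, the short sketch below divides each GPT-5.5 price by the matching Deepseek-V4-Pro price. The dictionary layout and the helper name are illustrative choices only.

```python
# List prices in USD per 1M tokens, as quoted in the article.
PRICES = {
    "Deepseek-V4-Pro": {"input": 0.435, "output": 0.87},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "GPT-5.5 (Long Context, >272K)": {"input": 10.00, "output": 45.00},
}


def price_ratio(expensive, cheap, kind):
    """How many times cheaper `cheap` is than `expensive` for one token kind."""
    return PRICES[expensive][kind] / PRICES[cheap][kind]


for rival in ("GPT-5.5", "GPT-5.5 (Long Context, >272K)"):
    for kind in ("input", "output"):
        ratio = price_ratio(rival, "Deepseek-V4-Pro", kind)
        print(f"{rival}, {kind}: {ratio:.1f}x")
```

Rounded to one decimal place, the four ratios come out to 11.5, 34.5, 23.0 and 51.7.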

Both Deepseek models offer a one million token context window and up to 384,000 output tokens. Deepseek also supports both OpenAI and Anthropic API formats, making it easier for developers to switch.

Token prices only tell half the story

Raw per-token pricing is only part of the picture, though. Token consumption per task matters just as much. Think of it like gas prices: a low price per gallon doesn't help if your engine guzzles fuel.

A good example is Google's Gemini Flash 3.5. On paper, it's cheaper and performs similarly to the previous Pro model 3.1, but it burns through far more tokens, making it potentially pricier in practice. Anthropic's Opus 4.7 looks cheaper on paper than GPT-5.5 too, but uses more tokens than its predecessor. GPT-5.5, on the other hand, consumes fewer tokens than GPT-5.4. Still, both Opus 4.7 and GPT-5.5 ended up 30 to 90 percent more expensive than the models they replaced.
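The interplay between price per token and tokens per task can be made concrete with a small calculation. The sketch below uses the list prices quoted earlier. The token counts are hypothetical and are not measurements of any model. It computes the cost of one task, then asks how many times more tokens Deepseek V4 Pro could consume on the same task before matching the GPT-5.5 bill.

```python
def cost_per_task(input_tokens, output_tokens, input_price, output_price):
    """USD cost of one task; prices are given per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical task size: 50,000 input tokens and 5,000 output tokens.
INPUT_TOKENS = 50_000
OUTPUT_TOKENS = 5_000

gpt_cost = cost_per_task(INPUT_TOKENS, OUTPUT_TOKENS, 5.00, 30.00)      # GPT-5.5
deepseek_cost = cost_per_task(INPUT_TOKENS, OUTPUT_TOKENS, 0.435, 0.87)  # Deepseek V4 Pro

# If token usage scales uniformly, this is the multiple of tokens at which
# the cheaper model's bill equals the more expensive model's bill.
break_even_multiplier = gpt_cost / deepseek_cost

print(f"GPT-5.5 cost per task:         ${gpt_cost:.4f}")
print(f"Deepseek V4 Pro cost per task: ${deepseek_cost:.4f}")
print(f"Break-even token multiplier:   {break_even_multiplier:.1f}x")
```

For this assumed mix of input and output tokens, the break-even multiplier falls between the 11.5x input ratio and the 34.5x output ratio. A model that needs more tokens than that to finish the same task would lose its price advantage.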

The Decoder: AI News (RSS)
