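DeepSeek has taken million-token context from a paper concept to downloadable weights. Long-document and repository-scale RAG and agent workloads now have a foundation that actually runs, and teams building enterprise applications should evaluate it seriously.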
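DeepSeek-V4 has been released with a context length of one million tokens. The models score well across a range of benchmarks, including long-context tasks. The architecture was optimized for training and inference efficiency while keeping cost competitive. The model weights are openly available on Hugging Face for the research community.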
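Technical Report 👁️
We present a preview of the DeepSeek-V4 series, comprising two strong Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro (1.6T parameters, 49B activated) and DeepSeek-V4-Flash (284B parameters, 13B activated). Both support a context length of one million tokens.
The DeepSeek-V4 series introduces several key upgrades in architecture and optimization.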
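We pre-trained both models on more than 32T diverse, high-quality tokens, followed by a comprehensive post-training pipeline. Post-training follows a two-stage paradigm: domain experts are first cultivated independently through SFT and RL with GRPO, and their expertise is then consolidated into a single unified model through on-policy distillation.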
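To make the RL step more concrete, the sketch below shows the group-relative advantage that gives GRPO its name: several responses are sampled for one prompt, and each reward is normalized against the mean and standard deviation of its own group. This follows the general GRPO formulation, not DeepSeek-V4's training code. The function name, the `eps` term, the choice of population standard deviation, and the sample rewards are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    `rewards` holds one scalar reward per response sampled for the same prompt.
    `eps` (an assumption for numerical safety) avoids division by zero when
    every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled responses to one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that beat the group average receive a positive advantage and the rest a negative one, so no separate value model is needed to form the baseline.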
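DeepSeek-V4-Pro-Max, the maximum reasoning-effort mode of DeepSeek-V4-Pro, substantially advances the knowledge capabilities of open-source models and firmly establishes itself as the best open-source model currently available. It achieves top-tier performance on coding benchmarks and significantly narrows the gap with leading closed-source models on reasoning and agentic tasks. DeepSeek-V4-Flash-Max, given a larger thinking budget, reaches reasoning performance comparable to the Pro version, but its smaller parameter count leaves it slightly behind on pure knowledge tasks and the most complex agentic workflows.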
| Model | #Total Params | #Activated Params | Context Length | Precision | Download |
|---|---|---|---|---|---|
| DeepSeek-V4-Flash-Base | 284B | 13B | 1M | FP8 Mixed | HuggingFace, ModelScope |
| DeepSeek-V4-Flash | 284B | 13B | 1M | FP4 + FP8 Mixed* | HuggingFace, ModelScope |
| DeepSeek-V4-Pro-Base | 1.6T | 49B | 1M | FP8 Mixed | HuggingFace, ModelScope |
| DeepSeek-V4-Pro | 1.6T | 49B | 1M | FP4 + FP8 Mixed* | HuggingFace, ModelScope |
*FP4 + FP8 Mixed: MoE expert parameters use FP4 precision; most other parameters use FP8.
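As a rough back-of-the-envelope reading of the precision footnote, the sketch below estimates raw weight storage when expert parameters take 4 bits each and all remaining parameters take 8 bits. The expert share of total parameters is not stated in this document, so `expert_fraction` is a purely hypothetical input. The estimate also ignores quantization scales, any tensors kept at higher precision, and runtime memory such as the KV cache.

```python
def estimate_weight_gb(total_params, expert_fraction):
    """Approximate raw weight storage in GB (1 GB = 1e9 bytes).

    Assumes expert parameters are stored in FP4 (0.5 byte each) and all
    remaining parameters in FP8 (1 byte each). `expert_fraction` is a
    hypothetical value; the source does not state the real split.
    """
    if not 0.0 <= expert_fraction <= 1.0:
        raise ValueError("expert_fraction must be between 0 and 1")
    expert_bytes = total_params * expert_fraction * 0.5
    other_bytes = total_params * (1.0 - expert_fraction) * 1.0
    return (expert_bytes + other_bytes) / 1e9


# Total parameter counts come from the model table; the 0.95 expert share
# is an illustrative guess, not a published figure.
for name, total in [("DeepSeek-V4-Flash", 284e9), ("DeepSeek-V4-Pro", 1.6e12)]:
    print(name, round(estimate_weight_gb(total, expert_fraction=0.95)), "GB")
```

Whatever the true split is, the result is bounded between half a byte and one byte per parameter, which is the main point of the exercise.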
| Benchmark (Metric) | # Shots | DeepSeek-V3.2-Base | DeepSeek-V4-Flash-Base | DeepSeek-V4-Pro-Base |
|---|---|---|---|---|
| Architecture | - | MoE | MoE | MoE |
| # Activated Params | - | 37B | 13B | 49B |
| # Total Params | - | 671B | 284B | 1.6T |
| World Knowledge | ||||
| AGIEval (EM) | 0-shot | 80.1 | 82.6 | 83.1 |
| MMLU (EM) | 5-shot | 87.8 | 88.7 | 90.1 |
| MMLU-Redux (EM) | 5-shot | 87.5 | 89.4 | 90.8 |
| MMLU-Pro (EM) | 5-shot | 65.5 | 68.3 | 73.5 |
| MMMLU (EM) | 5-shot | 87.9 | 88.8 | 90.3 |
| C-Eval (EM) | 5-shot | 90.4 | 92.1 | 93.1 |
| CMMLU (EM) | 5-shot | 88.9 | 90.4 | 90.8 |
| MultiLoKo (EM) | 5-shot | 38.7 | 42.2 | 51.1 |
| Simple-QA verified (EM) | 25-shot | 28.3 | 30.1 | 55.2 |
| SuperGPQA (EM) | 5-shot | 45.0 | 46.5 | 53.9 |
| FACTS Parametric (EM) | 25-shot | 27.1 | 33.9 | 62.6 |
| TriviaQA (EM) | 5-shot | 83.3 | 82.8 | 85.6 |
| Language & Reasoning | ||||
| BBH (EM) | 3-shot | 87.6 | 86.9 | 87.5 |
| DROP (F1) | 1-shot | 88.2 | 88.6 | 88.7 |
| HellaSwag (EM) | 0-shot | 86.4 | 85.7 | 88.0 |
| WinoGrande (EM) | 0-shot | 78.9 | 79.5 | 81.5 |
| CLUEWSC (EM) | 5-shot | 83.5 | 82.2 | 85.2 |
| Code & Math | ||||
| BigCodeBench (Pass@1) | 3-shot | 63.9 | 56.8 | 59.2 |
| HumanEval (Pass@1) | 0-shot | 62.8 | 69.5 | 76.8 |
| GSM8K (EM) | 8-shot | 91.1 | 90.8 | 92.6 |
| MATH (EM) | 4-shot | 60.5 | 57.4 | 64.5 |
| MGSM (EM) | 8-shot | 81.3 | 85.7 | 84.4 |
| CMath (EM) | 3-shot | 92.6 | 93.6 | 90.9 |
| Long Context | ||||
| LongBench-V2 (EM) | 1-shot | 40.2 | 44.7 | 51.5 |
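DeepSeek-V4-Pro and DeepSeek-V4-Flash both support three reasoning-effort modes:
| Reasoning Mode | Characteristics | Typical Use Cases | Response Format |
|---|---|---|---|
| Non-think | Fast, intuitive responses | Routine daily tasks, low-risk decisions | `</think>` summary |
| Think High | Deliberate logical analysis, slower but more accurate | Complex problem solving, planning | `<think>` thinking `</think>` summary |
| Think Max | Pushes reasoning to its limit | Exploring the boundary of the model's reasoning capability | Special system prompt + `<think>` thinking `</think>` summary |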
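The response formats in this table can be separated with a few lines of string handling. The sketch below is illustrative only: `split_reasoning` is a hypothetical helper, not the release's own parser, and it assumes the `<think>` and `</think>` tags appear as literal text in the completion.

```python
def split_reasoning(completion_text):
    """Split a completion into (reasoning, summary) around the </think> tag.

    Handles both layouts from the table: "</think> summary" (non-think) and
    "<think> thinking </think> summary" (thinking modes). If no closing tag
    is found, the whole text is treated as the summary.
    """
    open_tag, close_tag = "<think>", "</think>"
    head, sep, tail = completion_text.partition(close_tag)
    if not sep:
        return "", completion_text.strip()
    reasoning = head.strip()
    if reasoning.startswith(open_tag):
        reasoning = reasoning[len(open_tag):].strip()
    return reasoning, tail.strip()


print(split_reasoning("</think> 2"))
print(split_reasoning("<think> 1+1 is 2 </think> The answer is 2."))
```

In the non-think layout the reasoning part comes back empty, which lets calling code treat all three modes uniformly.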
| Benchmark (Metric) | Opus-4.6 Max | GPT-5.4 xHigh | Gemini-3.1-Pro High | K2.6 Thinking | GLM-5.1 Thinking | DS-V4-Pro Max |
|---|---|---|---|---|---|---|
| Knowledge & Reasoning | ||||||
| MMLU-Pro (EM) | 89.1 | 87.5 | 91.0 | 87.1 | 86.0 | 87.5 |
| SimpleQA-Verified (Pass@1) | 46.2 | 45.3 | 75.6 | 36.9 | 38.1 | 57.9 |
| Chinese-SimpleQA (Pass@1) | 76.4 | 76.8 | 85.9 | 75.9 | 75.0 | 84.4 |
| GPQA Diamond (Pass@1) | 91.3 | 93.0 | 94.3 | 90.5 | 86.2 | 90.1 |
| HLE (Pass@1) | 40.0 | 39.8 | 44.4 | 36.4 | 34.7 | 37.7 |
| LiveCodeBench (Pass@1) | 88.8 | - | 91.7 | 89.6 | - | 93.5 |
| Codeforces (Rating) | - | 3168 | 3052 | - | - | 3206 |
| HMMT 2026 Feb (Pass@1) | 96.2 | 97.7 | 94.7 | 92.7 | 89.4 | 95.2 |
| IMOAnswerBench (Pass@1) | 75.3 | 91.4 | 81.0 | 86.0 | 83.8 | 89.8 |
| Apex (Pass@1) | 34.5 | 54.1 | 60.9 | 24.0 | 11.5 | 38.3 |
| Apex Shortlist (Pass@1) | 85.9 | 78.1 | 89.1 | 75.5 | 72.4 | 90.2 |
| Long Context | ||||||
| MRCR 1M (MMR) | 92.9 | - | 76.3 | - | - | 83.5 |
| CorpusQA 1M (ACC) | 71.7 | - | 53.8 | - | - | 62.0 |
| Agentic | ||||||
| Terminal Bench 2.0 (Acc) | 65.4 | 75.1 | 68.5 | 66.7 | 63.5 | 67.9 |
| SWE Verified (Resolved) | 80.8 | - | 80.6 | 80.2 | - | 80.6 |
| SWE Pro (Resolved) | 57.3 | 57.7 | 54.2 | 58.6 | 58.4 | 55.4 |
| SWE Multilingual (Resolved) | 77.5 | - | - | 76.7 | 73.3 | 76.2 |
| BrowseComp (Pass@1) | 83.7 | 82.7 | 85.9 | 83.2 | 79.3 | 83.4 |
| HLE w/ tools (Pass@1) | 53.1 | 52.0 | 51.6 | 54.0 | 50.4 | 48.2 |
| GDPval-AA (Elo) | 1619 | 1674 | 1314 | 1482 | 1535 | 1554 |
| MCPAtlas Public (Pass@1) | 73.8 | 67.2 | 69.2 | 66.6 | 71.8 | 73.6 |
| Toolathlon (Pass@1) | 47.2 | 54.6 | 48.8 | 50.0 | 40.7 | 51.8 |
| Benchmark (Metric) | V4-Flash Non-Think | V4-Flash High | V4-Flash Max | V4-Pro Non-Think | V4-Pro High | V4-Pro Max |
|---|---|---|---|---|---|---|
| Knowledge & Reasoning | ||||||
| MMLU-Pro (EM) | 83.0 | 86.4 | 86.2 | 82.9 | 87.1 | 87.5 |
| SimpleQA-Verified (Pass@1) | 23.1 | 28.9 | 34.1 | 45.0 | 46.2 | 57.9 |
| Chinese-SimpleQA (Pass@1) | 71.5 | 73.2 | 78.9 | 75.8 | 77.7 | 84.4 |
| GPQA Diamond (Pass@1) | 71.2 | 87.4 | 88.1 | 72.9 | 89.1 | 90.1 |
| HLE (Pass@1) | 8.1 | 29.4 | 34.8 | 7.7 | 34.5 | 37.7 |
| LiveCodeBench (Pass@1) | 55.2 | 88.4 | 91.6 | 56.8 | 89.8 | 93.5 |
| Codeforces (Rating) | - | 2816 | 3052 | - | 2919 | 3206 |
| HMMT 2026 Feb (Pass@1) | 40.8 | 91.9 | 94.8 | 31.7 | 94.0 | 95.2 |
| IMOAnswerBench (Pass@1) | 41.9 | 85.1 | 88.4 | 35.3 | 88.0 | 89.8 |
| Apex (Pass@1) | 1.0 | 19.1 | 33.0 | 0.4 | 27.4 | 38.3 |
| Apex Shortlist (Pass@1) | 9.3 | 72.1 | 85.7 | 9.2 | 85.5 | 90.2 |
| Long Context | ||||||
| MRCR 1M (MMR) | 37.5 | 76.9 | 78.7 | 44.7 | 83.3 | 83.5 |
| CorpusQA 1M (ACC) | 15.5 | 59.3 | 60.5 | 35.6 | 56.5 | 62.0 |
| Agentic | ||||||
| Terminal Bench 2.0 (Acc) | 49.1 | 56.6 | 56.9 | 59.1 | 63.3 | 67.9 |
| SWE Verified (Resolved) | 73.7 | 78.6 | 79.0 | 73.6 | 79.4 | 80.6 |
| SWE Pro (Resolved) | 49.1 | 52.3 | 52.6 | 52.1 | 54.4 | 55.4 |
| SWE Multilingual (Resolved) | 69.7 | 70.2 | 73.3 | 69.8 | 74.1 | 76.2 |
| BrowseComp (Pass@1) | - | 53.5 | 73.2 | - | 80.4 | 83.4 |
| HLE w/ tools (Pass@1) | - | 40.3 | 45.1 | - | 44.7 | 48.2 |
| MCPAtlas (Pass@1) | 64.0 | 67.4 | 69.0 | 69.4 | 74.2 | 73.6 |
| GDPval-AA (Elo) | - | - | 1395 | - | - | 1554 |
| Toolathlon (Pass@1) | 40.7 | 43.5 | 47.8 | 46.3 | 49.0 | 51.8 |
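The claim that Flash at maximum effort tracks Pro on reasoning but trails on knowledge and agentic work can be checked against this table. The short script below copies a handful of V4-Flash Max and V4-Pro Max scores from it and prints the gap for each benchmark. The selection of benchmarks is an arbitrary illustration, not an official summary.

```python
# (V4-Flash Max, V4-Pro Max) scores copied from the mode comparison table.
scores = {
    "SimpleQA-Verified": (34.1, 57.9),
    "Chinese-SimpleQA": (78.9, 84.4),
    "GPQA Diamond": (88.1, 90.1),
    "LiveCodeBench": (91.6, 93.5),
    "HMMT 2026 Feb": (94.8, 95.2),
    "Terminal Bench 2.0": (56.9, 67.9),
}

# Pro Max minus Flash Max, in points, rounded to one decimal place.
gaps = {name: round(pro - flash, 1) for name, (flash, pro) in scores.items()}

# Print the largest gaps first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {gap:+.1f}")
```

The largest gaps fall on the factual-recall and terminal-agent rows, while the math and coding rows differ by about two points or less.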
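This release does not include a Jinja-format chat template. Instead, we provide a dedicated encoding folder containing Python scripts and test cases that show how to encode messages in OpenAI-compatible format into the model's input string and how to parse the model's text output. See the encoding folder for full documentation.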
A brief example:

    import transformers

    from encoding_dsv4 import encode_messages, parse_message_from_completion_text

    # Conversation in OpenAI-compatible message format
    messages = [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "Hello! I am DeepSeek.", "reasoning_content": "thinking..."},
        {"role": "user", "content": "1+1=?"},
    ]

    # Render the messages into the model's input string
    prompt = encode_messages(messages, thinking_mode="thinking")

    # Tokenize the prompt with the model's tokenizer
    tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V4-Pro")
    tokens = tokenizer.encode(prompt)
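For detailed instructions on running DeepSeek-V4 locally, including model weight conversion and an interactive chat demo, see the inference folder.
For local deployment, we recommend setting the sampling parameters to temperature = 1.0 and top_p = 1.0. For the Think Max reasoning mode, we recommend a context window of at least 384K tokens.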
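One way to keep these recommendations from drifting in a deployment is to encode them as a small configuration check. The sketch below is an assumption-laden illustration: `LocalRunConfig` and `check_config` are hypothetical names, not part of any DeepSeek tooling, and "384K" is interpreted as 384 × 1024 tokens.

```python
from dataclasses import dataclass

# "384K" is read here as 384 * 1024 tokens; this interpretation is an assumption.
THINK_MAX_MIN_CONTEXT = 384 * 1024


@dataclass
class LocalRunConfig:
    """Hypothetical container for local inference settings (not a real API)."""
    temperature: float = 1.0   # recommended sampling temperature
    top_p: float = 1.0         # recommended nucleus sampling threshold
    context_window: int = THINK_MAX_MIN_CONTEXT
    think_max: bool = False


def check_config(cfg):
    """Return a list of warnings where cfg departs from the recommendations."""
    warnings = []
    if cfg.temperature != 1.0 or cfg.top_p != 1.0:
        warnings.append("sampling differs from temperature=1.0, top_p=1.0")
    if cfg.think_max and cfg.context_window < THINK_MAX_MIN_CONTEXT:
        warnings.append("Think Max is recommended with at least 384K context")
    return warnings


print(check_config(LocalRunConfig(think_max=True, context_window=128 * 1024)))
```

Returning warnings instead of raising keeps the check advisory, which matches the wording of the recommendation.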
This repository and the model weights are licensed under the MIT License.
@misc{deepseekai2026deepseekv4,
title={DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence},
author={DeepSeek-AI},
year={2026},
}
If you have any questions, please open an issue or contact us at service@deepseek.com.