MiniMax M3 Open-Source Model Released: 1M-Token Context and MSA Sparse Attention

MiniMax (official)@MiniMax_AI

2026-06-13 10:33 · 8 days ago

AI Summary

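MiniMax has released M3, a new open-source model with frontier coding and agentic capabilities, native image and video input, computer use, and a 1M-token context window. At its core is the MSA sparse attention architecture: each query scores KV blocks of 128 tokens and attends only to the top-scoring blocks, which makes ultra-long context practical to deploy. M3 has day-0 support in vLLM and has been verified on NVIDIA and AMD hardware. That support covers dedicated MSA prefill/decode kernels, 1M-token context serving (prefix caching + chunked prefill), BF16/MXFP8 checkpoints (with MoE backends for Hopper and Blackwell), native multimodal input, and tool calling, reasoning parsing, and thinking-mode control.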
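The block-scoring idea in the summary can be illustrated with a small sketch. This is not MiniMax's MSA implementation or the vLLM kernel: the summary only says that each query scores 128-token KV blocks and attends to the top ones. How a block is scored is not stated, so the mean-key dot product below is an assumption, and `block_sparse_attention` and `top_k` are hypothetical names chosen for illustration.

```python
import math

BLOCK_SIZE = 128  # block length stated in the summary


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(q, keys, values, top_k=2, block_size=BLOCK_SIZE):
    """Single-query attention restricted to the top_k highest-scoring KV blocks.

    ASSUMPTION: a block's score is the dot product of q with the block's
    mean key. The real MSA scoring rule is not given in the source.
    """
    n, dim = len(keys), len(q)

    # 1. Split KV positions into contiguous blocks of block_size tokens.
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # 2. Give each block one score (assumed: q . mean(keys in block)).
    scores = []
    for blk in blocks:
        mean_key = [sum(keys[i][d] for i in blk) / len(blk) for d in range(dim)]
        scores.append(dot(q, mean_key))

    # 3. Keep only the top_k blocks; every other block is skipped entirely.
    ranked = sorted(range(len(blocks)), key=lambda b: scores[b], reverse=True)
    chosen = sorted(ranked[:top_k])
    positions = [i for b in chosen for i in blocks[b]]

    # 4. Ordinary scaled softmax attention over the selected tokens only.
    scale = 1.0 / math.sqrt(dim)
    logits = [dot(q, keys[i]) * scale for i in positions]
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    out = [
        sum(w * values[i][d] for w, i in zip(weights, positions)) / total
        for d in range(len(values[0]))
    ]
    return out, chosen
```

With 128-token blocks, a 1,000,000-token context splits into about 7,800 blocks, so under this scheme a query ranks a few thousand block scores and then runs full attention over only `top_k * 128` tokens instead of all of them.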

the kernels are doing the lord's work today, day-0 on @vllm_project, verified on nvidia and amd.

go read the writeup 👇

vLLM🎉 Congrats to @MiniMax_AI on releasing MiniMax M3! Frontier coding and agentic capabilities, native image and video input, computer use, and a 1M-token context...

Multimodal · Open-source ecosystem · Inference · Model release

