SemiAnalysis @SemiAnalysis_

2026-06-13 08:00 · 20 days ago

Congrats to @vllm_project & @lmsysorg for releasing MiniMax M3 428B on both the CUDA & ROCm stacks on day 0! MiniMax M3 includes:

🟠 Block sparse attention, which gives 9x faster prefill than M2.7
🟠 Day-0 open MXFP8 weights
🟠 Furthermore, @Inferact released day-0 EAGLE3 open-weight draft model support

Excited to try out the performance of MiniMax M3!

Open-source ecosystem · Inference · Model release · Deployment/Engineering
