Congrats to @vllm_project & @lmsysorg for releasing MiniMax M3 428B on both the CUDA & ROCm stacks on day 0! MiniMax M3 includes:
🟠 Block sparse attention, with 9x faster prefill than M2.7
🟠 Day-0 open MXFP8 weights
🟠 Additionally, @Inferact released Day-0 support for an open-weight EAGLE3 draft model
Excited to try out MiniMax M3's performance!