# MiniMax launches multimodal model M3 with 1M context, leading on multiple benchmarks

- Source: Artificial Analysis (@ArtificialAnlys)
- Published: 2026-06-09 03:25
- AIHOT score: 59
- AIHOT link: https://aihot.virxact.com/items/cmq5mg47800cbsl5ilu6v4krh
- Original link: https://x.com/ArtificialAnlys/status/2064066303863005254

## AI Summary

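MiniMax has launched M3, the first multimodal model in its M series, supporting image/video input and a 1M-token context window. It scores 55 on the Artificial Analysis Intelligence Index, ahead of the open weights models Kimi K2.6 and MiMo-V2.5-Pro (both 54). Compared with its predecessor M2.7, HLE rises 9 points to 37% and GPQA Diamond rises 6 points to 93%, with gains on several other benchmarks. On native multimodality, it scores about 80% on MMMU-Pro, level with GPT-5.5. Pricing is $0.30/$1.20 per 1M tokens (up to 512K context), doubling for 512K to 1M. The weights are planned for open release within about 10 days.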

## Body

MiniMax-M3 scores 55 on the Artificial Analysis Intelligence Index. Once the weights are released, it will be the leading open weights model.

M3 is @MiniMax_AI's first multimodal M-series model, adding image and video input and a 1M token context window over the text-only MiniMax-M2.7 (50). At 55 on the Intelligence Index, it sits just ahead of open weights peers Kimi K2.6 (54) and MiMo-V2.5-Pro (54). MiniMax has noted that it plans to release the weights within ~10 days. When MiniMax released the weights for M2.7, it did so under a commercially restricted license.

Key takeaways:
➤ MiniMax-M3 improves on MiniMax-M2.7 across most evaluations. HLE +9 points (28% to 37%), GPQA Diamond +6 (87% to 93%), AA-LCR +5 (69% to 74%), IFBench +7 (76% to 83%), and CritPt +3 (1% to 4%), with a small regression on SciCode (47% to 45%)
➤ M3 scores ~1670 on GDPval-AA, behind Claude Opus 4.8 (max, 1890) and GPT-5.5 (xhigh, 1769), and level with Claude Sonnet 4.6 (max, 1676). GDPval-AA measures real-world tasks across 44 occupations and 9 industries
➤ Native multimodality, scoring ~80% on MMMU-Pro. Level with GPT-5.5 (xhigh, 79.9%) and Kimi K2.6 (79.4%), behind Gemini 3.5 Flash (high, 84.3%). Not all open weights models support native vision input
➤ On AA-Omniscience, heavy abstention drives both low hallucination and low accuracy. M3 attempts only 30.9% of questions, the lowest among current peers, yielding a low hallucination rate (16.1%) and low accuracy (15.0%)
➤ MiniMax-M3's token usage is close to M2.7's, using ~91M output tokens to run the Intelligence Index (~81M reasoning) versus ~87M (~79M reasoning), while scoring 5 points higher
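
As a quick arithmetic check on the first takeaway, the point deltas can be recomputed from the before/after scores quoted in the post. This is only a sketch: the names `REPORTED_SCORES` and `point_deltas` are illustrative, and the numbers are copied from the post rather than taken from any Artificial Analysis data feed.

```python
# Scores in percent as quoted in the post: (MiniMax-M2.7, MiniMax-M3).
# The dictionary and function names are illustrative, not from the source.
REPORTED_SCORES = {
    "HLE": (28, 37),
    "GPQA Diamond": (87, 93),
    "AA-LCR": (69, 74),
    "IFBench": (76, 83),
    "CritPt": (1, 4),
    "SciCode": (47, 45),
}


def point_deltas(scores):
    """Return {benchmark: M3 score minus M2.7 score} in percentage points."""
    return {name: new - old for name, (old, new) in scores.items()}


if __name__ == "__main__":
    for name, delta in point_deltas(REPORTED_SCORES).items():
        print(f"{name}: {delta:+d} points")
```

The computed deltas match the figures stated in the post, including the 2-point regression on SciCode.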

Key model details:
➤ Context window: 1M tokens, up from MiniMax-M2.7's 200K
➤ Pricing: $0.30/$1.20 per 1M input/output tokens up to 512K context, rising to $0.60/$2.40 for 512K to 1M context
➤ Weights: Not yet released. MiniMax has stated the weights will follow
➤ Availability: MiniMax first-party API, @SiliconFlowAI, @gmi_cloud, and @novita_labs
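
To make the tiered pricing concrete, here is a minimal cost estimator built on the listed rates. It rests on several assumptions the post does not confirm: that the tier is chosen by the request's input token count, that the higher rate then applies to the whole request, and that "512K" and "1M" mean 512,000 and 1,000,000 tokens. The function and constant names are hypothetical and are not part of any MiniMax API.

```python
# Listed prices in USD per 1M tokens, as quoted in the post.
BASE_RATES = (0.30, 1.20)   # (input, output) up to 512K context
LONG_RATES = (0.60, 2.40)   # (input, output) for 512K to 1M context

# Assumed token counts for "512K" and "1M"; the post does not define them.
TIER_LIMIT = 512_000
MAX_CONTEXT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption (not from the post): the price tier is selected by the
    input token count and applies to every token in the request.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_CONTEXT:
        raise ValueError("input exceeds the assumed 1M-token context window")
    in_rate, out_rate = BASE_RATES if input_tokens <= TIER_LIMIT else LONG_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_cost(100_000, 10_000):.3f}")  # short-context request
    print(f"${estimate_cost(600_000, 10_000):.3f}")  # long-context request
```

Under these assumptions, a 100,000-token prompt with a 10,000-token reply comes to about $0.042, while a 600,000-token prompt with the same reply comes to about $0.384. The provider's actual billing rules may differ.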
