Ethan Mollick (@emollick)

2026-05-30 22:55 · 33 days ago

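AI summary

Using its composite metric, the Epoch Capabilities Index, Epoch AI finds that the capability gap between open-source and closed-source models averages about three months. The author of the main post is skeptical, however: he argues that open-source large language models are more fragile in practice (especially on out-of-distribution tasks) than their evaluation scores suggest, and that the real, felt gap may be far larger than three or four months.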

I think Epoch does a great job benchmarking, but I continue to believe that open weights models are much more fragile, especially out-of-distribution, than their benchmarks indicate. Vibe-wise, I don't think they were only 3 months behind last year or only 4 months behind today.

Epoch AI: We measure the gap using the Epoch Capabilities Index, our aggregate measure of model capability. Compared to our last analysis, the gap has widened slightly - ...

Tags: Expert opinion · Open-source ecosystem · Evaluation/benchmarks
