# AI Solves 7 of 10 Hard Problems in Math Test, Yet Is Still Said to Fall Short

- Source: Ethan Mollick (@emollick)
- Published: 2026-06-15 23:28
- AIHOT score: 53
- AIHOT link: https://aihot.virxact.com/items/cmqfdtmxc00ihsl2aw78ly36h
- Original link: https://x.com/emollick/status/2066543254644928885

## AI Summary

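Odd headline: I am not sure that solving 7 of 10 extremely hard, novel problems means AI "did not live up to the task," given that 15 months ago LLMs could not do math.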

But the actual study is interesting and reveals both the flaws and the successes of AI in math. https://1stproof.org/assets/docs/report.pdf

[Quoting @Nature]: Artificial intelligence has undergone its most scrupulous maths test yet, and it did not live up to the task

https://go.nature.com/4oqlNk6

## Full Text

Weird headline - I am not sure solving 7 out of 10 novel very hard problems meant AI "did not live up to the task," when 15 months ago LLMs couldn't do math.

But the actual study is interesting and illuminates flaws & successes of AIs in math. https://1stproof.org/assets/docs/report.pdf

### Quoted Tweet

> nature: Artificial intelligence has undergone its most scrupulous maths test yet, and it did not live up to the task https://go.nature.com/4oqlNk6
