# AI Evaluation Challenge: Hard Problems Are Mostly Math; Diverse Hard-Problem Repositories Are Urgently Needed

- Source: Ethan Mollick (@emollick)
- Published: 2026-05-26 04:44
- AIHOT score: 56
- AIHOT link: https://aihot.virxact.com/items/cmploq0t00hn5sl01pb2448wp
- Original link: https://x.com/emollick/status/2059012803009151444

## AI Summary

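The tweet argues that the hard problems currently used to push AI capabilities forward are too concentrated in mathematics (for example, Erdős problems). Mathematics is easy to verify, but the direct impact of its results on everyday life is less clear. The author calls for hard-problem repositories covering more fields, including engineering, economics, physics, and biology, together with matching evaluation methods, so that AI agents can take on more complex tasks whose answers are less well specified.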

## Full Text

It's very limiting that a big set of very hard problems that we have just lying around are Erdős problems. Don't get me wrong, they are quite cool, but we really need hard-problem repositories for many fields, including areas that have less specified answers & require judges.

Yes, math is the easiest field in which to do verified work, but it is also an area where direct implications of increasing AI ability on everyday life are less clear. We need more types of problems (complex engineering problems, large data sets in economics, physics, biology) for people to turn AI loose on, including specifications of how to evaluate them.