Chubby♨️@kimmonismus

2026-05-25 06:17·39天前

AI 摘要

Google DeepMind的AlphaProof Nexus系统自主解决了9个开放的Erdős问题（部分问题存在56年），每个问题的成本约几百美元。它还证明了44个OEIS猜想，解决了一个15年的代数几何问题，并在优化理论中发现了新算法参数。其核心机制是将大语言模型的推理能力与Lean形式化验证系统结合，Lean自动检查每一步逻辑，无需人工复核。研究发现，一个仅交替使用大语言模型生成与编译器反馈的基础智能体，便能复现全部9个Erdős问题的成功。该系统还能检测并修正现有数学文献中的表述错误。其局限在于成功案例集中于Lean数学库成熟的领域（如组合、数论），仍无法解决需要全新理论的大问题。

Google DeepMind's AlphaProof Nexus autonomously solved 9 open Erdős problems， some unsolved for 56 years， at a cost of a few hundred dollars per problem.

It also proved 44 open OEIS conjectures， resolved a 15-year-old question in algebraic geometry， and discovered a novel algorithmic parameter in optimization theory that humans hadn't found.

The core mechanism combines LLM reasoning （Gemini 3.1 Pro hype？！） with Lean formal verification. The AI generates proof attempts， Lean's compiler checks every logical step automatically. No human review needed to confirm correctness.

The most surprising finding： a basic agent that simply alternates LLM generation with compiler feedback replicated all 9 Erdős successes. The full-featured system with evolutionary search and reinforcement learning only provided meaningful advantages on the hardest problems.

This shows a more recent broader trend： as foundation models improve， simple agentic loops are catching up to complex specialized architectures . What sets this apart from OpenAI's informal proof approach： formal verification acts as an automatic filter. The failure analysis showed the AI frequently hallucinated lemmas it claimed were established results， and often disguised the core difficulty by rephrasing it as a helper lemma. Informal proofs would let these errors pass. Lean catches them immediately.

The agent also detected misformalizations in existing mathematical literature， correcting ambiguities in problem statements before solving the corrected versions. It served as both a solver and a diagnostic tool.

Current limitations are real. Successes cluster in combinatorics， number theory， and optimization where Lean's math library is mature. Problems requiring substantial new theory remain out of reach. Most Erdős problems still weren't solved tho.

Chubby♨️@kimmonismus · X

71导出 Markdown

2026-05-25 06:17·39天前

在 X 看原推· x.com

AI 摘要

Google DeepMind's AlphaProof Nexus autonomously solved 9 open Erdős problems， some unsolved for 56 years， at a cost of a few hundred dollars per problem.

It also proved 44 open OEIS conjectures， resolved a 15-year-old question in algebraic geometry， and discovered a novel algorithmic parameter in optimization theory that humans hadn't found.