Rohan Paul@rohanpaul_ai

2026-06-30 07:19 · 3 days ago

AI Summary

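Google's new paper introduces the concept of "verification debt": AI speeds up paper production, but human checking becomes the bottleneck. In response, it proposes agentic verification and presents a prototype system, Paper Assistant Tool. The system splits a paper into parts, checks the difficult sections in depth, and aggregates the review comments, focusing on objective errors such as proof errors, experimental gaps, and missing comparisons rather than issuing accept/reject decisions. In tests on known errors in mathematics and computer science papers, the tool found more proof errors than a single model call; in author-facing pilots at STOC and ICML, many authors used it to fix serious theoretical flaws or add experiments. The paper argues that scientific review may need its own AI stack to keep up with increasingly automated paper generation.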

Big new paper from Google on external agentic verification for science.

Science now needs AI review agents because AI is making papers faster than humans can check them.

The problem is that AI can help produce more research, but the slow part is still checking whether the work is actually correct.

The paper frames this as verification debt, where every faster research workflow creates more claims, proofs, experiments, and comparisons that someone still has to inspect.
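To make the "debt" framing concrete, here is a toy backlog model. It is only an illustration, not something taken from the paper: the function name and the weekly rates are made-up assumptions, and the model simply treats every claim produced but not yet checked as outstanding debt.

```python
def verification_debt(produced_per_week, checked_per_week, weeks):
    """Toy model (hypothetical): backlog of unchecked claims after each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        # New claims arrive, reviewers clear what they can; debt never goes below zero.
        backlog = max(0, backlog + produced_per_week - checked_per_week)
        history.append(backlog)
    return history


# Checking keeps pace with production: no debt accumulates.
print(verification_debt(10, 10, 4))  # [0, 0, 0, 0]

# Production speeds up while checking capacity stays fixed: debt grows every week.
print(verification_debt(25, 10, 4))  # [15, 30, 45, 60]
```

Under these assumed numbers the backlog grows linearly as soon as production outpaces checking, which is the dynamic the post describes.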

Its main proposal is agentic verification, where AI agents help review papers by splitting them into parts, checking difficult sections deeply, and combining the findings into a review.
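The split, check, and combine flow can be sketched roughly as below. This is a minimal sketch under heavy assumptions, not how Paper Assistant Tool is implemented: every name here (`Finding`, `split_into_sections`, `is_difficult`, `check_section`, `review`) is hypothetical, the keyword heuristic for "difficult" sections is invented, and the checker is a stub standing in for what would be a model call in a real system.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    section: str
    issue: str


def split_into_sections(paper_text):
    """Split on blank lines; the first line of each chunk is treated as its title."""
    sections = []
    for chunk in paper_text.strip().split("\n\n"):
        title, _, body = chunk.partition("\n")
        sections.append((title.strip(), body.strip()))
    return sections


def is_difficult(title):
    # Hypothetical heuristic: proofs and experiments get the deeper pass.
    return any(k in title.lower() for k in ("proof", "theorem", "experiment"))


def check_section(title, body, deep):
    """Stub checker (assumption): a real agent would call a model here."""
    findings = []
    if not body:
        findings.append(Finding(title, "section is empty"))
    if deep and "todo" in body.lower():
        findings.append(Finding(title, "unfinished step marked TODO"))
    return findings


def review(paper_text):
    """Combine per-section findings into one list; no accept/reject verdict."""
    findings = []
    for title, body in split_into_sections(paper_text):
        findings.extend(check_section(title, body, deep=is_difficult(title)))
    return findings


paper = (
    "Introduction\nWe study X.\n\n"
    "Proof of Theorem 1\nStep 1 holds. TODO: justify step 2.\n\n"
    "Experiments\n"
)
for finding in review(paper):
    print(f"[{finding.section}] {finding.issue}")
```

The sketch returns a list of objective findings and deliberately stops short of a verdict, matching the post's point that the tool targets checkable errors rather than final decisions.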

Google's Paper Assistant Tool is the example system, and it focuses on objective checks like proof errors, experimental gaps, missing comparisons, and unclear claims rather than final accept or reject decisions.

The authors tested it on known math and computer science paper errors and in author-facing pilots at STOC and ICML, where authors used it before submission.

The striking result is that Paper Assistant Tool found far more known proof errors than a single model call, and many authors said it led them to fix serious theory gaps or run new experiments.

The big deal is that scientific review may need its own AI stack, with review agents, clear roles, and human oversight, because paper generation is becoming partly automated too.

----

Link - arxiv.org/abs/2606.28277

Title: "Towards Automating Scientific Review with Google's Paper Assistant Tool"

Rohan Paul@rohanpaul_ai · X
