AutoResearchClaw: A Self-Reinforcing Autonomous Research System with Human-AI Collaboration
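Source: arxiv.org
AutoResearchClaw is a multi-agent autonomous research system designed to overcome the linear-pipeline limitations of existing systems. It is built on five mechanisms: structured multi-agent debate for hypothesis generation and analysis; a self-healing executor that turns failures into information; verifiable result reporting that prevents fabricated data and hallucinated citations; seven human-in-the-loop collaboration modes ranging from full autonomy to step-by-step oversight; and cross-run evolution that converts past experience into future safeguards. Experiments show that the system significantly outperforms baseline models, and that precise, targeted human-in-the-loop collaboration consistently outperforms both full autonomy and exhaustive oversight. It is positioned as a research amplifier that augments rather than replaces human scientific judgment.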
Automating scientific discovery requires more than generating papers from ideas. Real research is iterative: hypotheses are challenged from multiple perspectives, experiments fail and inform the next attempt, and lessons accumulate across cycles. Existing autonomous research systems often model this process as a linear pipeline: they rely on single-agent reasoning, stop when execution fails, and do not carry experience across runs. We present AutoResearchClaw, a multi-agent autonomous research pipeline built on five mechanisms: structured multi-agent debate for hypothesis generation and result analysis, a self-healing executor with a Pivot/Refine decision loop that transforms failures into information, verifiable result reporting that prevents fabricated numbers and hallucinated citations, human-in-the-loop collaboration with seven intervention modes spanning full autonomy to step-by-step oversight, and cross-run evolution that converts past mistakes into future safeguards. On ARC-Bench, a 25-topic experiment-stage benchmark, AutoResearchClaw outperforms AI Scientist v2 by 54.7%. A human-in-the-loop ablation across seven intervention modes reveals that precise, targeted collaboration at high-leverage decision points consistently outperforms both full autonomy and exhaustive step-by-step oversight. We position AutoResearchClaw as a research amplifier that augments rather than replaces human scientific judgment. Code is available at https://github.com/aiming-lab/AutoResearchClaw.
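The abstract does not specify how the Pivot/Refine decision loop is implemented, so the following is only a minimal sketch of one way such a loop could work, not the AutoResearchClaw code. Every name here (`run_with_self_healing`, `execute`, `max_refines`, `RunLog`) is hypothetical, and the fixed retry budget is an assumed decision rule: a failed run is first refined under the same hypothesis, the loop pivots to the next hypothesis once the refine budget is spent, and every failure is recorded as a lesson.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    hypothesis: str
    revision: int
    error: Optional[str]  # None means the run succeeded


@dataclass
class RunLog:
    attempts: List[Attempt] = field(default_factory=list)
    lessons: List[str] = field(default_factory=list)


def run_with_self_healing(
    hypotheses: List[str],
    execute: Callable[[str, int], Optional[str]],
    max_refines: int = 2,
) -> RunLog:
    """Try each hypothesis in order; refine on failure, pivot when refines run out.

    `execute(hypothesis, revision)` is a hypothetical callback that returns
    None on success or an error message on failure.
    """
    log = RunLog()
    for hypothesis in hypotheses:
        # Revision 0 is the first try; revisions 1..max_refines are refinements.
        for revision in range(max_refines + 1):
            error = execute(hypothesis, revision)
            log.attempts.append(Attempt(hypothesis, revision, error))
            if error is None:
                return log  # success: stop the loop
            # Keep the failure as information for later attempts or runs.
            log.lessons.append(f"{hypothesis} (rev {revision}): {error}")
        # Refine budget exhausted for this hypothesis: pivot to the next one.
    return log


def fake_execute(hypothesis: str, revision: int) -> Optional[str]:
    """Stand-in executor: hypothesis "A" always fails, "B" succeeds once refined."""
    if hypothesis == "A":
        return "out of memory"
    return None if revision >= 1 else "shape mismatch"


demo_log = run_with_self_healing(["A", "B"], fake_execute, max_refines=1)
for attempt in demo_log.attempts:
    print(attempt)
```

In this toy run, hypothesis "A" fails twice and is abandoned, and "B" succeeds on its first refinement. The `lessons` list stands in for the kind of record that cross-run evolution might consume. A real system would presumably choose between refining and pivoting based on the failure itself and not on a fixed budget.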