迈向虚拟细胞的自主机制推理
阅读原文· arxiv.org研究团队提出VCR-Agent多智能体框架,将生物推理形式化为机制动作图以实现可验证的自主推理。该框架整合生物学知识检索与验证过滤机制,并基于Tahoe-100M图谱发布VC-TRACES数据集,提供经过验证的机制解释。实验表明,利用该数据集训练可显著提升事实准确性,并为基因表达预测任务提供更有效的监督信号。
Large language models (LLMs) have recently gained significant attention as a promising approach to accelerate scientific discovery. However, their application in open-ended scientific domains such as biology remains limited, primarily due to the lack of factually grounded and actionable explanations. To address this, we introduce a structured explanation formalism for virtual cells that represents biological reasoning as mechanistic action graphs, enabling systematic verification and falsification. Building upon this, we propose VCR-Agent, a multi-agent framework that integrates biologically grounded knowledge retrieval with a verifier-based filtering approach to generate and validate mechanistic reasoning autonomously. Using this framework, we release VC-TRACES dataset, which consists of verified mechanistic explanations derived from the Tahoe-100M atlas. Empirically, we demonstrate that training with these explanations improves factual precision and provides a more effective supervision signal for downstream gene expression prediction. These results underscore the importance of reliable mechanistic reasoning for virtual cells, achieved through the synergy of multi-agent and rigorous verification.