Foresight: Failure Detection for Long-Horizon Robotic Manipulation Using Latent Representations from an Action-Conditioned World Model
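Source: arxiv.org
Foresight is a failure detection framework that monitors manipulation trajectories using latent representations from an action-conditioned world model, and it is trained using only final task-level success/failure labels. It uses predictive world-model embeddings to provide unified failure detection across different policies, and it calibrates detection thresholds adaptively with functional conformal prediction (FCP). It is validated in simulation on LIBERO-Long, ManiSkill-Long, and BEHAVIOR-1K, and on real robots (three tasks on a ReactorX-200 arm and one task on a Franka arm). The results suggest that these embeddings provide a scalable representation for reliable failure monitoring in long-horizon manipulation.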
Long-horizon tasks are common in real-world robotic deployments, yet failure detection for such tasks remains underexplored. Detecting failures in long-horizon robotic tasks is particularly challenging because failure onset is often ambiguous and dense temporal annotations are typically unavailable. We present Foresight, a failure detection framework that monitors manipulation trajectories using latent representations from an action-conditioned world model. Foresight is trained using only final task-level success or failure labels. By leveraging predictive world-model embeddings, our method provides a unified framework for failure detection across different policies. We further use functional conformal prediction (FCP) to calibrate detection thresholds adaptively. We evaluate Foresight with state-of-the-art vision-language-action policies in simulation on LIBERO-Long, ManiSkill-Long, and BEHAVIOR-1K, compare it against state-of-the-art failure detection methods, and validate it on real robots with three long-horizon tasks on a ReactorX-200 arm and one task on a Franka arm. Our results suggest that action-conditioned world-model embeddings provide a scalable representation for reliable failure monitoring in long-horizon manipulation.
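The abstract does not spell out how the FCP thresholds are built, so the following is only a minimal sketch of one common way to calibrate a time-varying threshold with split conformal prediction over whole score curves. It assumes a scalar failure score per time step has already been computed from the world-model embeddings for a set of successful rollouts; the function names `calibrate_band` and `detect_failure`, the mean-plus-constant band shape, and the maximum-deviation nonconformity measure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def calibrate_band(fit_scores, cal_scores, alpha=0.1):
    """Return a per-time-step upper threshold of shape (T,).

    fit_scores, cal_scores: arrays of shape (n, T) holding failure scores
    from successful rollouts (hypothetical inputs). fit_scores defines the
    reference curve; cal_scores is the held-out conformal calibration split.
    """
    fit_scores = np.asarray(fit_scores, dtype=float)
    cal_scores = np.asarray(cal_scores, dtype=float)
    mean_curve = fit_scores.mean(axis=0)

    # Nonconformity of a whole curve: its largest upward deviation
    # from the reference curve over all time steps.
    deviations = (cal_scores - mean_curve).max(axis=1)

    # Finite-sample conformal rank: the ceil((n + 1) * (1 - alpha))-th
    # smallest calibration deviation.
    n = deviations.shape[0]
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:
        # Too few calibration curves for this alpha: never raise an alarm.
        return np.full_like(mean_curve, np.inf)
    radius = np.sort(deviations)[rank - 1]
    return mean_curve + radius


def detect_failure(score_traj, upper):
    """Return the first time step whose score exceeds the band, else None."""
    score_traj = np.asarray(score_traj, dtype=float)
    exceeded = score_traj > upper[: len(score_traj)]
    return int(np.argmax(exceeded)) if exceeded.any() else None
```

Under the usual exchangeability assumption, a new successful rollout stays below the band at every step with probability at least 1 - alpha, which bounds the false-alarm rate per trajectory rather than per time step. Only success rollouts are needed for calibration in this sketch, which is compatible with having task-level labels and no dense temporal annotations.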