Kairos: A Native World Model Stack for Physical AI
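Kairos is a native world model stack for Physical AI. It is pre-trained natively with a cross-embodiment data curriculum that combines open-world videos, human behavioral data, and robot interactions. Its unified architecture uses Hybrid Linear Temporal Attention: sliding-window attention captures local dynamics, dilated sliding windows capture mid-range dependencies, and gated linear attention maintains persistent global memory, with theoretical guarantees that state-propagation error over long horizons stays bounded. Through deployment-aware system co-design, it achieves low-latency observation-action-feedback loops on server and consumer-grade hardware. On embodied world-model, long-horizon, and action-policy benchmarks, Kairos reaches top-tier performance and shows a strong efficiency-capability trade-off.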
World models are transitioning from passive visual generators to foundational, operational infrastructure for Physical AI: they must natively acquire world knowledge from heterogeneous experience, maintain persistent states over long horizons, and execute efficiently within real deployment constraints. We introduce Kairos, a native world model stack designed around these requirements. (1) Kairos learns the world through a Native Pre-training Paradigm governed by a Cross-Embodiment Data Curriculum, which organizes open-world videos, human behavioral data, and robot interactions into a progressive developmental pathway. (2) Kairos maintains the world by unifying world understanding, generation, and prediction within a Native Unified Architecture equipped with Hybrid Linear Temporal Attention, where sliding-window attention captures local dynamics, dilated sliding windows capture mid-range dependencies, and gated linear attention maintains persistent global memory. We establish formal theoretical bounds showing that this temporal factorization limits error accumulation, so that state-propagation error stays controlled across extended horizons. (3) Kairos runs the world through a Deployment-Aware System Co-Design that supports low-latency rollout generation on server and consumer-grade hardware for real-world observation-action-feedback loops. Experiments on embodied world-model, long-horizon, and action-policy benchmarks show that Kairos achieves top-tier performance while offering a strong efficiency-capability trade-off. Together, these results position Kairos as a cohesive operational foundation for future self-evolving physical intelligence.
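To make the three temporal ranges in Hybrid Linear Temporal Attention concrete, here is a minimal NumPy sketch. It is not the Kairos implementation: the abstract does not specify window sizes, dilation rates, the gating form, or how the branches are combined. The function names, the scalar per-step gate `g`, and the plain averaging of the three outputs are all assumptions made for illustration.

```python
import numpy as np

def sliding_window_mask(T, window):
    """Causal local mask: step i attends to step j when 0 <= i - j < window."""
    d = np.arange(T)[:, None] - np.arange(T)[None, :]
    return (d >= 0) & (d < window)

def dilated_window_mask(T, window, dilation):
    """Causal dilated mask: step i attends to every `dilation`-th past step,
    at most `window` of them (including itself)."""
    d = np.arange(T)[:, None] - np.arange(T)[None, :]
    return (d >= 0) & (d % dilation == 0) & (d // dilation < window)

def masked_softmax_attention(q, k, v, mask):
    """Standard scaled dot-product attention restricted to `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # diagonal is always unmasked
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def gated_linear_attention(q, k, v, g):
    """Recurrent linear attention with a decay gate (assumed scalar per step):
    S_t = g_t * S_{t-1} + k_t v_t^T,  o_t = q_t S_t.
    The fixed-size state S acts as a persistent global memory."""
    T, d = q.shape
    S = np.zeros((d, v.shape[-1]))
    out = np.empty((T, v.shape[-1]))
    for t in range(T):
        S = g[t] * S + np.outer(k[t], v[t])
        out[t] = q[t] @ S
    return out

def hybrid_temporal_attention(q, k, v, g, window=4, dilation=4):
    """Hypothetical combination: average the local, mid-range, and global branches."""
    T = q.shape[0]
    local = masked_softmax_attention(q, k, v, sliding_window_mask(T, window))
    mid = masked_softmax_attention(q, k, v, dilated_window_mask(T, window, dilation))
    glob = gated_linear_attention(q, k, v, g)
    return (local + mid + glob) / 3.0

rng = np.random.default_rng(0)
T, d = 16, 8
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
g = np.full(T, 0.9)  # constant decay gate, chosen arbitrarily
out = hybrid_temporal_attention(q, k, v, g)
print(out.shape)
```

In this sketch the two windowed branches cost O(T · window) attention entries each, and the gated branch carries a state whose size does not depend on T, which is the property that makes long rollouts affordable. With `g` fixed at 1 the gated branch reduces to ordinary causal (unnormalized) linear attention.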