# AnchorWorld: Embodied Egocentric World Simulation with View-Evolution Customization

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-05 08:00
- AIHOT score: 64
- AIHOT link: https://aihot.virxact.com/items/cmq4slb4n01brslt29ngrj82m
- Original link: https://arxiv.org/abs/2606.07326

## AI Summary

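AnchorWorld proposes an embodied egocentric world simulation framework that improves controllability in practical scenarios through enhanced interaction integrity and a flexible world customization mechanism. The framework uses 3D human motion as its primary interaction modality and introduces auxiliary supervision from exogenous viewpoints decoupled from the first-person sensors, so the model can observe the full body's position relative to the environment and model human-world interaction robustly. In addition, anchor views defined in the world coordinate system, paired with text describing how local scenes evolve, provide a simple and effective way to customize self-evolving worlds. Experiments show that AnchorWorld significantly outperforms existing baselines, ablation studies validate the key designs, and the customization scheme shows good spatio-temporal geometric consistency while strictly following the prescribed evolution rules.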

## Main Text

Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile controllability required by practical scenarios. To bridge this gap, we present AnchorWorld, a framework that advances egocentric simulation through enhanced interaction integrity and a flexible mechanism for world customization. First, we utilize 3D human motion as the primary interaction modality. To complement the out-of-view or truncated body parts in egocentric views, we introduce an auxiliary training supervision that incorporates exogenous viewpoints decoupled from the agent's first-person sensorium. It allows the model to observe the agent's full-body positioning relative to the environment, facilitating a more robust spatial grounding of human-world interactions. Furthermore, we propose a simple yet effective mechanism for customizing self-evolving worlds. This is achieved by defining anchor views within a unified world coordinate system, coupled with textual descriptions dictating the dynamic evolution of local scenes. Experimental results show that AnchorWorld significantly outperforms state-of-the-art baselines, while ablation studies validate the effectiveness of our key designs. Notably, our customization scheme exhibits promising spatio-temporal geometric consistency and adheres strictly to the prescribed evolutionary dynamics.
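The abstract does not say how anchor views are represented, so the following is only an illustrative sketch of the general idea: a view pinned in a shared world coordinate system, paired with a text description of how its local scene evolves. Every name here (`AnchorView`, `world_to_ego`, `active_anchors`, the `max_range` threshold, the yaw-only pose) is a hypothetical simplification and not the paper's implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AnchorView:
    """Hypothetical anchor: a viewpoint fixed in world coordinates plus a text rule."""
    position: Vec3          # world-frame xyz of the anchor viewpoint
    yaw: float              # heading about the vertical (z) axis, in radians
    evolution_text: str     # description of how the local scene should evolve


def world_to_ego(point: Vec3, ego_position: Vec3, ego_yaw: float) -> Vec3:
    """Express a world-frame point in the agent's frame (x forward, y left, z up)."""
    dx = point[0] - ego_position[0]
    dy = point[1] - ego_position[1]
    dz = point[2] - ego_position[2]
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    # Rotate the offset by -ego_yaw about the z axis.
    return (c * dx + s * dy, -s * dx + c * dy, dz)


def active_anchors(anchors: List[AnchorView], ego_position: Vec3,
                   ego_yaw: float, max_range: float = 5.0) -> List[AnchorView]:
    """Assumed selection rule: keep anchors in front of the agent and within max_range."""
    selected = []
    for anchor in anchors:
        x, y, z = world_to_ego(anchor.position, ego_position, ego_yaw)
        distance = math.sqrt(x * x + y * y + z * z)
        if x > 0.0 and distance <= max_range:
            selected.append(anchor)
    return selected


anchors = [
    AnchorView((2.0, 0.0, 0.0), 0.0, "the candle on the table slowly burns down"),
    AnchorView((-2.0, 0.0, 0.0), 0.0, "the door behind the agent swings open"),
]
# An agent at the origin facing +x sees only the first anchor as active.
facing_forward = active_anchors(anchors, (0.0, 0.0, 0.0), 0.0)
# After turning around (yaw = pi), only the second anchor is active.
facing_backward = active_anchors(anchors, (0.0, 0.0, 0.0), math.pi)
```

Because the anchors live in one world frame, the same anchor stays at the same place however the agent moves. A world-anchored design could get spatial consistency this way, though the actual conditioning mechanism in AnchorWorld may differ substantially.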
