Zero-shot World Models Are Developmentally Efficient Learners

2026-04-11 08:00

AI Summary

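The research team proposes the Zero-shot Visual World Model (ZWM), built on three principles: a sparse temporally-factored predictor, approximate causal inference, and composition of inferences. Learning only from the first-person experience of a single child, the model rapidly acquires multiple physical-understanding abilities, including depth, motion, and object coherence. It shows data efficiency across multiple benchmarks, reproduces behavioral signatures of child development, and builds brain-like internal representations, offering a new path toward AI systems with human-like data efficiency.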

Original text · Untranslated

Young children demonstrate early abilities to understand their physical world, estimating depth, motion, object coherence, interactions, and many other aspects of physical scene understanding. Children are both data-efficient and flexible cognitive systems, creating competence despite extremely limited training data, while generalizing to myriad untrained tasks -- a major challenge even for today's best AI systems. Here we introduce a novel computational hypothesis for these abilities, the Zero-shot Visual World Model (ZWM). ZWM is based on three principles: a sparse temporally-factored predictor that decouples appearance from dynamics; zero-shot estimation through approximate causal inference; and composition of inferences to build more complex abilities. We show that ZWM can be learned from the first-person experience of a single child, rapidly generating competence across multiple physical understanding benchmarks. It also broadly recapitulates behavioral signatures of child development and builds brain-like internal representations. Our work presents a blueprint for efficient and flexible learning from human-scale data, advancing both a computational account for children's early physical understanding and a path toward data-efficient AI systems.

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org