# Humanoid-GPT: Zero-Shot Motion Tracking Through Scaled Data and Structure

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-02 08:00
- AIHOT score: 56
- AIHOT link: https://aihot.virxact.com/items/cmpxgncg903s3slckndkubui5
- Original link: https://arxiv.org/abs/2606.03985

## AI Summary

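Humanoid-GPT is a Transformer model built on the GPT architecture and designed for whole-body control of humanoid robots. It is pre-trained on a retargeted motion corpus of 2 billion frames that unifies the major motion-capture datasets with large-scale in-house recordings. By scaling both data and model capacity, Humanoid-GPT becomes a single generative Transformer that tracks highly dynamic behaviors and shows unprecedented zero-shot generalization to unseen motions and control tasks. Experiments show that the model robustly tracks complex, dynamic motions while generalizing zero-shot to new tasks, establishing a new performance frontier.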

## Full Text

We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility-generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings. Scaling both data and model capacity yields a single generative Transformer that tracks highly dynamic behaviors while achieving unprecedented zero-shot generalization to unseen motions and control tasks. Extensive experiments and scaling analyses show that our model establishes a new performance frontier, demonstrating robust zero-shot generalization to unseen tasks while simultaneously tracking highly dynamic and complex motions.
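
To make "causal attention" concrete, here is a minimal NumPy sketch of single-head causal self-attention over a sequence of motion-frame embeddings. It is an illustration only, not the paper's implementation: the function name `causal_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are assumptions chosen for the example. The key property is that the output at frame `t` depends only on frames `0..t`.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention (illustrative sketch, hypothetical names).

    x:   (T, d) array, one embedding per motion frame
    w_*: (d, d_head) projection matrices
    Returns the attended outputs (T, d_head) and the attention weights (T, T).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])

    # Mask out future frames: entry (i, j) is blocked whenever j > i.
    T = x.shape[0]
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable row-wise softmax; masked entries become exactly 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Because every row attends only to itself and earlier frames, perturbing later frames leaves earlier outputs unchanged, which is what lets a GPT-style model be trained and rolled out autoregressively.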
