# RATs: Playful Agentic Robot Learning

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-17 08:00
- AIHOT score: 47
- AIHOT link: https://aihot.virxact.com/items/cmqkibabt06ihslhikzd4uyrp
- Original link: https://arxiv.org/abs/2606.19419

## AI Summary

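The paper proposes the Playful Agentic Robot Learning paradigm, in which an embodied coding agent plays autonomously before tasks arrive and continually learns skills. During play, RATs (Robotics Agent Teams) proposes novel yet learnable exploratory tasks, executes code policies, diagnoses failures and retries, and distills successful executions into a persistent code skill library. At test time, skills are retrieved from the frozen library to help with new tasks. On LIBERO-PRO and MolmoSpaces, play-learned skills improve over CaP-Agent0 by 20.6 and 17.0 percentage points, respectively. The skill library can also be plugged directly into other inference-time Code-as-Policy agents without finetuning the model, improving RoboSuite and real-world transfer by 8.9 and 8.8 percentage points, respectively.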

## Main Text

Current agentic robot systems can write executable Code-as-Policy programs, observe feedback, and revise behavior across multiple attempts, but they remain largely task-driven: reusable skills are acquired only after explicit instructions. We study Playful Agentic Robot Learning, where an embodied coding agent uses self-directed play as a continual skill-learning stage before downstream tasks arrive. We introduce RATs, Robotics Agent Teams designed for play-time skill acquisition. During play, RATs proposes novel yet learnable exploratory tasks, plans and executes robot-code policies, verifies intermediate progress, diagnoses failures, retries with dense, step-level feedback, and distills successful executions into a persistent code skill library. At test time, the agent reuses relevant skills from this frozen library to help solve new tasks. Experiments in LIBERO-PRO and MolmoSpaces show that play-learned skills improve held-out downstream tasks over no-play and random-play baselines, with 20.6 and 17.0 percentage-point gains over CaP-Agent0 on LIBERO-PRO and MolmoSpaces, respectively. Moreover, the learned skills can be plugged into other inference-time Code-as-Policy agents by simply retrieving them into the context, improving RoboSuite and real-world transfer by 8.9 and 8.8 points, respectively, without finetuning the underlying model.
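The play-then-reuse loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Skill`, `SkillLibrary`, `play`, `propose_task`, `write_policy`, `execute`) is hypothetical, the proposer, policy writer, and executor are toy stubs standing in for the agent and the robot environment, and keyword overlap is used only as a placeholder for whatever retrieval method the paper actually uses.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    name: str
    description: str  # the play task this skill solved
    code: str         # source of the policy that succeeded during play


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


@dataclass
class SkillLibrary:
    skills: List[Skill] = field(default_factory=list)
    frozen: bool = False

    def add(self, skill: Skill) -> None:
        if self.frozen:
            raise RuntimeError("skill library is frozen")
        self.skills.append(skill)

    def retrieve(self, task: str, k: int = 2) -> List[Skill]:
        # Assumption: rank by keyword overlap with the task description.
        query = _tokens(task)
        scored = [(len(query & _tokens(s.description)), s) for s in self.skills]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [s for _, s in hits[:k]]


def play(library, propose_task, write_policy, execute, rounds, max_retries=3):
    """Propose a task, attempt it with retries, and keep policies that succeed."""
    for _ in range(rounds):
        task = propose_task(library)
        feedback = None
        for _ in range(max_retries):
            code = write_policy(task, feedback)
            ok, feedback = execute(code)
            if ok:
                library.add(Skill(f"skill_{len(library.skills)}", task, code))
                break


# --- Toy stubs (hypothetical; a real system would call an LLM and a simulator) ---
_TASKS = ["open the drawer", "pick up the red block", "push the button"]


def propose_task(library):
    return _TASKS[len(library.skills) % len(_TASKS)]


def write_policy(task, feedback):
    suffix = "" if feedback is None else "  # revised after feedback"
    return f"run({task!r}){suffix}"


def execute(code):
    # The stub fails the first attempt and succeeds once the policy is revised.
    ok = "revised" in code
    return ok, (None if ok else "step 1 failed: gripper missed the target")


library = SkillLibrary()
play(library, propose_task, write_policy, execute, rounds=3)
library.frozen = True  # test time: the library is read-only

retrieved = library.retrieve("pick up the blue block and open the drawer", k=2)
context = "\n".join(s.code for s in retrieved)  # would be placed in the agent's prompt
```

In this sketch, freezing is a simple flag that blocks further writes, and retrieved skill code is concatenated into a context string, mirroring the abstract's description of reusing skills "by simply retrieving them into the context" without finetuning.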