WIZARD: Robot Policy Adaptation via Weight-Space Meta-Learning

2026-06-05 08:00

AI Summary

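To address the high deployment cost of Vision-Language-Action (VLA) models, WIZARD proposes a weight-space meta-learning framework. Given only a language instruction and a short demonstration video, it generates task-specific LoRA parameters for a frozen VLA policy in a single forward pass, with no target-task action labels and no test-time optimization. On LIBERO, WIZARD improves performance by up to ~2x on unseen dataset collections and up to ~14x on unseen tasks. On a real Franka Emika Panda robot, WIZARD consistently outperforms a real-domain adapted baseline.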

Original text (untranslated)

Vision-Language-Action (VLA) models are emerging as a promising paradigm for robotic manipulation, enabling general-purpose policies trained from large corpora of demonstrations and action labels. However, adapting these models to new tasks still typically requires task-specific demonstrations, action annotations, and additional fine-tuning, making deployment costly and difficult to scale. We propose WIZARD, a weight-space meta-learning framework that sidesteps task-specific fine-tuning by generating task-specific LoRA parameters for a frozen VLA policy. Given only a language instruction and a short demonstration video, WIZARD predicts the corresponding adaptation weights in a single forward pass, without target-task action labels or test-time optimization. During meta-training, WIZARD learns to map task evidence directly to expert LoRA updates, capturing relationships between tasks in weight space. Experiments on LIBERO show that WIZARD improves performance by up to ~2x on unseen dataset collections and up to ~14x on unseen tasks. On a Franka Emika Panda, WIZARD consistently improves over a real-domain adapted baseline, showing that generated adapters provide task-level specialization beyond simulation.
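The central idea in the abstract, predicting LoRA weights for a frozen layer from task evidence in one forward pass, can be illustrated with a toy example. This is a minimal sketch and not the authors' architecture: the single linear generator `G`, the helper names `generate_lora` and `adapted_layer`, all dimensions, and the random vector standing in for the encoded instruction and demonstration video are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical and chosen only to keep the example small.
D_IN, D_OUT = 16, 8   # shape of one frozen policy layer
RANK = 2              # LoRA rank
D_TASK = 12           # size of the task-evidence embedding

# Frozen base weight of one policy layer; it is never updated.
W_frozen = rng.normal(size=(D_OUT, D_IN))

# Hypothetical generator: one linear map from the task embedding
# to the flattened LoRA factors A (RANK x D_IN) and B (D_OUT x RANK).
n_a, n_b = RANK * D_IN, D_OUT * RANK
G = rng.normal(scale=0.01, size=(n_a + n_b, D_TASK))


def generate_lora(task_embedding):
    """Single forward pass: task embedding -> LoRA factors (A, B)."""
    flat = G @ task_embedding
    A = flat[:n_a].reshape(RANK, D_IN)
    B = flat[n_a:].reshape(D_OUT, RANK)
    return A, B


def adapted_layer(x, A, B, scale=1.0):
    """Frozen layer plus low-rank update: (W + scale * B @ A) @ x."""
    return W_frozen @ x + scale * (B @ (A @ x))


# Stand-in for the encoded language instruction and demonstration video.
task_embedding = rng.normal(size=D_TASK)
A, B = generate_lora(task_embedding)

x = rng.normal(size=D_IN)
y = adapted_layer(x, A, B)
```

In this sketch only `G` would be trained. One plausible reading of "map task evidence directly to expert LoRA updates" is a regression from task embeddings to LoRA factors obtained by ordinary fine-tuning on the meta-training tasks, but the abstract does not specify the loss or the generator design.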

HuggingFace Daily Papers (community trending papers)

Read the original: arxiv.org
