AURA: Constant-VRAM Action-Gated Memory for Robot Policies

2026-06-02 02:38 · 31 days ago

AI Summary

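AURA-Mem is a constant-size recurrent memory mechanism designed for robot policies. It wraps a frozen vision-language-action backbone (7B parameters) and uses a learned gate to write to memory only when the current observation would change the next action. The inference state is fixed at 4,224 bytes, whereas a KV-cache is 6,061 times larger at 100,000 steps. On LIBERO-Long, the gated policy does not reduce the success rate (0.233) and slightly exceeds the always-write KV arm (0.217), while using 7.0 times fewer writes. On a synthetic benchmark, AURA-Mem matches the accuracy of the best O(1) baseline with 5.19-6.13 times fewer writes, a gain that random or periodic write schedules do not reproduce.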

Original abstract · not translated

The KV-cache is the right memory for datacenters but the wrong memory for robots. Datacenter inference batches many short requests and resets them, amortizing an attention cache across a crowd. Embodied agents instead run one long, non-resetting episode on bandwidth-limited edge hardware, where high-bandwidth memory and flash are scarce, flash has finite write endurance, and memory writes rather than compute can become the binding constraint. AURA-Mem (Action-Utility Recurrent Adaptive Memory) targets this regime. It wraps a frozen vision-language-action backbone with a constant-size recurrent memory and a learned gate that writes only when the current observation would change the next action: memory that knows when to stay silent. Unlike reconstruction-based memory, the gate is trained directly against a closed-loop action-error signal. Its inference state is fixed at 4,224 bytes regardless of horizon, while a KV-cache grows to 6,061 times larger at 100,000 steps. On a controlled synthetic benchmark, AURA-Mem matches the best O(1) baseline in accuracy while using 5.19-6.13 times fewer writes, and up to 9.19 times fewer writes on easier configurations. Budget-matched random and periodic schedules do not recover this gain, isolating the benefit to the action-surprise signal. On a trained closed-loop OpenVLA-OFT 7B panel on LIBERO-Long (n=60 episodes per arm), the gate does not hurt success: AURA-Mem matches the ungated base policy (0.233) and slightly exceeds an always-write KV arm (0.217), while using 7.0 times fewer writes and constant memory. We also instantiate an approximate-information-state value-loss bound as a methodology demonstration; at this scale, the bound is vacuous rather than a guarantee.
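The two mechanisms described in the abstract, a fixed-size state and a gate that writes only when the observation would change the next action, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the class `GatedMemory`, the hand-set `threshold`, the plain overwrite update, and the scalar `policy` callable are hypothetical stand-ins, not the paper's learned gate or recurrent update. The helper `implied_kv_bytes_per_step` only backs out the per-step KV growth implied by the two reported figures (4,224 bytes and the 6,061 times ratio at 100,000 steps).

```python
STATE_BYTES = 4_224        # fixed inference state reported in the abstract
KV_RATIO_AT_100K = 6_061   # reported KV-cache-to-state size ratio at 100,000 steps
STEPS = 100_000


def implied_kv_bytes_per_step():
    """Per-step KV-cache growth implied by the abstract's (rounded) figures."""
    return STATE_BYTES * KV_RATIO_AT_100K / STEPS


class GatedMemory:
    """Toy constant-size memory that writes only on action 'surprise'.

    Hypothetical illustration: the gate is a fixed threshold on how much
    the action would change, standing in for the paper's learned gate.
    """

    def __init__(self, size, threshold):
        self.state = [0.0] * size   # length never changes, whatever the horizon
        self.threshold = threshold  # assumed hand-set value, not learned
        self.writes = 0

    def step(self, obs, policy):
        """Write `obs` into the state only if doing so would change the action."""
        action_if_silent = policy(self.state)
        action_if_written = policy(obs)
        if abs(action_if_written - action_if_silent) > self.threshold:
            self.state = list(obs)  # assumed update rule: plain overwrite
            self.writes += 1
        return policy(self.state)


if __name__ == "__main__":
    mem = GatedMemory(size=4, threshold=0.5)
    policy = sum  # hypothetical scalar "action" computed from the state
    for t in range(100):
        # The observation changes only every 25 steps.
        mem.step([float(t // 25)] * 4, policy)
    print("writes:", mem.writes, "state length:", len(mem.state))
    print("implied KV bytes per step:", implied_kv_bytes_per_step())
```

In this toy stream the observation changes at steps 25, 50, and 75, so the gate writes 3 times over 100 steps while the state stays at 4 entries. An always-write memory would write 100 times on the same stream.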

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org
