# Mem-π: Adaptive Memory through Learning When and What to Generate

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-20 08:00
- AIHOT score: 65
- AIHOT link: https://aihot.virxact.com/items/cmpewx9yf01eosljwj6rly4fq
- Original link: https://arxiv.org/abs/2605.21463

## AI Summary

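Mem-π is an adaptive memory framework for large language model (LLM) agents. Instead of retrieving static entries from an external memory store, it uses a dedicated model to generate guidance on demand. The framework is trained with a decision-content decoupled reinforcement learning method, so the model decides for itself whether to generate guidance and what to generate. On diverse agentic benchmarks spanning web navigation, terminal-based tool use, and other tasks, Mem-π consistently outperforms retrieval-based methods and existing RL-based memory approaches, with a relative improvement of over 30% on web navigation tasks.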
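
To make the "relative improvement" figure concrete, here is a minimal sketch of how such a number is usually computed. The helper name `relative_improvement` and the success rates in the usage line are hypothetical illustrations and are not results from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


# Hypothetical numbers for illustration only: a success rate
# rising from 0.40 to 0.54 is a 35% relative improvement.
gain = relative_improvement(0.40, 0.54)
print(f"{gain:.0%}")
```

A relative gain is measured against the baseline score, so the same absolute gain looks larger when the baseline is low.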

## Main Text

We present Mem-π, a framework for adaptive memory in large language model (LLM) agents, where useful guidance is generated on demand rather than retrieved from external memory stores. Existing memory-augmented agents typically rely on similarity-based retrieval from episodic memory banks or skill libraries, returning static entries that often misalign with the current context. In contrast, Mem-π uses a dedicated language or vision-language model with its own parameters, separate from the downstream agent, to generate context-specific guidance for complex tasks. Conditioned on the current agent context, the model jointly decides when to produce guidance and what guidance to produce. We train it with a decision-content decoupled reinforcement learning (RL) objective, enabling it to abstain when generation would not help and otherwise produce concise, useful guidance. Across diverse agentic benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-π consistently outperforms retrieval-based and prior RL-optimized memory baselines, achieving over 30% relative improvement on web navigation tasks.
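
The "when" and "what" decisions described above can be pictured as a small interface between a guidance model and the downstream agent. The sketch below is not the paper's implementation. `MemoryOutput`, `GuidancePolicy`, `build_agent_prompt`, and `toy_policy` are hypothetical names, the prompt layout is an assumption, and the keyword rule merely stands in for a trained model. The RL training objective is not sketched, since the abstract does not specify its form.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MemoryOutput:
    """Result of one call to the (hypothetical) guidance model."""
    generate: bool            # the "when" decision
    guidance: Optional[str]   # the "what" content; None when abstaining


# Hypothetical type: maps the current agent context to a MemoryOutput.
GuidancePolicy = Callable[[str], MemoryOutput]


def build_agent_prompt(context: str, policy: GuidancePolicy) -> str:
    """Prepend generated guidance to the context, or pass it through on abstain."""
    out = policy(context)
    if not out.generate or not out.guidance:
        return context  # abstain: the agent sees its context unchanged
    return f"[Guidance]\n{out.guidance}\n\n[Context]\n{context}"


def toy_policy(context: str) -> MemoryOutput:
    """Stand-in for a trained model: a keyword rule, purely for illustration."""
    if "checkout" in context.lower():
        return MemoryOutput(True, "Confirm the cart contents before submitting payment.")
    return MemoryOutput(False, None)


print(build_agent_prompt("Open the home page", toy_policy))
print(build_agent_prompt("Go to checkout", toy_policy))
```

Keeping the abstain path as a pass-through means the agent behaves exactly as it would without memory whenever the guidance model declines to generate, which is one plausible reading of "abstain when generation would not help".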