# Self-Evolving LLM Memory Extraction Across Heterogeneous Tasks

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-04-13 08:00
- AIHOT link: https://aihot.virxact.com/items/cmoaukh5r049nsl1yerktc5fl
- Original link: https://arxiv.org/abs/2604.11610

## AI Summary

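To address the difficulty of memory extraction for large language models across heterogeneous tasks, the researchers propose CluE, a cluster-based self-evolving strategy, and release BEHEMOTH, a benchmark built from 18 datasets. The benchmark covers personalization, problem-solving, and agentic tasks, and evaluates extraction with a downstream utility-driven metric. The experiments show that conventional static prompts do not generalize across tasks and that existing self-evolving frameworks degrade in heterogeneous settings. CluE analyzes each cluster independently and then synthesizes insights across clusters to refine the extraction prompt, achieving a 9.04% relative gain and generalizing effectively across heterogeneous tasks.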

## Main Text

As LLM-based assistants become persistent and personalized, they must extract and retain useful information from past conversations as memory. However, the types of information worth remembering vary considerably across tasks. We formalize the heterogeneous memory extraction task and introduce BEHEMOTH, a benchmark that repurposes 18 existing datasets spanning personalization, problem-solving, and agentic tasks, using a downstream utility-driven metric for systematic evaluation. Our empirical analysis confirms that no single static extraction prompt dominates across all task categories, and that existing self-evolving prompt optimization frameworks, originally designed for homogeneous distributions, degrade when training tasks are heterogeneous. To address this, we propose CluE, a cluster-based self-evolving strategy that groups training examples into clusters by extraction scenarios, analyzes each cluster independently, and synthesizes cross-cluster insights to update the extraction prompt. Experiments on BEHEMOTH show that CluE generalizes effectively across heterogeneous tasks (+9.04% relative gain), consistently outperforming prior self-evolving frameworks.
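
The three steps the abstract names (group by extraction scenario, analyze each cluster independently, synthesize across clusters) can be sketched as a single update step. This is a minimal illustration, not the authors' implementation. It assumes each training example already carries a `scenario` label and a `utility` score, and that the analysis and synthesis stages are supplied as callables; in practice those would likely be LLM calls. The names `group_by_scenario`, `clue_step`, `toy_analyze`, and `toy_synthesize` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Example = Dict[str, Any]  # hypothetical record: {"scenario": str, "utility": float}


def group_by_scenario(examples: List[Example]) -> Dict[str, List[Example]]:
    """Group examples by an assumed, pre-assigned extraction-scenario label."""
    clusters: Dict[str, List[Example]] = defaultdict(list)
    for ex in examples:
        clusters[ex["scenario"]].append(ex)
    return dict(clusters)


def clue_step(
    prompt: str,
    examples: List[Example],
    analyze: Callable[[str, List[Example]], str],
    synthesize: Callable[[str, Dict[str, str]], str],
) -> str:
    """One update: per-cluster analysis, then cross-cluster synthesis."""
    clusters = group_by_scenario(examples)
    # Each cluster is analyzed on its own, so one scenario's feedback
    # is not averaged away by examples from another scenario.
    insights = {
        name: analyze(prompt, members)
        for name, members in sorted(clusters.items())
    }
    # The per-cluster insights are merged into a single prompt update.
    return synthesize(prompt, insights)


# Toy stand-ins for the analysis and synthesis stages (assumptions, not LLM calls).
def toy_analyze(prompt: str, members: List[Example]) -> str:
    mean = sum(m["utility"] for m in members) / len(members)
    return f"{len(members)} examples, mean utility {mean:.2f}"


def toy_synthesize(prompt: str, insights: Dict[str, str]) -> str:
    notes = "; ".join(f"{name}: {text}" for name, text in insights.items())
    return f"{prompt}\n# Cluster notes: {notes}"


examples = [
    {"scenario": "personalization", "utility": 0.4},
    {"scenario": "agentic", "utility": 0.8},
    {"scenario": "personalization", "utility": 0.6},
]
new_prompt = clue_step(
    "Extract useful memories.", examples, toy_analyze, toy_synthesize
)
print(new_prompt)
```

Passing the two stages in as callables keeps the control flow visible without committing to any particular model, clustering method, or prompt format, none of which the abstract specifies.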
