Self-Compact: Letting Language Model Agents Decide for Themselves When to Compact Their Trajectories

2026-06-22 08:00

AI Summary

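Long-running agent trajectories accumulate stale content and eventually exceed the context window. Existing compaction triggered at a fixed token threshold ignores trajectory structure and may discard intermediate results. SelfCompact gives the model a compaction tool it can call, paired with a lightweight rubric that specifies when to fire (a sub-task has completed, or the trajectory is converging) and when to suppress (mid-derivation, or when stuck). This yields adaptive compaction without fine-tuning or external supervision. Across six benchmarks and seven models, SelfCompact matches or exceeds fixed-interval compaction at far lower token cost: up to 18.1 points over the no-compaction baseline on math, 5–9 points on agentic search, and 30–70% lower per-question cost.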

Original text · untranslated

Long agent traces composed of chains of thought and tool calls accumulate stale content that anchors subsequent generations, and they eventually outgrow the context window. Existing scaffolds mitigate this with fixed-interval compaction triggered at a token threshold. Such triggers pay no heed to trajectory structure, so they risk discarding partial results mid-derivation or mid-search. We propose SelfCompact, a scaffold that allows the model itself to decide when and how to compact. Specifically, it pairs two inference-time elements: (i) a compaction tool the model invokes to summarize the accumulated context, and (ii) a lightweight rubric specifying when to fire (a sub-task has resolved, or the trajectory is converging) and when to suppress (mid-derivation, or when stuck). Both are needed. The tool alone is unevenly used across open-weight models, often invoked at unhelpful moments or not at all; the rubric alone cannot act. Together, they elicit effective adaptive compaction without any fine-tuning or external supervision. We present empirical results on six benchmarks (competitive math and agentic search) and seven models. Our results show that SelfCompact matches or exceeds fixed-interval summarization at a fraction of the token cost, improving over a no-summarization baseline by up to 18.1 points on math and 5–9 points on agentic search at 30–70% lower per-question cost. Our results expose a meta-cognitive gap: although unprompted models cannot reliably tell when their own context is rotting, a lightweight rubric closes this gap, reframing when to compact as a capability that scaffolds can supply without training.
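To make the two inference-time elements concrete, here is a minimal Python sketch of how a compaction tool and a rubric might sit in an agent loop, next to the fixed-interval trigger the abstract contrasts them with. Everything in it is an assumption for illustration: the names `Trajectory`, `compact`, `run`, `scripted_policy`, and `fixed_interval_trigger` are hypothetical, the rubric wording is paraphrased from the abstract rather than taken from the paper's prompt, the whitespace token count stands in for a real tokenizer, and a scripted policy stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# Hypothetical rubric wording, paraphrased from the abstract; the paper's
# actual prompt text is not shown in this excerpt.
RUBRIC = (
    "You may call the `compact` tool to summarize the context so far.\n"
    "Fire when: a sub-task has resolved, or the trajectory is converging.\n"
    "Suppress when: you are mid-derivation, or you are stuck."
)


@dataclass
class Trajectory:
    """Accumulated steps (reasoning and tool results) kept as plain text."""
    system: str
    steps: List[str] = field(default_factory=list)

    def token_count(self) -> int:
        # Crude whitespace count, standing in for a real tokenizer.
        return sum(len(step.split()) for step in self.steps)


def compact(traj: Trajectory, summarize: Callable[[List[str]], str]) -> None:
    """Compaction tool: replace all accumulated steps with one summary step."""
    if traj.steps:
        traj.steps = ["[summary] " + summarize(traj.steps)]


def fixed_interval_trigger(traj: Trajectory, threshold: int) -> bool:
    """Baseline trigger: fire on token count alone, blind to task structure."""
    return traj.token_count() >= threshold


def scripted_policy(actions: Iterable[str]) -> Callable[[Trajectory], str]:
    """Stand-in for the model: replays a fixed list of actions, then stops."""
    it = iter(actions)
    return lambda traj: next(it, "DONE")


def run(
    traj: Trajectory,
    policy: Callable[[Trajectory], str],
    summarize: Callable[[List[str]], str],
    max_steps: int = 10,
) -> Trajectory:
    """Minimal loop in which the policy itself decides when to compact."""
    for _ in range(max_steps):
        action = policy(traj)
        if action == "CALL compact":
            compact(traj, summarize)  # self-initiated, not threshold-driven
        elif action == "DONE":
            break
        else:
            traj.steps.append(action)
    return traj


if __name__ == "__main__":
    # Stub summarizer; a real scaffold would presumably ask the model itself.
    summarize = lambda steps: f"{len(steps)} steps condensed"
    traj = Trajectory(system=RUBRIC)
    policy = scripted_policy(
        ["search A", "found A = 3", "CALL compact", "search B"]
    )
    run(traj, policy, summarize)
    print(traj.steps)
```

In this sketch the scripted policy compacts right after the first sub-task resolves ("found A = 3"), which is the kind of moment the rubric names. A threshold trigger such as `fixed_interval_trigger` would instead fire wherever the count happens to cross the limit, possibly in the middle of a derivation or search.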

HuggingFace Daily Papers (community trending papers)

Read the original: arxiv.org
