elvis@omarsar0

2026-06-01 01:30 · 32 days ago

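AI Summary

The paper argues that when AI agents reuse the same documents and histories across many turns, a fixed context strategy is not optimal. It proposes the "Efficiency Frontier" framework, which models context strategy selection as a trade-off between cost and performance. Sweeping a reuse parameter N identifies the crossover regions where retrieval, compression, or full context each has the advantage. In tests on 5,000 HotpotQA instances, deployment-aware selection cuts effective token usage by about 25% at the same performance, and amortized memory compression runs more than 50% cheaper than full-context prompting in higher-performance settings.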

// The Efficiency Frontier //

Cool paper on context management.

As agents reuse the same documents and histories across many turns, the cheapest context strategy is not fixed. This work describes a principled rule for picking one per deployment instead of defaulting to whatever topped a benchmark in isolation.

Retrieval and compression methods are almost always benchmarked on accuracy and cost separately, so you never learn when one actually beats another under real load.

The Efficiency Frontier models context strategy selection as a single cost-performance problem, with a log-utility term for diminishing returns from extra context and a reuse parameter N that amortizes preprocessing across repeated queries.
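
One way to picture that setup, as a rough sketch and not the paper's actual formulation: every symbol below (the one-time preprocessing cost P_s, the per-query cost c_s, the context size k_s, and the weights alpha and lambda) is an assumption introduced here for illustration.

```latex
% Hypothetical notation: strategy s, reuse count N (queries sharing one preprocessing pass)
\[
C_s(N) = c_s + \frac{P_s}{N}
\qquad
U_s = \alpha \log(1 + k_s)
\qquad
s^{*}(N) = \arg\max_{s} \bigl[\, U_s - \lambda \, C_s(N) \,\bigr]
\]
```

Under these assumptions, the P_s / N term shrinks as N grows, so a strategy with heavy preprocessing can overtake a cheaper-to-set-up one once reuse is high enough, while the logarithm captures the idea that doubling the context does not double the benefit.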

Sweep N and the optimal strategy changes, exposing crossover regions where retrieval, compression, or full context each wins. On 5,000 HotpotQA instances, deployment-aware selection cuts effective token usage about 25 percent at the same performance, and amortized memory compression runs over 50 percent cheaper than full-context prompting in higher-performance settings.
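
To make the idea of a sweep concrete, here is a minimal sketch. It is not the paper's code: the `Strategy` fields, the token counts, and the performance scores are all made-up placeholders, and the selection rule is simplified to "cheapest strategy that meets a performance floor" instead of a log-utility objective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    prep_tokens: float    # one-time preprocessing cost, shared across N queries
    query_tokens: float   # tokens paid on every single query
    performance: float    # task score in [0, 1]


# Illustrative numbers only; none of these come from the paper.
STRATEGIES = [
    Strategy("full_context", prep_tokens=0, query_tokens=8000, performance=0.80),
    Strategy("retrieval", prep_tokens=20000, query_tokens=1500, performance=0.70),
    Strategy("compression", prep_tokens=60000, query_tokens=1000, performance=0.78),
]


def effective_tokens(strategy: Strategy, n: int) -> float:
    """Per-query cost with preprocessing amortized over n reuses."""
    return strategy.query_tokens + strategy.prep_tokens / n


def cheapest(n: int, min_performance: float) -> Strategy:
    """Cheapest strategy at reuse level n that meets the performance floor."""
    eligible = [s for s in STRATEGIES if s.performance >= min_performance]
    return min(eligible, key=lambda s: effective_tokens(s, n))


def crossovers(max_n: int, min_performance: float) -> list[tuple[int, str]]:
    """Sweep N from 1 to max_n and record each N where the winner changes."""
    changes = []
    previous = None
    for n in range(1, max_n + 1):
        winner = cheapest(n, min_performance).name
        if winner != previous:
            changes.append((n, winner))
            previous = winner
    return changes


if __name__ == "__main__":
    for floor in (0.65, 0.75):
        print(floor, crossovers(200, floor))
```

With these placeholder numbers, the winner moves from full context to retrieval to compression as N grows when the floor is low, and retrieval drops out entirely when the floor is raised. The exact crossover points depend on the invented inputs, not on any measured data.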

Paper: https://arxiv.org/abs/2605.23071

Learn to build effective AI agents in our academy: https://academy.dair.ai/

Tags: agents, arXiv, retrieval augmentation, papers/research
