Latent Preference Modeling for Cross-Session Personalized Tool Calling

2026-04-20 08:00 · 74 days ago

AI Summary

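User requests often omit key details, which leaves tool-call inputs incomplete. To address this, the researchers introduce the MPT benchmark and the PRefine method. MPT contains 265 multi-session dialogues covering three challenges: preference recall, preference induction, and preference transfer. PRefine models user preferences as evolving hypotheses through a generate-verify-refine loop, extracts reusable constraints from history, and improves tool-calling accuracy while consuming only 1.24% of the tokens required by full-history prompting. The study indicates that effective personalization requires capturing the reasons behind user choices rather than merely recording the choices themselves.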

Original text · untranslated

Users often omit essential details in their requests to LLM-based agents, resulting in under-specified inputs for tool use. This poses a fundamental challenge for tool-augmented agents, as API execution typically requires complete arguments, highlighting the need for personalized tool calling. To study this problem, we introduce MPT, a benchmark comprising 265 multi-session dialogues that cover three challenges: Preference Recall, Preference Induction, and Preference Transfer. We also propose PRefine, a test-time memory-augmented method that represents user preferences as evolving hypotheses. Through a generate-verify-refine loop, it extracts reusable constraints from history and improves tool-calling accuracy while using only 1.24% of the tokens required by full-history prompting. These results indicate that robust personalization in agentic systems depends on memory that captures the reasons behind user choices, not just the choices themselves.
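The abstract does not spell out how the generate-verify-refine loop is implemented, so the following is only a toy sketch of the general idea, not the PRefine method itself. It assumes that a preference hypothesis can be reduced to a simple "this argument usually takes this value" constraint, and that verification is a plain agreement count over past tool calls. The names `Hypothesis`, `update_memory`, `complete_call`, the `min_ratio` threshold, and the `book_flight` tool are all hypothetical and introduced purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical preference: for `tool`, argument `slot` is usually `value`."""
    tool: str
    slot: str
    value: str
    support: int = 0    # past calls that agree with the hypothesis
    conflicts: int = 0  # past calls that contradict it


def generate(history):
    """Propose one hypothesis per (tool, slot) from the most frequent past value."""
    counts = {}
    for call in history:
        for slot, value in call["args"].items():
            counts.setdefault((call["tool"], slot), Counter())[value] += 1
    return [Hypothesis(tool, slot, c.most_common(1)[0][0])
            for (tool, slot), c in counts.items()]


def verify(hyp, history):
    """Count how many past calls agree or conflict with the hypothesis."""
    hyp.support = hyp.conflicts = 0
    for call in history:
        if call["tool"] == hyp.tool and hyp.slot in call["args"]:
            if call["args"][hyp.slot] == hyp.value:
                hyp.support += 1
            else:
                hyp.conflicts += 1
    return hyp


def refine(hyps, min_ratio=0.75):
    """Keep only hypotheses that most of the history supports (assumed threshold)."""
    return [h for h in hyps
            if h.support / (h.support + h.conflicts) >= min_ratio]


def update_memory(history):
    """One generate -> verify -> refine pass over the observed tool calls."""
    return refine([verify(h, history) for h in generate(history)])


def complete_call(tool, partial_args, memory):
    """Fill arguments the user left out, using the surviving hypotheses."""
    args = dict(partial_args)
    for h in memory:
        if h.tool == tool and h.slot not in args:
            args[h.slot] = h.value
    return args


# Toy history: the destination changes every time, the seat choice is stable.
history = [
    {"tool": "book_flight", "args": {"destination": "Paris", "seat": "aisle"}},
    {"tool": "book_flight", "args": {"destination": "Tokyo", "seat": "aisle"}},
    {"tool": "book_flight", "args": {"destination": "Berlin", "seat": "window"}},
    {"tool": "book_flight", "args": {"destination": "Rome", "seat": "aisle"}},
]
memory = update_memory(history)
print(complete_call("book_flight", {"destination": "Oslo"}, memory))
# prints {'destination': 'Oslo', 'seat': 'aisle'}
```

In this sketch the memory holds one short constraint per stable argument instead of the full dialogue history, which is the intuition behind a compact memory being much cheaper than full-history prompting. Note that it only records choices; capturing the reasons behind them, as the abstract emphasizes, would need a richer hypothesis representation than a slot-value pair.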

HuggingFace Daily Papers (community trending papers)


Read the original · arxiv.org
