# SAVOIR: Learning Social Intelligence via Shapley-Value-Based Reward Attribution

- Source: HuggingFace Daily Papers (trending community papers)
- Published: 2026-04-21 08:00
- AIHOT link: https://aihot.virxact.com/items/cmob7jflt06fcsl1ymxcyague
- Original link: https://arxiv.org/abs/2604.18982

## AI Summary

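The research team proposes SAVOIR, a framework grounded in cooperative game theory that combines expected utility (a prospective evaluation of an utterance's strategic potential) with Shapley values (an axiomatic guarantee of fair credit distribution) to address the credit assignment problem in reinforcement learning for multi-turn dialogue. On the SOTOPIA benchmark, the framework sets a new state of the art, with a 7B-parameter model matching or even exceeding GPT-4o and Claude-3.5-Sonnet. The experiments also find that large reasoning models consistently underperform on social intelligence tasks, suggesting that social ability differs qualitatively from analytical reasoning.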

## Main Text

Social intelligence, the ability to navigate complex interpersonal interactions, presents a fundamental challenge for language agents. Training such agents via reinforcement learning requires solving the credit assignment problem: determining how individual utterances contribute to multi-turn dialogue outcomes. Existing approaches directly employ language models to distribute episode-level rewards, yielding attributions that are retrospective and lack theoretical grounding. We propose SAVOIR (ShApley Value fOr SocIal RL), a novel principled framework grounded in cooperative game theory. Our approach combines two complementary principles: expected utility shifts evaluation from retrospective attribution to prospective valuation, capturing an utterance's strategic potential for enabling favorable future trajectories; Shapley values ensure fair credit distribution with axiomatic guarantees of efficiency, symmetry, and marginality. Experiments on the SOTOPIA benchmark demonstrate that SAVOIR achieves new state-of-the-art performance across all evaluation settings, with our 7B model matching or exceeding proprietary models including GPT-4o and Claude-3.5-Sonnet. Notably, even large reasoning models consistently underperform, suggesting social intelligence requires qualitatively different capabilities than analytical reasoning.
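To make the Shapley-value idea concrete, here is a minimal Python sketch of exact Shapley attribution over a handful of utterances. It is not the paper's implementation: the abstract does not say how SAVOIR estimates expected utility or whether it approximates the Shapley sum, so `toy_value`, its scores, and the synergy bonus are hypothetical stand-ins for whatever coalition value the framework actually uses.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_values(n: int, value: Callable[[FrozenSet[int]], float]) -> List[float]:
    """Exact Shapley values for n players under a coalition value function.

    Enumerates every coalition, so the cost grows exponentially with n;
    this is only practical for a small number of utterances.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for k in range(len(others) + 1):
            # Probability weight of a size-k coalition that excludes player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                # Marginal contribution of player i to this coalition.
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_value(coalition: FrozenSet[int]) -> float:
    """Hypothetical value of a set of utterances (an assumption, not from the paper).

    Utterances 0 and 1 have standalone worth and a synergy bonus when both
    are present; utterance 2 contributes nothing.
    """
    base = {0: 1.0, 1: 2.0, 2: 0.0}
    total = sum(base[u] for u in coalition)
    if 0 in coalition and 1 in coalition:
        total += 3.0  # hypothetical synergy between utterances 0 and 1
    return total


if __name__ == "__main__":
    credits = shapley_values(3, toy_value)
    for idx, credit in enumerate(credits):
        print(f"utterance {idx}: credit {credit:.3f}")
```

In this toy game the credits sum to the value of the full dialogue (the efficiency axiom), the synergy bonus is split evenly between utterances 0 and 1, and the non-contributing utterance 2 receives approximately zero credit.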
