Spontaneous Persuasion: An Audit of Model Persuasiveness in Everyday Conversations

2026-04-27 12:00 · Nalin Poungpeth, Nicholas Clark, Tanu Mitra

Why it was selected

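This paper exposes an overlooked risk: in everyday conversations, LLMs use persuasive strategies systematically and without being asked to. Anyone designing AI products or setting safety policy needs to understand this behavior pattern.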

AI Summary

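This study introduces the concept of "spontaneous persuasion" and audits how often, and through which techniques, five large language models (LLMs) use persuasive strategies inexplicitly in everyday multi-turn conversations. The authors simulate user response styles grounded in psychology, communication, and linguistics, and compare the LLM output with human responses collected from Reddit. LLMs spontaneously persuade the user in virtually all conversations, relying mainly on information-based strategies such as appeals to logic or quantitative evidence. The pattern holds across models and user response styles, although conversations about mental health show higher rates of appraisal-based and emotion-based strategies. Human responses, by contrast, lean toward strategies that generate social influence, such as negative emotion appeals and non-expert testimony. This difference may explain both how effectively LLMs persuade users and why they are perceived as objective and impartial.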

Original text

[Submitted on 23 Apr 2026 (v1), last revised 27 Apr 2026 (this version, v2)]

Title: Spontaneous Persuasion: An Audit of Model Persuasiveness in Everyday Conversations

Authors: Nalin Poungpeth, Nicholas Clark, Tanu Mitra

Abstract: Large language models (LLMs) possess strong persuasive capabilities that outperform humans in head-to-head comparisons. Users report consulting LLMs to inform major life decisions in relationships, medical settings, and when seeking professional advice. Prior work measures persuasion as intentional attempts at producing the most effective argument or convincing statement. This fails to capture everyday human-AI interactions in which users seek information or advice. To address this gap, we introduce "spontaneous persuasion," which characterizes the inexplicit use of persuasive strategies in everyday scenarios where persuasion is not necessarily warranted. We conduct an audit of five LLMs to uncover how frequently and through which techniques spontaneous persuasion appears in multi-turn conversations. To simulate response styles, we provide a user response taxonomy grounded in literature from psychology, communication, and linguistics. Furthermore, we compare the distribution of spontaneous persuasion produced by LLMs with human responses on the same topics, collected from Reddit. We find LLMs spontaneously persuade the user in virtually all conversations, heavily relying on information-based strategies such as appeals to logic or quantitative evidence. This was consistent across models and user response styles, but conversations concerning mental health saw higher rates of appraisal-based and emotion-based strategies. In comparison, human responses tended to invoke strategies that generate social influence, like negative emotion appeals and non-expert testimony. This difference may explain the effectiveness of LLM in persuading users, as well as the perception of models as objective and impartial.
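
The comparison the abstract describes, between the distribution of persuasive strategies in LLM responses and in human responses, could be tallied roughly as in the sketch below. This is only an illustration under stated assumptions: it assumes each response has already been annotated with a set of strategy labels, and the label names, the function names `strategy_rates` and `rate_gaps`, and the toy annotations are all hypothetical. None of the numbers are results from the paper, and the paper's actual annotation and analysis pipeline is not described here.

```python
from collections import Counter


def strategy_rates(annotations):
    """Share of responses in which each strategy appears at least once.

    `annotations` is a list with one entry per response; each entry is a
    list of strategy labels (hypothetical names) assigned to that response.
    """
    counts = Counter()
    for strategies in annotations:
        # Count a strategy once per response, even if labeled repeatedly.
        counts.update(set(strategies))
    n = len(annotations)
    return {s: c / n for s, c in counts.items()} if n else {}


def rate_gaps(llm_rates, human_rates):
    """Per-strategy difference (LLM rate minus human rate)."""
    strategies = set(llm_rates) | set(human_rates)
    return {
        s: llm_rates.get(s, 0.0) - human_rates.get(s, 0.0)
        for s in sorted(strategies)
    }


# Toy annotations, invented purely for illustration (NOT data from the paper).
llm_annotations = [
    ["logic", "quantitative_evidence"],
    ["logic"],
    ["quantitative_evidence", "logic"],
    ["logic", "negative_emotion"],
]
human_annotations = [
    ["non_expert_testimony"],
    ["negative_emotion", "non_expert_testimony"],
    ["logic"],
    [],  # a response with no persuasive strategy labeled
]

llm_rates = strategy_rates(llm_annotations)
human_rates = strategy_rates(human_annotations)
gaps = rate_gaps(llm_rates, human_rates)

for strategy, gap in gaps.items():
    print(f"{strategy}: {gap:+.2f}")
```

A positive gap means the strategy shows up in a larger share of the LLM responses than of the human ones in the toy data; a real audit would also need a reliable annotation step and a statistical test, which this sketch leaves out.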

Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2604.22109 [cs.HC]
	(or arXiv:2604.22109v2 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2604.22109 arXiv-issued DOI via DataCite

Submission history

From: Nalin Poungpeth
[v1] Thu, 23 Apr 2026 23:01:38 UTC (496 KB)
[v2] Mon, 27 Apr 2026 04:41:35 UTC (496 KB)

Safety/Alignment · Papers/Research

arXiv: cs.AI (full category feed)

Featured: 72