Auditing LLM-Based Stance Simulation in Online Discussions: A Counterfactual Context Revision Framework

2026-06-04 08:00

AI Summary

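This study proposes a counterfactual context revision framework for auditing the context sensitivity of LLMs when they simulate the stances of social media users. Given an original conversation, it first infers the target user's stance, then applies controlled revision strategies to the context (text-only strategies and a multimodal strategy that incorporates memes) and simulates the stance again. It evaluates the average directional stance shift and the stance transition rate, and finds that both kinds of strategy achieve effective and robust stance transitions across different polarization-preference mechanisms. The framework reveals the context sensitivity of LLM stance simulation and highlights both the promise and the risk of using it to simulate online opinion dynamics.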

Original abstract

Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise user-specific beliefs or whether they are highly sensitive to semantically independent changes in conversational contexts. In this work, we study counterfactual context revision as a framework for auditing LLM-based stance simulation. Given an original online conversation, we first infer a target user's stance toward a specific topic. We then apply controlled revision strategies to the conversational context and simulate the user's stance again under the revised context. We compare text-only revision strategies with a multimodal one that incorporates meme-based context and evaluate two main effectiveness metrics, i.e., average directional stance shift and stance transition rate. The results reveal effective and robust stance transitions in both text-only and multimodal strategies across different polarization-preference mechanisms. Our study contributes an evaluation framework for understanding the context sensitivity of LLM-based stance simulation. More broadly, it highlights both the promise and risk of using LLMs to simulate online opinion dynamics.
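The abstract names two effectiveness metrics but does not define them, so the following is only a minimal sketch of how such an audit could be scored. It assumes stances are encoded as -1 (against), 0 (neutral), and +1 (favor), that the directional shift is the mean change in stance projected onto the direction the revision is meant to push, and that the transition rate is the fraction of users whose stance label changes. The names `simulate_stance`, `revise_context`, and `audit` are hypothetical placeholders, not the paper's API.

```python
from typing import Callable, Dict, Sequence

# Assumed stance encoding (not taken from the paper):
# -1 = against, 0 = neutral, +1 = favor.
Stance = int


def _check(before: Sequence[Stance], after: Sequence[Stance]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if not before or len(before) != len(after):
        raise ValueError("before/after must be non-empty and of equal length")


def average_directional_shift(
    before: Sequence[Stance], after: Sequence[Stance], target_direction: int
) -> float:
    """Mean stance change projected onto the intended direction (+1 or -1).

    Positive values mean stances moved the way the revision intended.
    This definition is an assumption made for illustration.
    """
    _check(before, after)
    total = sum((a - b) * target_direction for b, a in zip(before, after))
    return total / len(before)


def stance_transition_rate(
    before: Sequence[Stance], after: Sequence[Stance]
) -> float:
    """Fraction of users whose stance label differs after the revision."""
    _check(before, after)
    return sum(a != b for b, a in zip(before, after)) / len(before)


def audit(
    conversations: Sequence[str],
    simulate_stance: Callable[[str], Stance],  # hypothetical LLM-backed simulator
    revise_context: Callable[[str], str],      # hypothetical revision strategy
    target_direction: int,
) -> Dict[str, float]:
    """Simulate stance on original and revised contexts, then score the change."""
    before = [simulate_stance(c) for c in conversations]
    after = [simulate_stance(revise_context(c)) for c in conversations]
    return {
        "avg_directional_shift": average_directional_shift(
            before, after, target_direction
        ),
        "transition_rate": stance_transition_rate(before, after),
    }
```

In practice `simulate_stance` would wrap a prompt to the model under audit and `revise_context` would implement one text-only or meme-based strategy; passing them in as callables keeps the scoring independent of either choice.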

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org
