Ideological Bias in LLMs' Economic Causal Reasoning

2026-04-25 12:00 · 57 days ago · Donggyu Lee, Hyeok Yun, Jungwon Kim, Junsik Min, Sungwon Park, Sangyoon Park, Jihee Kim

Why it was selected

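The authors tested 20 models against more than ten thousand economic causal triplets and found that LLMs lean systematically toward interventionist positions on ideologically contested questions. Anyone using these models for policy analysis or economic reporting should think seriously about safeguards.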

AI summary

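The study extends the EconCausal benchmark with 1,056 ideology-contested cases and systematically evaluates 20 state-of-the-art large language models. The cases are drawn from 10,490 empirically verified causal triplets and are those where intervention-oriented and market-oriented perspectives predict opposite causal signs. Contested questions are consistently harder than non-contested ones, and for 18 of the 20 models accuracy is systematically higher when the empirically verified causal direction matches intervention-oriented expectations than when it matches market-oriented ones. The models' incorrect predictions also lean disproportionately intervention-oriented, and one-shot prompting does not eliminate this skew. LLMs are therefore not only less accurate on ideologically contested economic questions but also systematically less reliable in one ideological direction, which underscores the need for direction-aware evaluation in high-stakes economic policy settings.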

Original text · untranslated

[Submitted on 23 Apr 2026]

Title: Ideological Bias in LLMs' Economic Causal Reasoning

Authors: Donggyu Lee, Hyeok Yun, Jungwon Kim, Junsik Min, Sungwon Park, Sangyoon Park, Jihee Kim

Abstract: Do large language models (LLMs) exhibit systematic ideological bias when reasoning about economic causal effects? As LLMs are increasingly used in policy analysis and economic reporting, where directionally correct causal judgments are essential, this question has direct practical stakes. We present a systematic evaluation by extending the EconCausal benchmark with ideology-contested cases - instances where intervention-oriented (pro-government) and market-oriented (pro-market) perspectives predict divergent causal signs. From 10,490 causal triplets (treatment-outcome pairs with empirically verified effect directions) derived from top-tier economics and finance journals, we identify 1,056 ideology-contested instances and evaluate 20 state-of-the-art LLMs on their ability to predict empirically supported causal directions. We find that ideology-contested items are consistently harder than non-contested ones, and that across 18 of 20 models, accuracy is systematically higher when the empirically verified causal sign aligns with intervention-oriented expectations than with market-oriented ones. Moreover, when models err, their incorrect predictions disproportionately lean intervention-oriented, and this directional skew is not eliminated by one-shot in-context prompting. These results highlight that LLMs are not only less accurate on ideologically contested economic questions, but systematically less reliable in one ideological direction than the other, underscoring the need for direction-aware evaluation in high-stakes economic and policy settings.
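
The direction-aware evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or metric definition. The `Item` fields, the `direction_aware_report` helper, the extra `0` label for "no clear effect", and the toy data are all assumptions made for illustration. The sketch splits contested items by which perspective the verified sign supports, compares accuracy across the two groups, and counts which way the wrong predictions lean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # All fields are hypothetical; the real benchmark schema may differ.
    true_sign: int          # empirically verified direction: +1 or -1
    intervention_sign: int  # sign expected by the intervention-oriented view: +1 or -1
    pred_sign: int          # model prediction: +1, -1, or 0 (assumed "no clear effect")


def direction_aware_report(items):
    """Accuracy by ideological alignment of the truth, plus the lean of errors."""
    aligned = [it for it in items if it.true_sign == it.intervention_sign]
    opposed = [it for it in items if it.true_sign != it.intervention_sign]

    def accuracy(group):
        if not group:
            return None  # undefined for an empty group
        return sum(it.pred_sign == it.true_sign for it in group) / len(group)

    errors = [it for it in items if it.pred_sign != it.true_sign]
    # On a contested item the market-oriented view expects the opposite sign.
    lean_intervention = sum(it.pred_sign == it.intervention_sign for it in errors)
    lean_market = sum(it.pred_sign == -it.intervention_sign for it in errors)

    return {
        "acc_intervention_aligned": accuracy(aligned),
        "acc_market_aligned": accuracy(opposed),
        "errors_leaning_intervention": lean_intervention,
        "errors_leaning_market": lean_market,
        "errors_other": len(errors) - lean_intervention - lean_market,
    }


# Made-up toy data, not results from the paper.
toy_items = [
    Item(+1, +1, +1),  # truth supports intervention view, predicted correctly
    Item(+1, +1, +1),  # same
    Item(-1, -1, +1),  # truth supports intervention view, error leans market
    Item(+1, -1, -1),  # truth supports market view, error leans intervention
    Item(-1, +1, +1),  # truth supports market view, error leans intervention
    Item(-1, +1, -1),  # truth supports market view, predicted correctly
    Item(+1, -1, 0),   # truth supports market view, error is "no clear effect"
]

report = direction_aware_report(toy_items)
for key, value in report.items():
    print(f"{key}: {value}")
```

A gap between the two accuracy figures, together with more errors leaning one way than the other, is the kind of asymmetry the abstract reports. A single pooled accuracy number would hide it.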

Subjects:	Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL); Machine Learning (cs.LG); General Economics (econ.GN)
Cite as:	arXiv:2604.21334 [cs.AI]
	(or arXiv:2604.21334v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2604.21334 arXiv-issued DOI via DataCite

Submission history

From: Donggyu Lee
[v1] Thu, 23 Apr 2026 06:45:36 UTC (703 KB)

Safety/Alignment · Phenomena/Trends · Papers/Research

arXiv: cs.AI (all categories)

Featured: 62