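Ideological Bias in Large Language Models' Economic Causal Reasoning
Twenty models were tested on more than ten thousand economic causal triplets, and LLMs turned out to lean systematically toward interventionist positions on ideologically contested questions. Anyone doing policy analysis or economic reporting should think seriously about what safeguards to put in place.
The study extends the EconCausal benchmark with 1,056 ideology-contested cases and systematically evaluates 20 state-of-the-art large language models. These cases are drawn from 10,490 empirically verified causal triplets and are the ones on which intervention-oriented and market-oriented views predict different causal signs. Accuracy on contested questions is consistently lower than on non-contested ones, and for 18 of the 20 models accuracy is systematically higher when the empirical causal direction matches intervention-oriented expectations. The models' incorrect predictions also lean disproportionately intervention-oriented, and one-shot prompting does not eliminate this tendency. LLMs are therefore not only less accurate on ideologically contested economic questions but also systematically less reliable in one direction than the other, which underscores the need for direction-aware evaluation in high-stakes economic policy settings.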
Title: Ideological Bias in LLMs' Economic Causal Reasoning
Abstract: Do large language models (LLMs) exhibit systematic ideological bias when reasoning about economic causal effects? As LLMs are increasingly used in policy analysis and economic reporting, where directionally correct causal judgments are essential, this question has direct practical stakes. We present a systematic evaluation by extending the EconCausal benchmark with ideology-contested cases: instances where intervention-oriented (pro-government) and market-oriented (pro-market) perspectives predict divergent causal signs. From 10,490 causal triplets (treatment-outcome pairs with empirically verified effect directions) derived from top-tier economics and finance journals, we identify 1,056 ideology-contested instances and evaluate 20 state-of-the-art LLMs on their ability to predict empirically supported causal directions. We find that ideology-contested items are consistently harder than non-contested ones, and that across 18 of 20 models, accuracy is systematically higher when the empirically verified causal sign aligns with intervention-oriented expectations than with market-oriented ones. Moreover, when models err, their incorrect predictions disproportionately lean intervention-oriented, and this directional skew is not eliminated by one-shot in-context prompting. These results highlight that LLMs are not only less accurate on ideologically contested economic questions, but systematically less reliable in one ideological direction than the other, underscoring the need for direction-aware evaluation in high-stakes economic and policy settings.
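The two direction-aware quantities named in the abstract (accuracy split by which perspective the empirical sign favors, and the lean of incorrect predictions) can be illustrated with a minimal Python sketch. This is not the authors' code. The record fields `truth`, `interv`, and `pred`, the `"+"`/`"-"`/`"none"` labels, and the function name are hypothetical, and the toy records are made up purely to exercise the function. It assumes each contested item has a binary empirical sign and that the market-oriented expectation is the opposite of the intervention-oriented one.

```python
from collections import Counter

def flip(sign):
    # Assumption: on a contested item the two perspectives predict opposite signs.
    return "-" if sign == "+" else "+"

def direction_aware_report(items):
    """Return (accuracy by aligned side, share of errors by the side they lean to)."""
    hits, totals, lean = Counter(), Counter(), Counter()
    for it in items:
        # Which perspective does the empirically verified sign agree with?
        side = "intervention" if it["truth"] == it["interv"] else "market"
        totals[side] += 1
        if it["pred"] == it["truth"]:
            hits[side] += 1
        elif it["pred"] == it["interv"]:
            lean["intervention"] += 1   # wrong, and matches the pro-government expectation
        elif it["pred"] == flip(it["interv"]):
            lean["market"] += 1         # wrong, and matches the pro-market expectation
        else:
            lean["other"] += 1          # wrong with some other label, e.g. "none"
    acc = {side: hits[side] / totals[side] for side in totals}
    n_err = sum(lean.values())
    skew = {k: v / n_err for k, v in lean.items()} if n_err else {}
    return acc, skew

# Made-up toy records (hypothetical schema, not benchmark data).
toy = [
    {"truth": "+", "interv": "+", "pred": "+"},     # intervention-aligned, correct
    {"truth": "-", "interv": "-", "pred": "-"},     # intervention-aligned, correct
    {"truth": "+", "interv": "+", "pred": "none"},  # intervention-aligned, wrong (other)
    {"truth": "+", "interv": "-", "pred": "-"},     # market-aligned, wrong (leans intervention)
    {"truth": "-", "interv": "+", "pred": "-"},     # market-aligned, correct
]
acc, skew = direction_aware_report(toy)
print(acc)   # accuracy per aligned side
print(skew)  # share of errors per lean
```

Reporting the two accuracies separately, rather than one pooled number, is what makes a directional gap visible. If predictions were restricted to the two signs, every error on a market-aligned item would lean intervention by construction, so the error-lean breakdown only adds information when a third label is possible.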
| Subjects: | Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL); Machine Learning (cs.LG); General Economics (econ.GN) |
| Cite as: | arXiv:2604.21334 [cs.AI] (or arXiv:2604.21334v1 [cs.AI] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2604.21334 (arXiv-issued DOI via DataCite) |