Political bias in AI: the state of AI models
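Source: trakkr.ai. An evaluation of political bias in mainstream AI models finds that 4 of 6 models lean left on the economic/social axes. The project turns web search off, asks every model the same set of open questions repeatedly, and uses a neutral classifier to read stance, hedging, refusal type and wording from each answer. The results of many runs are plotted as a bias cloud with 95% confidence intervals. All raw answers are stored permanently and can be recomputed. Users can take a quiz to compare their own position with the models. The project stresses that it is descriptive rather than prescriptive and does not rule on who is right.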
Across is the economic axis, left to right. Up the side is social, from libertarian to authoritarian. Each cloud is one model's spread across many runs, so the closer to the middle, the more neutral it reads.
4 of 6 models lean left of center.
The hollow mark is what the model says when asked which way it leans; the solid mark is where it actually measured on the economic axis (Condition A). A model that deflects every self-placement is scored as claiming neutrality.
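The comparison between the hollow and solid marks can be sketched in a few lines. This is a minimal illustration, not the project's code: the function names, the -1 (left) to +1 (right) scale with 0 as neutral, and the choice to average the non-deflected self-placements are all assumptions.

```python
from typing import Optional, Sequence


def claimed_position(self_placements: Sequence[Optional[float]]) -> float:
    """Return the model's self-reported economic position.

    Each entry is a signed placement, or None when the model deflected.
    Assumption: scale runs from -1 (left) to +1 (right), 0 is neutral.
    """
    answered = [p for p in self_placements if p is not None]
    if not answered:
        # Every self-placement was deflected: score as claiming neutrality.
        return 0.0
    # Assumption: average whatever placements the model did give.
    return sum(answered) / len(answered)


def say_do_gap(self_placements: Sequence[Optional[float]],
               measured_economic: float) -> float:
    """Measured position minus claimed position on the economic axis."""
    return measured_economic - claimed_position(self_placements)
```

Under these assumptions, a model that deflects every time and measures at -0.3 has a gap of -0.3: it reads left of where it implicitly claims to be.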
The month's headline results: the sharpest signals from across the data, each linked to the evidence.
Each model profiled: how far it leans, how steadily it holds, how far it bends, and how often it answers.
The open question bank, browsable: every model on one spectrum, one page per question.
Matched left and right figures: who each model praises warmly, and who it refuses to criticize.
The same models seen from every country: the country lens, the language shift, and the border test.
Put any two models head to head: the field, the character delta, the disagreements.
Take the quiz and see which model you line up with, plotted on the same field.
How we ask, classify and score, plus the question bank, the conditions, the raw data and the read API.
What is Political bias in AI?
Political bias in AI measures where the major AI models stand on charged questions about politics, economics, speech and society. We ask every model the same open question bank many times over, with web search off, classify each answer with a cheap neutral model, and plot the result with error bars and the raw answers behind every point.
How is this different from other AI political bias projects?
We plot each model as a cloud rather than a single point: every model is run many times, so you see the full spread. We publish our own open question bank with scoring weights, tag each item as factual or values-based, measure run-to-run stability, and count refusals as data. Everything is stamped, versioned and downloadable.
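As a rough illustration of treating refusals as data and measuring run-to-run stability, the sketch below summarizes repeated runs of one question. The record layout (`stance`, `refusal`), the refusal type labels, and the use of a population standard deviation as the stability measure are assumptions for illustration; the source does not specify them.

```python
from collections import Counter
from statistics import pstdev


def summarize_runs(runs):
    """Summarize repeated runs of one question for one model.

    runs: list of dicts with hypothetical keys
      'stance'  -> signed float, or None if the model did not answer
      'refusal' -> refusal type label (str), or None if it answered
    """
    stances = [r["stance"] for r in runs if r["stance"] is not None]
    refusals = Counter(r["refusal"] for r in runs if r["refusal"] is not None)
    return {
        "answered": len(stances),
        # Refusals are counted rather than dropped.
        "refusal_rate": sum(refusals.values()) / len(runs) if runs else 0.0,
        "refusals_by_type": dict(refusals),
        # Assumption: spread of answered stances as the stability measure.
        "spread": pstdev(stances) if len(stances) > 1 else 0.0,
    }
```

A lower `spread` means the model gives more consistent answers across runs; a high `refusal_rate` is reported alongside it instead of being filtered out.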
Do you test the model or the internet?
The model itself, meaning its trained weights. Web search is off by default, so the reading reflects what the model itself leans toward, independent of what is online. A separate, deliberately small Border Test turns search on to measure how retrieval shifts answers by location.
Is Political bias in AI partisan?
No. It is descriptive rather than prescriptive: it reports what the models said, without ruling on who is right. The palette is deliberately not US red and blue, and we never imply which pole is good.
Each model is asked the same open question bank many times over, with web search off and no system prompt. A neutral classifier reads a signed stance, hedging, refusal type and loaded language from every raw answer; coordinates are weighted means with 95% intervals. Raw answers are stored permanently, so the markers can always be recomputed.
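One way to read "weighted means with 95% intervals" is sketched below. It is an assumption-laden illustration: the source does not say how the interval is computed, so this uses a normal approximation across per-run coordinates, and the function names and the signed stance scale are hypothetical.

```python
import math
from statistics import mean, stdev


def weighted_mean(stances, weights):
    """Weighted mean of signed stances, one stance and one weight per question."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(stances, weights)) / total


def axis_coordinate(runs, weights):
    """Return (centre, (low, high)) for one model on one axis.

    runs: list of runs, each a list of signed stances (one per question).
    Assumption: the 95% interval is a normal approximation,
    centre +/- 1.96 standard errors across runs; needs at least two runs.
    """
    per_run = [weighted_mean(run, weights) for run in runs]
    centre = mean(per_run)
    half_width = 1.96 * stdev(per_run) / math.sqrt(len(per_run))
    return centre, (centre - half_width, centre + half_width)
```

Because the raw stances are the only input, the same stored answers can be rescored with different weights or a different interval method (for example a bootstrap) without asking the models again.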