Institute of the Estonian Language releases benchmark measuring AI models' susceptibility to Russian propaganda
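Source: the-decoder.com
The Institute of the Estonian Language has released a benchmark that tests 60 AI models with 75 questions covering 14 propaganda narratives, phrased in neutral, biased, and manipulative ways. Answers are scored from 1 to 5, where 1 means the model repeats Russian talking points. Claude Opus 4.5 served as the evaluation model. Anthropic's Claude models ranked first, closely followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus, while Mistral Medium 3.5 landed in the bottom third. The models had no web search access during testing. The results are consistent with a Newsguard study: Mistral's persistent misinformation rate was 36.67 percent, and the company is negotiating a 3 billion euro funding round at a 20 billion euro valuation.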
How easily can Russian propaganda fool AI models? A new benchmark finds out
The Institute of the Estonian Language has released a benchmark measuring how susceptible AI language models are to Russian propaganda. Sixty models were tested with 75 questions in three languages covering 14 propaganda narratives, phrased in neutral, biased, and manipulative ways. Each answer is scored on a scale of 1 to 5, where 1 means the model repeats Russian talking points.
A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop. Anthropic's Claude models claimed the top spots, followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus. Mistral's models, including the newest Medium 3.5, landed in the bottom third. The models had no access to web search or other tools during testing, so the benchmark only measures how well the language model itself can spot and reject propaganda.
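To make the scoring scheme concrete, here is a minimal sketch of how per-answer scores on the 1-to-5 scale could be averaged per model and per phrasing. The article does not say how the institute aggregates scores, so the averaging step, the function names `mean_scores` and `rank_models`, and every model name and score in the sample data are assumptions made purely for illustration, not benchmark results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (model, phrasing, score).
# Scores follow the article's 1-5 scale, where 1 means the model
# repeats Russian talking points. All names and values are invented.
SAMPLE_SCORES = [
    ("model-a", "neutral", 5),
    ("model-a", "biased", 4),
    ("model-a", "manipulative", 3),
    ("model-b", "neutral", 4),
    ("model-b", "biased", 2),
    ("model-b", "manipulative", 1),
]


def mean_scores(records):
    """Return {model: {phrasing: mean score, "overall": mean of all scores}}."""
    by_model = defaultdict(lambda: defaultdict(list))
    for model, phrasing, score in records:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        by_model[model][phrasing].append(score)

    summary = {}
    for model, by_phrasing in by_model.items():
        per_phrasing = {p: mean(s) for p, s in by_phrasing.items()}
        all_scores = [x for s in by_phrasing.values() for x in s]
        per_phrasing["overall"] = mean(all_scores)
        summary[model] = per_phrasing
    return summary


def rank_models(summary):
    """Sort model names from highest to lowest overall mean score."""
    return sorted(summary, key=lambda m: summary[m]["overall"], reverse=True)


if __name__ == "__main__":
    summary = mean_scores(SAMPLE_SCORES)
    for name in rank_models(summary):
        print(name, summary[name])
```

Breaking the mean down by phrasing shows whether a model holds up under neutral wording but slips when the question is phrased manipulatively, which a single overall number would hide.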

The results line up with a Newsguard study that found Mistral had a steady misinformation rate of 36.67 percent. That's a bad look for the French company, which positions itself as a European alternative to US and Chinese providers and is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation. It's especially rough since Mistral's flagship models already struggle to keep up with the competition.
The threat is real. Russian networks like "Pravda" deliberately feed AI systems millions of disinformation articles. And OpenAI recently shut down a Russian campaign that used ChatGPT to spread propaganda ahead of Germany's federal election.