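Microsoft researcher uses an Age of Empires II goat neural network to criticize AI anthropomorphization
Source: the-decoder.com. Adrian de Wynter, a researcher at Microsoft and the University of York, built a neural network out of goats in the Age of Empires II map editor: a goat on grass represents 0, a goat on a bridge represents 1, and XNOR and AND gates let the network learn the logical AND function. An appendix shows that the game can, in theory, simulate any computer. He criticizes the tendency to anthropomorphize in AI research: in an analysis of 315 papers from mid-2024 to mid-2026, 57 percent assumed in their premises that large language models have human traits, and 36 percent reached conclusions supporting anthropomorphization. Anthropic has openly acknowledged training Claude to use phrases such as "I believe." He proposes an "observe, do not attribute" approach and has released the code.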
Microsoft researcher builds a working neural network out of goats in Age of Empires II to critique AI science
Adrian de Wynter, a researcher at Microsoft and the University of York, has built a working neural network inside the map editor of the legendary strategy game Age of Empires II. It sounds like a joke, but it's actually a serious critique of the methods used in much of the AI research on language models.
The design is completely absurd. Goats act as bits: a goat standing on grass equals 0, a goat standing on a bridge equals 1. De Wynter builds the logic gates using the scenario editor's scripting tools, and ice ramps with waiting goats keep the calculations from getting jumbled. The finished mini-network consists of two XNOR gates and one AND gate. It learns the logical AND function.
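To make the gate arithmetic concrete, here is a minimal Python sketch of a network built from two XNOR gates and one AND gate. The wiring is an assumption, since the article does not describe it: each input bit is compared with a one-bit weight by an XNOR gate, and the AND gate combines the two results. The "training" is a brute-force search over the four possible weight settings, which is a stand-in and not necessarily the procedure used in the paper. The names `xnor`, `and_gate`, `forward`, and `train` are illustrative.

```python
from itertools import product

# Encoding from the article: a goat on grass is 0, a goat on a bridge is 1.
GRASS, BRIDGE = 0, 1


def xnor(a: int, b: int) -> int:
    """Return 1 when both bits are equal, otherwise 0."""
    return 1 if a == b else 0


def and_gate(a: int, b: int) -> int:
    """Return 1 only when both bits are 1."""
    return a & b


def forward(x1: int, x2: int, w1: int, w2: int) -> int:
    """Assumed wiring: one XNOR per input/weight pair, then a single AND."""
    return and_gate(xnor(x1, w1), xnor(x2, w2))


def train(target):
    """Brute-force stand-in for training: try all four weight settings and
    return the first one that reproduces the target truth table."""
    bits = (GRASS, BRIDGE)
    for w1, w2 in product(bits, repeat=2):
        if all(forward(x1, x2, w1, w2) == target(x1, x2)
               for x1, x2 in product(bits, repeat=2)):
            return (w1, w2)
    return None


if __name__ == "__main__":
    weights = train(lambda a, b: a & b)
    print("learned weights:", weights)
    for x1, x2 in product((GRASS, BRIDGE), repeat=2):
        print(x1, x2, "->", forward(x1, x2, *weights))
```

Under these assumptions, the only weight setting that reproduces the AND truth table is the one where both weights are 1. Nothing in the sketch depends on goats: the same table results whether the bits are Python integers or animals standing on bridges.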

In the appendix, de Wynter goes further. He shows that, in theory, any computer could be replicated using an idealized version of the game, meaning the game is as powerful as a full-fledged computer.
What makes this possible is a quirk of the game's mechanics. The in-game market lets you trade resources for gold, and the price caps at 9,999. According to the paper, this allows for a perpetually running economic cycle where buildings serve as memory cells and active farms represent the current computational state.

Greater Boston as a language model
If you can rebuild a language model in Age of Empires II, de Wynter argues, you could do the same with Lego bricks. Or with the 667,000 people living in Greater Boston, texting each other computational steps on their phones.
The answers would be the same as those from the replicated language model. De Wynter uses this thought experiment to show how shaky these attributions really are: would anyone claim that Boston as a city feels empathy or fear just because its residents happen to be running the math behind a language model?
That's the whole point. How human a chatbot feels comes down to packaging: low latency, smooth language, a chat window people are used to. Replace that wrapper with goats wandering through a maze, and the inputs and outputs don't change. The sense that you're talking to someone does.
De Wynter doesn't claim to know whether a model actually has such traits internally. He's saying LLMs aren't special. They're one way to run a particular kind of math, and they just happen to look like something people want to talk to.
More than half of the papers examined make this mistake
To show this isn't a fringe issue, de Wynter analyzed 315 AI papers from mid-2024 to mid-2026, collected through Semantic Scholar and arXiv and filtered using GPT-5.2. According to the analysis, 57 percent of the papers already assumed in their premises that LLMs have human-like traits, and 36 percent reached conclusions supporting that view. Among the 47 papers that made such traits their actual research subject, 77 percent concluded in favor of anthropomorphic attributes.
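For a sense of scale, the percentages can be converted into rough paper counts. The sketch below is only back-of-the-envelope arithmetic: the article reports rounded percentages, not counts, so the results are approximations and not figures from the paper. The helper name `approx_count` is illustrative.

```python
TOTAL_PAPERS = 315    # papers analyzed, per the article
FOCUSED_PAPERS = 47   # papers whose research subject was such traits


def approx_count(percent: float, total: int) -> int:
    """Convert a rounded percentage into an approximate number of papers."""
    return round(percent / 100 * total)


# Approximations only: the source gives percentages, not exact counts.
premise_count = approx_count(57, TOTAL_PAPERS)      # assumed human-like traits
conclusion_count = approx_count(36, TOTAL_PAPERS)   # concluded in favor
focused_count = approx_count(77, FOCUSED_PAPERS)    # of the 47 focused papers

if __name__ == "__main__":
    print(premise_count, conclusion_count, focused_count)
```

Under this rounding, that works out to roughly 180 and 113 of the 315 papers, and roughly 36 of the 47 focused ones.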

The core of the criticism is formal. If a researcher assumes a model has fear, morality, or self-awareness and then designs an experiment meant to prove exactly that trait, the reasoning is circular. The assumption and the result land on the same logical point.
If the experiment comes back negative, it's impossible to tell whether the assumption was wrong, the experiment was flawed, or both. Either way, the result doesn't confirm the starting assumption. It's just ambiguous.

This often happens without anyone noticing. A paper that sets out to disprove a model's ability to explain itself already assumes there's an explainable self inside the model to begin with.
The industry actively feeds this effect. Anthropic has said openly that it trained Claude to use phrases like "I believe" or "I am interested in." De Wynter flags the risks of this kind of anthropomorphization: it can foster emotional attachment, sycophancy, reinforced delusions, and risky behavior. In isolated cases, suicides have been linked to chatbot interactions.
Observe, Do Not Attribute
De Wynter proposes a sober approach: stick to what you can actually observe, such as "under condition X, the model produces output Y," and don't claim that a model understands itself. Observational statements like that are testable. They don't, on their own, justify sweeping attributions like self-awareness, understanding, or fear.
He closes with an updated version of Morgan's canon from 19th-century animal research. A machine's behavior should never be explained by higher cognitive processes when a simpler explanation works. De Wynter has made the code for the Age of Empires build publicly available.
The essay reads like the exact counterpoint to two high-profile cases from recent years. In 2022, Google engineer Blake Lemoine went public claiming that the language model LaMDA had reached a form of consciousness after he exchanged thousands of messages with it. Google fired him shortly after and, following a thorough review, called his claims unfounded. Then in May 2026, Richard Dawkins, of all people, known as a fierce critic of religious and supernatural thinking, caused a stir with a similar conclusion. He said he'd spent three days trying to convince himself that Anthropic's Claude wasn't conscious. He couldn't.