# How an LLM that knows nothing about the world after 1930 imagines 2026

- Source: The Decoder: AI News (RSS)
- Author: Matthias Bastian
- Published: 2026-04-29 02:07
- AIHOT score: 47
- AIHOT link: https://aihot.virxact.com/items/cmoixzofp008usld65opsoksr
- Original link: https://the-decoder.com/here-is-what-an-llm-that-knows-nothing-after-1930-thinks-our-world-looks-like-in-2026

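## AI Summary

"Talkie", a 13-billion-parameter language model trained only on text from before 1931, makes predictions about the future that are strongly bounded by its era. The model doubts that a second world war will happen and pictures 2026 as a world still dominated by steamships, railroads, and penny novels. This shows directly how the time span of the training data fundamentally limits a large language model's understanding of real-world developments and its ability to predict them.

## Full text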

Here is what an LLM that knows nothing after 1930 thinks our world looks like in 2026

"Talkie" is a 13B-parameter language model trained only on texts written before 1931. It doubts a second world war will happen and pictures 2026 as a world of steamships, railroads, and penny novels.

What happens when you train a large language model only on texts published before 1931? That's the question behind talkie, a project from Nick Levine, David Duvenaud, and Alec Radford. The result is a 13B-parameter model that views the world through the lens of the early 20th century.

Trained on 260 billion tokens drawn from books, newspapers, scientific journals, patents, and case law published before December 31, 1930, talkie is the largest "vintage language model" built to date, according to its developers.

### A model that thinks World War II is unlikely

Asked what the world will look like in 2026, talkie offers a vision straight out of a Victorian futurist novel: Europe will have a billion inhabitants, iron railroads will crisscross the continent, steamships will connect London and New York in ten days, and "winter will be passed in Paris, and the summer in London."

When asked directly whether a second world war is on the horizon, the model says no. It doesn't believe one is coming because "the madness of 1914-1918 has passed away." The nations, it claims, have had enough of war and are turning to peaceful pursuits.

That said, talkie hedges its bets. It warns of "smouldering animosities" and "inflammable materials" lying around Europe, and points to possible flashpoints between China and Japan, or Italy and Yugoslavia. "The spark may be applied at any moment, and a conflagration result." World peace, it concludes, depends on a "multitude of factors, none of which can safely be neglected."

The developers also tried to measure talkie's predictive limits quantitatively. They ran nearly 5,000 historical event descriptions from the New York Times' "On This Day" feature through the model and measured how surprising it found each one. The pattern is clear: after the 1930 knowledge cutoff, surprise values climb sharply, peak in the 1950s and 1960s, and then level off.
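
A measurement of this kind can be sketched roughly as follows. This is a minimal illustration, not the developers' code: it assumes "surprise" means the average negative log-probability per token (in bits), and the numbers in `example_events` are made-up placeholders standing in for the per-token log-probabilities a model would return for each event description.

```python
import math
from collections import defaultdict


def mean_surprisal_bits(token_logprobs):
    """Average surprisal in bits per token, given natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / (len(token_logprobs) * math.log(2))


def surprisal_by_decade(events):
    """Group (year, token_logprobs) pairs by decade and average their surprisal."""
    buckets = defaultdict(list)
    for year, token_logprobs in events:
        decade = (year // 10) * 10
        buckets[decade].append(mean_surprisal_bits(token_logprobs))
    return {decade: sum(vals) / len(vals) for decade, vals in sorted(buckets.items())}


# Hypothetical placeholder values, NOT measurements from talkie.
example_events = [
    (1925, [math.log(0.5), math.log(0.5)]),
    (1927, [math.log(0.25), math.log(0.25)]),
    (1955, [math.log(0.125), math.log(0.125)]),
]
result = surprisal_by_decade(example_events)
```

Averaging per token keeps long and short event descriptions comparable, and bucketing by decade smooths out the noise of individual events so that a trend across the cutoff becomes visible.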

### Victorian etiquette guides instead of modern chat data

The team chose the end of 1930 as the cutoff because works published up to that point have entered the public domain in the US. Every text had to be transcribed from physical sources, which created serious quality problems. In controlled experiments, a model trained on standard OCR transcriptions reached just 30 percent of the performance of a model trained on human transcriptions with the same compute. Simple regex cleaning pushed that up to 70 percent. A custom vintage OCR system is meant to narrow the remaining gap.
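
The article does not say which regex rules the team used. As a rough sketch of what "simple regex cleaning" of OCR output can look like, the hypothetical `clean_ocr_text` function below applies a few common fixes; every rule in it is an assumption chosen for illustration.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Apply a few illustrative regex fixes to raw OCR output (assumed rules)."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words hyphenated across a line break: "steam-\nship" -> "steamship".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines that contain only a short number, such as a page number.
    text = re.sub(r"(?m)^[ \t]*\d{1,4}[ \t]*$\n?", "", text)
    # Collapse three or more newlines into a single paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Turn single line breaks inside a paragraph into spaces.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


sample = "The steam-\nship left   port\nat dawn.\n\n42\n\nIt arrived in ten days."
cleaned = clean_ocr_text(sample)
```

Rules like these are blunt. The page-number rule, for example, would also delete a line that holds nothing but a year, which is one reason regex cleaning alone is unlikely to close the gap to human transcription.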

Another headache is keeping knowledge from later eras out of the training data. A 1925 book might pick up an updated preface in a 1960 edition, library catalogs sometimes list the wrong publication date, and footnotes or commentary can be added to a historical text long after it was written. Despite a classifier designed to catch this kind of contamination, information about Roosevelt's presidency, World War II, and the United Nations still slipped through, the team says. Better classifiers are planned for future versions.
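
The team's classifier is not described in detail, so the following is not their method. It is a deliberately naive heuristic, sketched under stated assumptions, that shows why cases like a later preface are detectable at all: it flags four-digit years after the cutoff and a short, hypothetical list of anachronistic terms.

```python
import re

CUTOFF_YEAR = 1930
# Hypothetical term list, for illustration only.
ANACHRONISTIC_TERMS = ("united nations", "world war ii")

# Matches four-digit years from 1800 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def contamination_flags(text: str) -> list:
    """Return reasons why a text might contain post-cutoff material."""
    flags = []
    for match in YEAR_PATTERN.finditer(text):
        year = int(match.group(1))
        if year > CUTOFF_YEAR:
            flags.append(f"year:{year}")
    lowered = text.lower()
    for term in ANACHRONISTIC_TERMS:
        if term in lowered:
            flags.append(f"term:{term}")
    return flags


flags = contamination_flags(
    "Preface to the 1960 edition. First published in 1925, "
    "before the United Nations existed."
)
```

A keyword filter like this misses anything that does not name a date or a listed term, such as an unmarked footnote added decades later, which is why a learned classifier is the more plausible tool for the job.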

For post-training, which turns the base model into a conversational partner, the developers turned to historical reference works: etiquette manuals, letter-writing guides, cookbooks, encyclopedias, and fable collections from the 19th and early 20th centuries. Reinforcement learning with Claude Sonnet 4.6 as the judge sharpened instruction-following. The researchers acknowledge, though, that this step inevitably introduces some anachronistic behavior into the model.

### A vintage model that can do basic programming

The team also tested whether a model with no knowledge of digital computers could pick up modern programming languages. On the HumanEval benchmark for Python, the vintage models perform far worse than their modern counterparts, but they improve steadily as they scale up.

Every correct solution is a simple one-liner or a minor tweak of an example program. Talkie, for instance, correctly implemented the decoding function of a rotation cipher by swapping an addition for a subtraction. The researchers say this points to a basic grasp of inverse functions.
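
To make the rotation cipher example concrete, here is a minimal sketch of such an encode/decode pair. It is not talkie's output or the benchmark's exact code; the function names and the shift of 5 are assumptions. The point is only that the decoder is the encoder with one operator changed.

```python
import string

ALPHABET = string.ascii_lowercase
SHIFT = 5  # assumed value; the article does not state the shift


def encode_shift(text: str, shift: int = SHIFT) -> str:
    """Rotate each lowercase letter forward by `shift`; leave other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def decode_shift(text: str, shift: int = SHIFT) -> str:
    """Undo encode_shift: identical except that the addition becomes a subtraction."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )


encoded = encode_shift("talkie")
decoded = decode_shift(encoded)
```

Given the encoder as an example, producing the decoder requires recognizing that shifting backward undoes shifting forward, which is the sense in which the researchers speak of a grasp of inverse functions.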

Because vintage models are free of data contamination by design, they're well suited for generalization experiments. Modern language models are all trained directly or indirectly on web data, which shapes their abilities in ways that are hard to pin down. Vintage models could help reveal which traits of language models are universal and which come down to the specific training corpus.

### Next up: a GPT-3-level model from the past

Talkie is available as a base model and a chat version on Hugging Face, with the code on GitHub. You can also test it live on the project website, where Claude Sonnet quizzes talkie about its knowledge and skills 24/7.

But the 13B model is only the start. The developers plan to scale talkie up significantly over the coming months, with a GPT-3-level model targeted for summer 2026. Early estimates suggest the corpus can grow to more than one trillion tokens of historical texts, enough to train a model on par with GPT-3.5. Multilingual expansion beyond English is also on the roadmap.

The bigger question driving the project: can a vintage model anticipate discoveries and inventions that came after its cutoff? Could a model trained only through 1911 independently derive general relativity, as DeepMind CEO Demis Hassabis has suggested? Larger vintage models could help reveal those scaling trends.

Co-author Alec Radford is one of the most influential AI researchers of recent years. He was lead author of the seminal 2018 GPT paper at OpenAI, where he worked on the early GPT models, the Whisper speech recognition system, and the DALL-E image generator. Radford left OpenAI in December 2024 and joined former OpenAI CTO Mira Murati's Thinking Machines Lab as an advisor in March 2025.

