Rohan Paul @rohanpaul_ai

2026-04-30 07:56 · 64 days ago

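AI Summary

The study finds that when language models face hard problems, their internal "brain activity" shrinks into fewer paths. A confused model compresses its internal thinking: broadly distributed neuron activations collapse into a highly concentrated signal in the final processing layer. This happens because the system abandons its robust distributed memory and forces the computation into a small specialized space to cope with the unfamiliar challenge. The key point is that this shrinking effect can be quantified as a raw number, so there is no need to guess whether a problem is too hard for the AI. By reading this internal signal, the system can automatically be given appropriately sized "stepping stones" to help it solve the problem.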

Researchers found that when language models face harder questions, their internal brain activity literally shrinks into fewer paths.

Language models actually compress their internal thinking when they get confused, and we can use that to help them.

Standard AI models usually spread their thinking across many artificial neurons when they confidently recognize familiar information.

The team discovered that if you confuse a model with tricky math or conflicting facts, this broad activation collapses into a highly concentrated signal in its final processing layer.

This shrinking happens because the system drops its robust distributed memory and forces the computation into a tiny specialized space to survive the unfamiliar challenge.

The big deal is that we usually have no idea when a language model is actually struggling with a weird prompt until it gives a wrong answer.

The paper reports that the model actually signals its confusion internally: it abandons its broad, distributed activation pattern and falls back on a very small cluster of active neurons.

Because we can measure this exact shrinking effect as a raw number, we do not have to guess if a question is too hard for the AI.
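To make "a raw number" concrete, here is a minimal sketch of one common way to score how concentrated an activation vector is. It uses Hoyer sparsity, which is my assumption for illustration and not necessarily the metric the paper uses. The function name `hoyer_sparsity` and the toy vectors are hypothetical stand-ins for a real final-layer hidden state.

```python
import math

def hoyer_sparsity(activations):
    """Hoyer sparsity in [0, 1]: 0 for a perfectly uniform vector, 1 for a one-hot vector.

    Assumption: this is a stand-in metric for illustration; the paper may define
    sparsity differently.
    """
    n = len(activations)
    if n < 2:
        raise ValueError("need at least two activations")
    l1 = sum(abs(a) for a in activations)            # L1 norm
    l2 = math.sqrt(sum(a * a for a in activations))  # L2 norm
    if l2 == 0.0:
        return 0.0  # all-zero vector: treated as not sparse by convention here
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

# Toy stand-ins for a final-layer hidden state (hypothetical values).
dense = [1.0] * 16                 # activation spread evenly over all units
concentrated = [0.0] * 15 + [4.0]  # activation collapsed onto one unit

print(hoyer_sparsity(dense))         # low score: broad activation
print(hoyer_sparsity(concentrated))  # high score: concentrated activation
```

A higher score means the signal sits in fewer units, which is the direction the post describes for harder or more unfamiliar inputs.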

We can just read that internal signal and automatically provide the system with the perfectly scaled stepping stones it needs to solve the problem.
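As a rough sketch of what "reading the signal" could look like in practice, the snippet below maps a sparsity score in [0, 1] to a scaffold level. Everything here is hypothetical: the function name `pick_scaffold_level`, the thresholds, and the idea of three discrete levels are my assumptions, not the paper's procedure.

```python
def pick_scaffold_level(sparsity, thresholds=(0.3, 0.6)):
    """Map a sparsity score in [0, 1] to a scaffold level.

    Level 0: ask the question directly.
    Level 1: add a few easier intermediate steps.
    Level 2: add more, and easier, intermediate steps.
    The thresholds are illustrative placeholders, not values from the paper.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    level = 0
    for t in sorted(thresholds):
        if sparsity >= t:
            level += 1  # each threshold crossed adds one level of help
    return level

for score in (0.1, 0.45, 0.9):
    print(score, "->", pick_scaffold_level(score))
```

In a real setup the thresholds would need to be calibrated per model, since the raw scores are unlikely to be comparable across architectures.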

----

Paper Link - arxiv.org/abs/2603.03415

Paper Title: "Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs"

Safety/Alignment · Reasoning · Papers/Research
