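Researchers from top labs at Harvard, Stanford, UC Berkeley, and elsewhere jointly argue that deep learning is moving from empirical optimization toward an explainable scientific theory. Although a neural network's architecture, data, and other ingredients are fully open, their complex interactions mean that predicting how training will unfold still depends on extensive experimentation. The authors advocate building a "learning mechanics" that, like physics, focuses on macroscopic regularities. It would use five routes, including solvable toy models, infinite-width limits, and scaling laws, to reveal the holistic laws governing training dynamics and the evolution of performance. This theory complements mechanistic interpretability research, which focuses on local circuits; together they explore the global laws of learning.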
Beautiful new paper from Harvard, Stanford, UC Berkeley and other top labs.
It argues that deep learning is finally becoming the kind of thing science can explain, not just optimize.
Today we still lack a compact, predictive theory that tells us ahead of time how a neural network will learn, scale, and respond to training choices; mostly we have to test it first.
The claim is not that we will soon explain every weight, but that we may learn the coarse laws governing training, representation, and performance.
That shift matters because neural nets are not hidden systems. We know the architecture, the data, the objective, and the update rule. The obstacle is not secrecy. It is the complexity of many simple parts interacting at once.