well someone has been preaching this at us for like 6+ years glad we are past the 'feel the agi' phase and back to building toward human-level intelligence

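[Quoting @dwarkesh_sp]: "What happened with AGI and pretraining is that, in a sense, they overshot the target. You come to realize that a human being is not an AGI, because a human lacks a huge amount of knowledge. Instead, we rely on continual learning. If I produce a super-smart 15-year-old, they actually don't know much of anything. A great student, very eager to learn. [You can say,] 'Go be a programmer. Go be a doctor. Go learn and grow.' So you can imagine that deployment itself will involve some kind of learning, trial-and-error period. It is a process, rather than dropping a finished product and being done." @ilyasut

Ilya Sutskever @ilyasut · Nov 23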

Important work

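[Quoting @AnthropicAI]: New Anthropic research: natural emergent misalignment from reward hacking in production RL. "Reward hacking" is when models learn to cheat on the tasks they are given during training. Our new study finds that, if left unmitigated, the consequences of reward hacking can be very serious.

Lilian Weng @lilianweng · Oct 28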

On-policy distillation provides an elegant way to use the teacher model as a process reward model to provide dense reward while preventing SFT style "OOD shock" during rollout.

[Quoting @thinkymachines]: Our latest post explores on-policy distillation, a training approach that combines the error-correcting relevance of RL with the reward density of SFT. When using it for math reasoning and for training an internal chat assistant, we find that on-policy distillation can outperform other approaches at a fraction of the cost. https://thinkingmachines.ai/blog/on-policy-distillation/
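
A minimal sketch of what "teacher as a process reward model" could look like, assuming the dense reward for each token is the teacher's log-probability minus the student's log-probability on tokens the student itself sampled (a single-sample estimate of negative reverse KL). The function name and the numbers are hypothetical and are not taken from the linked post.

```python
import math

def per_token_rewards(student_logprobs, teacher_logprobs):
    """Dense reward per sampled token: teacher log-prob minus student log-prob.

    Both lists score the SAME tokens, which were sampled from the student
    (on-policy). Negating a reward gives a one-sample estimate of the
    per-token reverse KL, KL(student || teacher).
    """
    if len(student_logprobs) != len(teacher_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return [t - s for s, t in zip(student_logprobs, teacher_logprobs)]

# Hypothetical probabilities for a 4-token rollout sampled from the student.
student_lp = [math.log(0.50), math.log(0.40), math.log(0.90), math.log(0.30)]
teacher_lp = [math.log(0.55), math.log(0.02), math.log(0.85), math.log(0.35)]

rewards = per_token_rewards(student_lp, teacher_lp)
# The token the teacher finds least plausible gets the largest penalty.
worst = min(range(len(rewards)), key=rewards.__getitem__)
print([round(r, 3) for r in rewards])
print("most penalized token index:", worst)
```

Every token receives its own signal, which is the "dense reward" part. Because the tokens come from the student's own rollout, the student is graded on states it actually visits rather than on teacher-written text.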

Epoch AI @EpochAIResearch · Oct 10

A healthy conversation about AI should be grounded in facts. Epoch’s datasets can help you track and understand the trajectory of AI. As a nonprofit, our work is freely accessible for anyone to read, replicate, and build upon. Our datasets:

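Summary: As a nonprofit, Epoch makes its AI datasets freely available to read, replicate, and build upon. The data is meant to ground discussion of AI in facts and to help track and understand the trajectory of the technology.

Epoch AI @EpochAIResearch · Oct 10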

We recently wrote that GPT-5 is likely the first mainline GPT release to be trained on less compute than its predecessor. How did we reach this conclusion, and what do we actually know about how GPT-5 was trained? 🧵

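Summary: GPT-5 is likely the first mainline GPT release trained on less compute than its predecessor. The thread explains how Epoch reached this conclusion and reviews what is known about how GPT-5 was trained.

Anthropic @AnthropicAI · Oct 10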

New research with the UK @AISecurityInst and the @turinginst: We found that just a few malicious documents can produce vulnerabilities in an LLM—regardless of the size of the model or its training data. Data-poisoning attacks might be more practical than previously believed.

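Summary: The joint study finds that a small number of malicious documents is enough to plant vulnerabilities in an LLM, regardless of model size or training data volume. This suggests the bar for data-poisoning attacks may be lower than previously thought, and the practical threat underestimated.

Lilian Weng @lilianweng · Oct 2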

GPUs are expensive and setting up the infrastructure to make GPUs work for you properly is complex, making experimentation on cutting-edge models challenging for researchers and ML practitioners. Providing high quality research tooling is one of the most effective ways to improve research productivity of the wider community and Tinker API is one step towards our mission there. Tinker API is built on top of our experimental results on fine-tuning with LoRA: https://thinkingmachines.ai/blog/lora/ Beta starts and you can join the waitlist today: https://thinkingmachines.ai/tinker/

Andrej Karpathy @karpathy · Oct 2

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea is sufficiently "bitter lesson pilled" (meaning arranged so that it benefits from added computation for free) as a proxy for whether it's going to work or worth even pursuing. The underlying assumption being that LLMs are of course highly "bitter lesson pilled" indeed, just look at LLM scaling laws where if you put compute on the x-axis, number go up and to the right. So it's amusing to see that Sutton, the author of the post, is not so sure that LLMs are "bitter lesson pilled" at all. They are trained on giant datasets of fundamentally human data, which is both 1) human generated and 2) finite. What do you do when you run out? How do you prevent a human bias? So there you have it, bitter lesson pilled LLM researchers taken down by the author of the bitter lesson - rough! In some sense, Dwarkesh (who represents the LLM researchers viewpoint in the pod) and Sutton are slightly speaking past each other because Sutton has a very different architecture in mind and LLMs break a lot of its principles. He calls himself a "classicist" and evokes the original concept of Alan Turing of building a "child machine" - a system capable of learning through experience by dynamically interacting with the world. There's no giant pretraining stage of imitating internet webpages. There's also no supervised finetuning, which he points out is absent in the animal kingdom (it's a subtle point but Sutton is right in the strong sense: animals may of course observe demonstrations, but their actions are not directly forced/"teleoperated" by other animals). 
Another important note he makes is that even if you just treat pretraining as an initialization of a prior before you finetune with reinforcement learning, Sutton sees the approach as tainted with human bias and fundamentally off course, a bit like when AlphaZero (which has never seen human games of Go) beats AlphaGo (which initializes from them). In Sutton's world view, all there is is an interaction with a world via reinforcement learning, where the reward functions are partially environment specific, but also intrinsically motivated, e.g. "fun", "curiosity", and related to the quality of the prediction in your world model. And the agent is always learning at test time by default, it's not trained once and then deployed thereafter. Overall, Sutton is a lot more interested in what we have in common with the animal kingdom instead of what differentiates us. "If we understood a squirrel, we'd be almost done". As for my take... First, I should say that I think Sutton was a great guest for the pod and I like that the AI field maintains entropy of thought and that not everyone is exploiting the next local iteration of LLMs. AI has gone through too many discrete transitions of the dominant approach to lose that. And I also think that his criticism of LLMs as not bitter lesson pilled is not inadequate. Frontier LLMs are now highly complex artifacts with a lot of humanness involved at all the stages - the foundation (the pretraining data) is all human text, the finetuning data is human and curated, the reinforcement learning environment mixture is tuned by human engineers. We do not in fact have an actual, single, clean, actually bitter lesson pilled, "turn the crank" algorithm that you could unleash upon the world and see it learn automatically from experience alone. Does such an algorithm even exist? Finding it would of course be a huge AI breakthrough. Two "example proofs" are commonly offered to argue that such a thing is possible.
The first example is the success of AlphaZero learning to play Go completely from scratch with no human supervision whatsoever. But the game of Go is clearly such a simple, closed, environment that it's difficult to see the analogous formulation in the messiness of reality. I love Go, but algorithmically and categorically, it is essentially a harder version of tic tac toe. The second example is that of animals, like squirrels. And here, personally, I am also quite hesitant whether it's appropriate because animals arise by a very different computational process and via different constraints than what we have practically available to us in the industry. Animal brains are nowhere near the blank slate they appear to be at birth. First, a lot of what is commonly attributed to "learning" is imo a lot more "maturation". And second, even that which clearly is "learning" and not maturation is a lot more "finetuning" on top of something clearly powerful and preexisting. Example. A baby zebra is born and within a few dozen minutes it can run around the savannah and follow its mother. This is a highly complex sensory-motor task and there is no way in my mind that this is achieved from scratch, tabula rasa. The brains of animals and the billions of parameters within have a powerful initialization encoded in the ATCGs of their DNA, trained via the "outer loop" optimization in the course of evolution. If the baby zebra spasmed its muscles around at random as a reinforcement learning policy would have you do at initialization, it wouldn't get very far at all. Similarly, our AIs now also have neural networks with billions of parameters. These parameters need their own rich, high information density supervision signal. We are not going to re-run evolution. But we do have mountains of internet documents. Yes it is basically supervised learning that is ~absent in the animal kingdom. 
But it is a way to practically gather enough soft constraints over billions of parameters, to try to get to a point where you're not starting from scratch. TLDR: Pretraining is our crappy evolution. It is one candidate solution to the cold start problem, to be followed later by finetuning on tasks that look more correct, e.g. within the reinforcement learning framework, as state of the art frontier LLM labs now do pervasively. I still think it is worth being inspired by animals. I think there are multiple powerful ideas that LLM agents are algorithmically missing that can still be adapted from animal intelligence. And I still think the bitter lesson is correct, but I see it more as something platonic to pursue, not necessarily to reach, in our real world and practically speaking. And I say both of these with double digit percent uncertainty and cheer the work of those who disagree, especially those a lot more ambitious bitter lesson wise. So that brings us to where we are. Stated plainly, today's frontier LLM research is not about building animals. It is about summoning ghosts. You can think of ghosts as a fundamentally different kind of point in the space of possible intelligences. They are muddled by humanity. Thoroughly engineered by it. They are these imperfect replicas, a kind of statistical distillation of humanity's documents with some sprinkle on top. They are not platonically bitter lesson pilled, but they are perhaps "practically" bitter lesson pilled, at least compared to a lot of what came before. It seems possible to me that over time, we can further finetune our ghosts more and more in the direction of animals; that it's not so much a fundamental incompatibility but a matter of initialization in the intelligence space. But it's also quite possible that they diverge even further and end up permanently different, un-animal-like, but still incredibly helpful and properly world-altering. It's possible that ghosts:animals :: planes:birds.
Anyway, in summary, overall and actionably, I think this pod is solid "real talk" from Sutton to the frontier LLM researchers, who might be gear shifted a little too much in the exploit mode. Probably we are still not sufficiently bitter lesson pilled and there is a very good chance of more powerful ideas and paradigms, other than exhaustive benchbuilding and benchmaxxing. And animals might be a good source of inspiration. Intrinsic motivation, fun, curiosity, empowerment, multi-agent self-play, culture. Use your imagination.

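Summary: Sutton (author of "The Bitter Lesson") argues on the podcast that LLMs are not truly "bitter lesson" products: they depend on finite human data and carry human bias. He holds that AI should learn like animals, by interacting dynamically with the world through RL rather than imitating human text. Karpathy agrees that LLMs are full of human intervention, but regards pretraining as a practical "substitute for evolution" that addresses the cold start problem, and considers pure RL hard to make work in the real world.

Epoch AI @EpochAIResearch · Sep 27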

Why did OpenAI train GPT-5 with less compute than GPT-4.5? Due to the higher returns to post-training, they scaled post-training as much as possible on a smaller model. And since post-training started from a much lower base, this meant a decrease in total training FLOP 🧵

Summary: OpenAI trained GPT-5 with less compute than GPT-4.5 because post-training offered higher returns, so it maximized post-training on a smaller base model, and total training FLOP fell rather than rose.

Lilian Weng @lilianweng · Sep 27

Looking through those little hidden gem stories in the footnote, you will find it so inspiring that researchers with interests in the same topic are able to work together to advance a field despite their roles and locations. This is the power of open science and community.

Andrej Karpathy @karpathy · Aug 29

Transforming human knowledge, sensors and actuators from human-first and human-legible to LLM-first and LLM-legible is a beautiful space with so much potential and so much can be done...
One example I'm obsessed with recently - for every textbook pdf/epub, there is a perfect "LLMification" of it intended not for human but for an LLM (though it is a non-trivial transformation that would need human in the loop involvement).
- All of the exposition is extracted into a markdown document, including all latex, styling (bold/italic), tables, lists, etc. All of the figures are extracted as images.
- All worked problems get extracted into SFT examples. Any references made to previous figures/tables/etc. are parsed and included.
- All practice problems are extracted into environment examples for RL. The correct answers are located in the answer key and attached. Any additional information is added as "answer key" for a potential LLM judge.
- Synthetic data expansion. For every specific problem, you can create an infinite problem generator, which emits problems of that type. For example, if a problem is "What is the angle between the hour and minute hands at 9am?", you can imagine generalizing that to any arbitrary time and calculating answers using Python code, and possibly generating synthetic variations of the prompt text.
- All of the data above could be nicely indexed and embedded into a RAG database for later reference, or maybe MCP servers that make it available.
Then just as a (human) student could take a high school physics course, an LLM could take it in the exact same way. This would be a significantly richer source of legible, workable information for an LLM than just something like pdf-to-text (current prevailing practice), which simply asks the LLM to predict the textbook content top to bottom token by token (umm - lame).
As just a quick and crappy example of synthetic variations of the above example, GPT-5 gave me this problem generator (see image), which can now generalize that problem template to many variations:
- When the time is 11:07 a.m., what is the degree measure of the angle between the hands? (Answer: 68)
- Determine the angle in degrees between the clock’s hands at 4:14 a.m. (Answer: 43)
- What angle do the clock hands form when the time reads 11:47 a.m.? (Answer: 71)
- At 7:02 a.m., what angle separates the hour hand and the minute hand? (Answer: 161)
- At 4:14 a.m., calculate the angle made between the two hands. (Answer: 43)
- What angle is formed by the hands of a clock at 4:45 p.m.? (Answer: 127)
- What is the angle between the hour and minute hands at 8:37 p.m.? (Answer: 36)
(infinite practice problems can be created...)

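Summary: Knowledge carriers such as textbooks should be converted from human-readable formats into LLM-optimized ones: extract the exposition into structured markdown, turn worked examples into SFT training data, turn practice problems into RL environments with the answers attached as grading criteria, and expand the data without limit through synthetic generation (for example, generalizing the clock-angle problem into an automatic problem generator for any time). The result is served through RAG or MCP so that an LLM can study systematically like a student, far more effectively than plain PDF-to-text.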
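
The generator in the post is only shown as an image, so the following is a hedged reconstruction of the idea, not the code GPT-5 produced. The function names and prompt templates are hypothetical. It returns exact angles, which include a half degree at odd minutes (68.5 at 11:07); the quoted answers 68, 71, 127 and 36 match the exact values rounded down.

```python
import random

def clock_angle(hour, minute):
    """Smaller angle in degrees between the hour and minute hands."""
    hour_deg = (hour % 12) * 30 + minute * 0.5  # 30 deg per hour plus 0.5 deg per minute
    minute_deg = minute * 6                      # 6 deg per minute
    diff = abs(hour_deg - minute_deg)
    return min(diff, 360 - diff)

# Hypothetical prompt templates; a real pipeline might paraphrase these with an LLM.
TEMPLATES = [
    "What is the angle between the hour and minute hands at {t}?",
    "At {t}, what angle separates the hour hand and the minute hand?",
    "What angle do the clock hands form when the time reads {t}?",
]

def generate_problem(rng):
    """Emit one prompt/answer pair for a random time on a 12-hour clock."""
    hour = rng.randint(1, 12)
    minute = rng.randint(0, 59)
    suffix = rng.choice(["a.m.", "p.m."])
    t = f"{hour}:{minute:02d} {suffix}"
    return {"prompt": rng.choice(TEMPLATES).format(t=t),
            "answer": clock_angle(hour, minute)}

rng = random.Random(0)  # seeded so a run is reproducible
for _ in range(3):
    print(generate_problem(rng))
```

Each emitted pair could serve as an RL environment example, with the computed answer acting as the answer key.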

Yann LeCun @ylecun · Aug 24

My previous meeting room at Meta was named after the title of this paper.

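Context: The meeting room at Meta was named after "Optimal Brain Damage", the classic 1989 paper by Yann LeCun and co-authors. It is one of the earliest neural network pruning techniques: it uses second derivatives of the loss function to remove the weights that have the least influence on the output, compressing the network.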
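
As a rough illustration of the second-derivative criterion, here is a sketch using the Optimal Brain Damage saliency s_k = 0.5 * h_kk * w_k^2, where h_kk is the diagonal second derivative of the loss with respect to weight w_k. The weights and curvature values are made up, and a real implementation would estimate the diagonal Hessian by backpropagation and retrain after pruning.

```python
def obd_saliencies(weights, hessian_diag):
    """Saliency s_k = 0.5 * h_kk * w_k**2 under a diagonal-Hessian approximation."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]

def prune_lowest(weights, hessian_diag, n_prune):
    """Zero out the n_prune weights with the smallest saliency."""
    s = obd_saliencies(weights, hessian_diag)
    order = sorted(range(len(weights)), key=s.__getitem__)
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# Hypothetical weights and diagonal second derivatives of the loss.
w = [0.8, -0.05, 1.5, 0.3]
h = [0.1, 4.0, 0.01, 2.0]

print(obd_saliencies(w, h))
print(prune_lowest(w, h, 2))
```

In this toy case the largest weight (1.5) is pruned because the loss is nearly flat along it, which is where the criterion departs from plain magnitude pruning.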

Jim Fan @DrJimFan · Aug 7

Would love to see the FSD Scaling Law, as it’s the only physical data flywheel at planetary scale. What’s the “emergent ability threshold” for model/data size?

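Summary: Asks about an FSD scaling law and its emergent ability threshold, calling FSD the only physical data flywheel at planetary scale. Tesla is training a new FSD model with roughly 10x the parameters and greatly improved video compression loss; if all goes well, it will be released at the end of next month.

Jim Fan @DrJimFan · Aug 5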

Evaluation is the hardest problem for physical AI systems: do you crash test cars every time you debug a new FSD build? Traditional game engine (sim 1.0) is an alternative, but it's not possible to hard-code all edge cases. A neural net-based sim 2.0 is purely programmed by data, grows more capable with data, and scales as the fleet data flywheel scales.

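Summary: Physical AI cannot be evaluated by crash-testing real cars, and a traditional game engine (sim 1.0) cannot cover every edge case. A neural-network-based sim 2.0 is driven by data and scales with the fleet. Tesla has used it for years to generate training data for rare, dangerous scenarios such as near head-on collisions, supplementing extreme cases that are hard to collect even from 8 million real vehicles.

Jim Fan @DrJimFan · Aug 5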

No em dash should be baked into pretraining, post-training, alignment, system prompt, and every nook and cranny in an LLM’s lifecycle. It needs to be hardwired into the kernel, identity, and very being of the model.

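Summary: A tongue-in-cheek demand that a "no em dash" rule be baked into every stage of an LLM's lifecycle (pretraining, post-training, alignment, system prompt) and hardwired into the model's kernel and identity.

Saining Xie @sainingxie · Jul 31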

TheRightWay™ is my favorite brand now.

Yann LeCun @ylecun · Jul 12

The optimal batch size is 1 (For suitable definitions of "optimal")

Context: Micah Goldblum points out that vanilla SGD with batch size 1 and no momentum (the first optimizer taught in introductory ML) is nearly as fast per FLOP as AdamW in LLM pretraining.
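
For readers who want to see the optimizer being discussed, here is a toy sketch of plain SGD with batch size 1 and no momentum, fitting a one-dimensional linear model. It only illustrates the update rule; it says nothing about the LLM pretraining claim, and the data, learning rate, and epoch count are arbitrary choices.

```python
import random

def sgd_batch_size_one(data, lr=0.05, epochs=50, seed=0):
    """Fit y = w*x + b with plain SGD: one example per update, no momentum."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled in place
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            err = (w * x + b) - y  # derivative of 0.5 * err**2 w.r.t. the prediction
            w -= lr * err * x
            b -= lr * err
    return w, b

# Noise-free toy data drawn from y = 2x + 1 (hypothetical example).
points = [(x / 10, 2 * x / 10 + 1) for x in range(-10, 11)]
w, b = sgd_batch_size_one(points)
print(round(w, 3), round(b, 3))
```

There is no optimizer state at all here: no momentum buffer and no second-moment estimate, unlike AdamW.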

Saining Xie @sainingxie · Jul 11

The three biggest hps for stable training in everything are lr, bs, and beta2. We’ve built up good intuitions on how to tune them over time, but this lays it all out analytically and convincingly. this is definitely my new handbook for training big models on small gpus.

Saining Xie @sainingxie · Jun 28

metaquery is now open-source — with both the data and code available.

Yann LeCun @ylecun · Jun 22

Awesome new dataset from @SandboxAQ

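Context: SandboxAQ has open-sourced the SAIR dataset, which contains more than 5 million protein-ligand 3D structures with binding affinity labels, currently the largest open-source binding affinity dataset. Built on NVIDIA DGX Cloud and now publicly available on Google Cloud, it is intended to provide training and evaluation data for drug discovery AI models.

Yann LeCun @ylecun · Jun 20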

Awesome new dataset from @SandboxAQ

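Context: SandboxAQ has released the open-source dataset SAIR (Structurally Augmented IC50 Repository), which contains more than 5 million co-folded protein-ligand 3D structures with binding affinity data, currently the largest open-source binding affinity dataset. The data was generated with large quantitative models and is intended to give drug discovery AI models high-quality training data, bridging the gap between molecular structure and potency prediction. The dataset was built on NVIDIA DGX Cloud and is now publicly released on Google Cloud Platform for researchers worldwide to download and use.

Lilian Weng @lilianweng · May 25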

Probably the first product Thinky will build is a full panel of dials that researchers can use to physically adjust all the hparams during training. We gonna do hardware one day and it is the time 😂

Lilian Weng @lilianweng · May 13

When a new dataset comes out, I get excited and check it out and then only realize that this is another meta-mixed dataset combining a collection of other existing datasets. My brain immediately acts like "oh fork ... contamination!" No meta-meta-mixed dataset plzzzz :lolsob:

Saining Xie @sainingxie · May 4

Wow, Deeply Supervised Nets received the Test of Time award at @aistats_conf 2025! It was the very first paper I submitted during my PhD. Fun fact: the paper was originally rejected by NeurIPS with scores of 8/8/7 (yes, that pain stuck with me... maybe now I can finally let it go😅). I wouldn’t call conferences a lottery, but a bit of perseverance does go a long way. Students: if you're feeling disheartened after recent paper decisions and gearing up for the next one, I hope this is a small reminder to keep going.

Saining Xie @sainingxie · Apr 24

Recently open-sourced projects from @TongPetersb, @DavidJFan, and the team at Meta FAIR. MetaMorph (training code and model weights): https://github.com/facebookresearch/metamorph/ Web-SSL (model weights for Web-DINO and Web-MAE): https://github.com/facebookresearch/webssl FAIR's still leading the way in open research.

DeepSeek @deepseek_ai · Feb 28

🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
⚡ 40+ GiB/s peak throughput per client node for KVCache lookup
🧬 Disaggregated architecture with strong consistency semantics
✅ Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1
📥 3FS → https://github.com/deepseek-ai/3FS
⛲ Smallpond - data processing framework on 3FS → https://github.com/deepseek-ai/smallpond
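
To put the aggregate figures on a per-node scale, here is a small back-of-the-envelope calculation. It assumes the load is spread evenly across the node counts quoted in the post, which the post itself does not state, so the results are rough averages rather than measured per-node numbers.

```python
GIB_PER_TIB = 1024

def per_node_gib_per_s(aggregate_tib_per_s, nodes):
    """Average per-node throughput in GiB/s, assuming an even split across nodes."""
    return aggregate_tib_per_s * GIB_PER_TIB / nodes

# 6.6 TiB/s aggregate read throughput on a 180-node cluster.
read_per_node = per_node_gib_per_s(6.6, 180)
# 3.66 TiB/min on GraySort with 25 nodes; convert minutes to seconds first.
sort_per_node = per_node_gib_per_s(3.66 / 60, 25)

print(f"{read_per_node:.1f} GiB/s per node (aggregate read)")
print(f"{sort_per_node:.2f} GiB/s per node (GraySort)")
```

Under that assumption the read benchmark averages roughly 37.5 GiB/s per node, the same order of magnitude as the 40+ GiB/s peak quoted for KVCache lookups per client node.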

Summary: DeepSeek has released 3FS (Fire-Flyer File System), an open-source parallel file system optimized for modern SSDs and RDMA networks. It reaches 6.6 TiB/s aggregate read throughput on a 180-node cluster, 3.66 TiB/min on GraySort with 25 nodes, and over 40 GiB/s peak per-node KVCache lookup throughput. It uses a disaggregated architecture with strong consistency semantics and supports training data preprocessing, checkpoint saving and reloading, and KVCache lookups for V3/R1 inference. The Smallpond data processing framework was open-sourced alongside it.

Lilian Weng @lilianweng · Dec 2

🦃 At the end of Thanksgiving holidays, I finally finished the piece on reward hacking. Not an easy one to write, phew. Reward hacking occurs when an RL agent exploits flaws in the reward function or env to maximize rewards without learning the intended behavior. This is imo a major blocker for real-world deployment of more autonomous use cases of AI models. Also would like to call out more research on mitigation strategies for reward hacking, especially in the context of LLMs and RLHF. 👉https://lilianweng.github.io/posts/2024-11-28-reward-hacking/
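
A toy illustration of that definition, with a made-up coding task graded by a flawed reward (fraction of unit tests passing). The action names and reward values are hypothetical and are not drawn from the linked post. The point is only that a reward-maximizing policy prefers whichever action scores highest, whether or not it reflects the intended behavior.

```python
# Hypothetical actions for a coding task whose reward is "fraction of tests passing".
ACTIONS = {
    "implement_function":    {"proxy_reward": 0.9, "intended": True},
    "hardcode_test_outputs": {"proxy_reward": 1.0, "intended": False},
    "delete_failing_tests":  {"proxy_reward": 1.0, "intended": False},
}

def greedy_policy(actions):
    """Pick the action with the highest proxy reward (ties go to the first name alphabetically)."""
    return max(sorted(actions), key=lambda a: actions[a]["proxy_reward"])

def is_reward_hack(actions, chosen):
    """True if the choice attains the top reward without the intended behavior."""
    best = max(v["proxy_reward"] for v in actions.values())
    return actions[chosen]["proxy_reward"] == best and not actions[chosen]["intended"]

choice = greedy_policy(ACTIONS)
print(choice, is_reward_hack(ACTIONS, choice))
```

The honest solution scores slightly lower than the exploits here, so nothing in the reward signal alone steers the agent toward it.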
