# Adding an on policy distillation section to the RLHF book and it's remarkable how bad LLMs / coding …

- Source: Nathan Lambert (@natolambert)
- Published: 2026-05-06 07:28
- AIHOT score: 43
- AIHOT link: https://aihot.virxact.com/items/cmot9ly2r03h8slv7suru74yg
- Original link: https://x.com/natolambert/status/2051806169182916857

## AI Summary

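Nathan Lambert is adding a section on on-policy distillation to the RLHF book, and notes that LLMs and coding agents are remarkably bad at the task, even though he gave them the core papers and 250 pages of context on how he presents ideas.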

## Full Text

Adding an on policy distillation section to the RLHF book and it's remarkable how bad LLMs / coding agents are at it, despite me giving them the core papers and 250 pages of context on how I present ideas.
