Nathan Lambert (@natolambert)

2026-07-01 07:24 · 1 day ago

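AI summary

Pleased to announce that @zafstojano, a newly added maintainer who helps me maintain the RLHF Book code, has added a simple on-policy self-distillation example to the codebase that runs on some toy problems. Looking forward to exploring it further, and glad to see the repository keep improving!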

Happy to say @zafstojano - an added maintainer who helps me with the RLHF Book code - added a simple on-policy self-distillation example to the codebase, which can work on some toy problems.

Excited to dig into this more, happy to see the repo fleshed out!

Tags: Safety/Alignment · Open Source/Repositories · Data/Training
