Happy to say @zafstojano, a new maintainer who helps me with the RLHF Book code, added a simple on-policy self-distillation example to the codebase that works on some toy problems.
Excited to dig into this more, happy to see the repo fleshed out!