Nathan Lambert (@natolambert)

2026-06-23 23:14

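AI summary

Nathan Lambert has released a new lecture for his book, bringing his post-training video series to 7.4 hours. Nominally about synthetic data, it is mostly a systematic walkthrough of the knowledge distillation literature, from the Hinton 2015 paper to today's mainstream on-policy distillation (OPD, MOPD, OPSD). He focuses on the 3-4 core mathematical changes needed to make on-policy distillation practical. The lecture also reviews how synthetic data gradually took over post-training data research and introduces popular methods such as Constitutional AI and rubrics. Chapter timestamps are provided, from 00:00 to the final chapter at 45:50.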

New lecture for the book! Nominally about synthetic data, but mostly a walkthrough of the distillation literature, from the Hinton 2015 paper to the multi-teacher on-policy distillation of today!

At 7.4 hours of video in my post-training brain dump and counting :)

It was fun to stare at the math long enough and talk through the 3-4 core changes that needed to be made to the original formulation for on-policy distillation to be ready for the mainstream, as it is today (and in RL frameworks).

Otherwise, I include a bit of a history lesson on how synthetic data slowly took over all post-training data research (it wasn't always the case)! Then I do some 101 review of constitutional AI, rubrics, and other popular methods.

00:00 The emergence of synthetic data
10:50 Background on teacher-student knowledge distillation
24:47 On-policy distillation (OPD, MOPD, and OPSD)
37:11 Constitutional AI & AI Feedback
45:50 Rubrics as rewards & conclusions

Ofc, watch on YouTube etc.

Tags: Safety/Alignment, Tutorial/Practice, Data/Training
