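AI Summary
The main tweet argues that too few people study SFT methods, even though SFT is a key foundation of post-training and its empirical literature is limited. The quoted tweet introduces a systematic study. The team, which fine-tunes a large number of customer models, varied one SFT lever at a time across dense and MoE model families (up to 235B parameters), using 4 real customer datasets. Each dataset comes with an evaluation built over several weeks in collaboration with the customer, and the training outputs were generated specifically to pass that evaluation, so the supervision target matches the downstream metric and a common confounder is removed. The study aims to distill best practices for fine-tuning.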
Not enough people studying SFT methods. It's a foundation of post training with limited literature that seems very serious in an empirical sense.
1/ We fine-tune a lot of customer models, so we decided to systematically try and figure out some best practices for finetuning. SFT isn't sexy, but it's still ...