New Anthropic Fellows research: developing an Automated Alignment Researcher.
We ran an experiment to learn whether Claude Opus 4.6 could accelerate research on a key alignment problem: using a weak AI model to supervise the training of a stronger one.
https://www.anthropic.com/research/automated-alignment-researchers