As AI takes on longer, higher-stakes tasks, we want models to carry beneficial and safe behavior into new domains beyond their training, and maintain it under pressure.
That's the idea behind our new research on training models to be broadly and persistently beneficial. https://alignment.openai.com/beneficial-rl/