Epoch AI @EpochAIResearch · X

2025-09-27 04:35 · 279 days ago

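AI summary

OpenAI trained GPT-5 with less compute than GPT-4.5. Because post-training offered higher returns, it scaled post-training as far as possible on a smaller base model, so total training FLOP fell rather than rose.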

Why did OpenAI train GPT-5 with less compute than GPT-4.5?

Due to the higher returns to post-training, they scaled post-training as much as possible on a smaller model.

And since post-training started from a much lower base, this meant a decrease in total training FLOP 🧵
