Rohan Paul (@rohanpaul_ai)

2026-06-27 05:05

A huge 750 tokens/sec for GPT 5.6 Sol.

The current GPT-5.5 priority and scale-tier service advertises 99% >50 tokens/sec, so Sol on Cerebras is claiming up to 15x that rate.
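A quick check of that arithmetic, as a minimal sketch: it takes the two figures quoted above at face value, comparing the "up to" 750 tokens/sec peak against the 50 tokens/sec floor, so 15x is a best-case ratio. The constant and function names are illustrative rather than taken from any API, and the 1,500-token response length is a hypothetical chosen only to show what the gap could mean for streaming time.

```python
# Figures quoted in the post; the names are illustrative, not from any API.
SOL_PEAK_TPS = 750     # "up to" rate claimed for GPT 5.6 Sol on Cerebras
GPT55_FLOOR_TPS = 50   # floor advertised for the GPT-5.5 priority/scale tier


def speedup(fast_tps: float, slow_tps: float) -> float:
    """Ratio of two decode rates, both in tokens per second."""
    return fast_tps / slow_tps


def decode_seconds(num_tokens: int, tps: float) -> float:
    """Seconds to stream num_tokens at a steady rate (ignores time to first token)."""
    return num_tokens / tps


print(speedup(SOL_PEAK_TPS, GPT55_FLOOR_TPS))  # 15.0

# Hypothetical 1,500-token response, assumed only for illustration.
RESPONSE_TOKENS = 1500
print(decode_seconds(RESPONSE_TOKENS, SOL_PEAK_TPS))     # 2.0
print(decode_seconds(RESPONSE_TOKENS, GPT55_FLOOR_TPS))  # 30.0
```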

This huge number comes from specialized inference hardware: Sol is being served on Cerebras, whose wafer-scale chip is designed to move model data with far less memory and networking delay than a normal multi-GPU setup.