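Inworld, ElevenLabs, and MiniMax continue to lead the TTS leaderboard, with models released this year taking four of the top five spots. Current leading models are markedly more realistic on simple text, and differences in user preference mainly come down to choice of voice style. The evaluation methodology now includes stronger bot vote filtering and adds rank ranges based on 95% confidence intervals. On specific metrics, Inworld TTS 1.5 Max ranks first with an Elo score of 1,238, Kokoro 82M v1.0 is the lowest-priced option at $0.65 per million characters, and WaveNet leads batch processing speed at 419 characters per second.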
Inworld, ElevenLabs, and MiniMax continue to lead our Text to Speech leaderboard for most preferred models
Recent checkpoints from each of the labs continue to push the frontier of TTS quality, with 4 of the top 5 models released this year. Leading TTS models are becoming more realistic, particularly on relatively straightforward text, and preference differences increasingly come down to affinity for particular voices.
Latest results also reflect stronger bot vote filtering, confirmed via triangulation against third-party evaluators. We've also added rank ranges based on each model's 95% confidence interval, showing where a model could land based on its Elo score range.
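
To make the rank ranges concrete, here is a minimal sketch of one way such a range could be derived from Elo confidence intervals. It assumes a model's best possible rank is set by how many other models have a lower bound above its upper bound, and its worst possible rank by how many other models have an upper bound above its lower bound. This rule, the `rank_ranges` function name, and the model names and numbers are illustrative assumptions, not the leaderboard's actual method or data.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound of a 95% CI on Elo


def rank_ranges(intervals: Dict[str, Interval]) -> Dict[str, Tuple[int, int]]:
    """Return (best_rank, worst_rank) per model under an assumed overlap rule."""
    ranges = {}
    for name, (low, high) in intervals.items():
        others = [iv for other, iv in intervals.items() if other != name]
        # Models that are certainly ahead: their lower bound beats our upper bound.
        best = 1 + sum(1 for other_low, _ in others if other_low > high)
        # Models that could be ahead: their upper bound beats our lower bound.
        worst = 1 + sum(1 for _, other_high in others if other_high > low)
        ranges[name] = (best, worst)
    return ranges


# Hypothetical intervals for illustration only (not leaderboard results).
example = {
    "model_a": (1220.0, 1250.0),
    "model_b": (1210.0, 1235.0),
    "model_c": (1150.0, 1180.0),
}

for model, (best, worst) in rank_ranges(example).items():
    print(f"{model}: rank {best}-{worst}")
```

In this toy example the intervals for `model_a` and `model_b` overlap, so both get a range of 1 to 2, while `model_c` sits clearly below both and is fixed at rank 3.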