Cartesia公司最新发布的语音合成模型Sonic-3.5在Artificial Analysis Speech Arena排行榜上位居第一,超越了Inworld Realtime TTS 1.5 Max和Google Gemini 3.1 Flash TTS等竞品。该模型支持42种语言(包括9种印度语言),提供超过500种声音选择。评测数据显示,Sonic-3.5以1,218的Elo分数领先,表现出自然的语音效果和准确的文本跟随能力。其定价为每百万字符39美元,高于竞品;生成速度为每秒105.5字符,介于其他两者之间。
Cartesia's Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google's Gemini 3.1 Flash TTS
Sonic-3.5 is the latest TTS model from @cartesia . It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.
Key takeaways: ➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209