# Cartesia's Sonic-3.5 Speech Synthesis Model Takes First Place on AI Leaderboard

- Source: Artificial Analysis (@ArtificialAnlys)
- Published: 2026-05-23 01:36
- AIHOT score: 61
- AIHOT link: https://aihot.virxact.com/items/cmph8d0ng0m78sljwt23xs7tg
- Original link: https://x.com/ArtificialAnlys/status/2057878247782908109

## AI Summary

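Sonic-3.5, the latest speech synthesis model from Cartesia, ranks first on the Artificial Analysis Speech Arena leaderboard, ahead of competitors including Inworld Realtime TTS 1.5 Max and Google's Gemini 3.1 Flash TTS. The model supports 42 languages (including 9 Indian languages) and offers more than 500 voices. Sonic-3.5 leads with an Elo score of 1,218, and it is noted for natural-sounding speech and accurate transcript following. It is priced at $39 per million characters, higher than both competitors; its generation speed of 105.5 characters per second falls between the other two.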

## Full Text

Cartesia's Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google's Gemini 3.1 Flash TTS

Sonic-3.5 is the latest TTS model from @cartesia. It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. Voters in the TTS Arena have strongly preferred the model, which stands out for its naturalness and accurate transcript following.

Key takeaways:
➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209

➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters

➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS
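
To put the three takeaways side by side, here is a minimal Python sketch that is not part of the original post. It assumes the arena scores behave like conventional logistic Elo ratings on a 400-point scale (the post does not state the exact rating method), and it treats characters per second as sustained throughput. The names `TtsModel`, `expected_preference`, `cost_usd`, and `seconds_to_generate` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsModel:
    name: str
    elo: float                    # arena Elo score quoted in the post
    usd_per_million_chars: float  # list price quoted in the post
    chars_per_second: float       # generation speed quoted in the post


# Figures copied from the key takeaways above.
MODELS = [
    TtsModel("Sonic-3.5", 1218, 39.0, 105.5),
    TtsModel("Inworld Realtime TTS 1.5 Max", 1194, 35.0, 205.0),
    TtsModel("Gemini 3.1 Flash TTS", 1209, 18.3, 26.3),
]


def expected_preference(elo_a: float, elo_b: float) -> float:
    """Expected share of head-to-head votes won by A over B.

    ASSUMPTION: standard logistic Elo with a 400-point scale; the
    leaderboard's actual rating model may differ.
    """
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


def cost_usd(model: TtsModel, num_chars: int) -> float:
    """Price of synthesizing num_chars characters at the quoted rate."""
    return model.usd_per_million_chars * num_chars / 1_000_000


def seconds_to_generate(model: TtsModel, num_chars: int) -> float:
    """Generation time, ASSUMING the quoted speed is sustained throughput."""
    return num_chars / model.chars_per_second


if __name__ == "__main__":
    sonic = MODELS[0]
    num_chars = 10_000  # hypothetical workload size
    for model in MODELS:
        print(
            f"{model.name}: "
            f"${cost_usd(model, num_chars):.3f}, "
            f"{seconds_to_generate(model, num_chars):.1f} s, "
            f"Sonic-3.5 preferred {expected_preference(sonic.elo, model.elo):.1%}"
        )
```

Under the stated Elo assumption, Sonic-3.5's 24-point lead over Inworld Realtime TTS 1.5 Max corresponds to an expected preference rate of roughly 53%, so the ranking reflects a modest edge rather than a decisive one.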

