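Artificial Analysis has released the Controlled Voice Arena, which uses voice cloning to standardize 8 voices (2 US male, 2 US female, 2 UK male, 2 UK female) and evaluates TTS models on audio quality, pronunciation, pacing and tone, separating voice preference from model quality. Every model clones each voice from the same 1-2 minute recordings. Voting is open, and the first leaderboards will be published this week.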
Announcing the Artificial Analysis Controlled Voice Arena - compare Text to Speech models on the same set of 8 cloned voices
The Controlled Voice Arena standardizes, through voice cloning, the set of voices that each model's performance is evaluated on - separating specific voice preference from broader aspects of model quality, e.g., audio quality, pronunciation, pacing and tone. It complements our Provider Voice Arena, where each model uses a select set of its own available voices.
We have generated speech samples with models that offer voice cloning, using the same voice categories as our existing Provider Voice Arena: 2 US Male voices, 2 US Female voices, 2 UK Male voices and 2 UK Female voices. Every model cloned each voice from the same 1-2 minute audio recordings.