TestingCatalog News 🗞 @testingcatalog

2026-05-06 00:48 · 58 days ago

AI Summary

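Inworld AI has released Realtime TTS-2, a new-generation voice model for real-time conversation. Its core advance is that it processes the full audio context of a multi-turn conversation before speaking, so it can adapt to the conversational situation in real time the way a person would. Key features: a single voice identity that supports more than 100 languages, time-to-first-audio under 200 ms, and voice style that can be adjusted through natural-language instructions, with no preset emotion tags required. This marks the first time voice AI can "listen" to the overall mood of a conversation rather than only its literal content, and the architecture is designed to deliver conversation that is both natural-sounding and context-aware.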

Inworld AI released Realtime TTS-2, a text-to-speech model that processes the full audio context of multi-turn exchanges before it speaks, adapting to the moment the way a person would.

One voice identity across 100+ languages.

Sub-200ms time-to-first-audio.

Natural-language voice direction, no emotion tag presets.

AI that hears how you sound, not only what you say, is now a real architecture decision.

Inworld AI: Introducing Realtime TTS-2, a new generation of voice model built for realtime conversation. It is the first voice model that hears the conversation, takes natu...

Product Update · Voice
