# Inworld AI Releases Realtime TTS-2, a Next-Generation Realtime Conversational Voice Model

- Source: TestingCatalog News 🗞 (@testingcatalog)
- Published: 2026-05-06 00:48
- AIHOT score: 69
- AIHOT link: https://aihot.virxact.com/items/cmosvjto400ntsldm6am58yk4
- Original link: https://x.com/testingcatalog/status/2051705563403198511

## AI Summary

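Inworld AI has released Realtime TTS-2, a new generation of realtime conversational voice model. The model's core advance is that it processes the full audio context of a multi-turn conversation before it speaks, so it can adapt to the conversational situation in real time the way a person would. Key features include a single voice identity across more than 100 languages, time-to-first-audio under 200 ms, and voice style adjustment through natural-language direction, with no preset emotion tags required. This marks the first time voice AI has been able to "hear" the overall mood of a conversation rather than only its literal content, and the architecture is designed to deliver conversation that is both natural-sounding and context-aware.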

## Body

Inworld AI released Realtime TTS-2, a text-to-speech model that processes the full audio context of multi-turn exchanges before it speaks, adapting to the moment the way a person would.

> One voice identity across 100+ languages.

> Sub-200ms time-to-first-audio.

> Natural-language voice direction, no emotion tag presets.

AI that hears how you sound, not only what you say, is now a real architecture decision.

### Quoted Tweet

> Inworld AI: Introducing Realtime TTS-2, a new generation of voice model built for realtime conversation. It is the first voice model that hears the conversation, takes natu...
