OpenBMB发布开源TTS模型VoxCPM 2,仅2B参数支持30种语言,无需语言标签即可生成语音。Apache-2.0许可,8GB显存可运行。支持文本描述创建新声音、可控克隆与终极克隆,保留说话人细节。输出48kHz音质,RTX 4090实时推理达0.3 RTF。兼容PyTorch、LoRA微调及Nano-VLLM部署,适用于影视、游戏、有声书等专业场景。
VoxCPM 2 just dropped by @OpenBMB
Only 2B-param open-source TTS (Text-to-Speech) model built for production-grade multilingual voice work.
Apache-2.0 license, Can run on only 8GB VRAM.
• Eliminates the "robotic" feel of traditional TTS, delivering prosody and emotional depth suitable for high-stakes professional environments like filmmaking, gaming, animation, and audiobooks.
• 30-language multilingual: no language tag needed, just type in a supported language and generate directly.
• Voice design: create a brand-new voice from a text description alone, like age, tone, pace, or emotion. No reference audio required. Describe the desired voice characteristics (gender, age, tone, emotion, pace …) in Control Instruction, and VoxCPM2 will craft a unique voice from your description alone.