gpt-realtime-2 is a great voice model (with a typically bad OpenAI name). Voice models process speech natively rather than transcribing it first, so the intelligence of the model matters. The old voice model was GPT-4o level; this one is much smarter (how much smarter? OpenAI gave no benchmarks).