# Microsoft releases MAI-Transcribe-1 speech transcription model with a 3.0% word error rate

- Source: Artificial Analysis (@ArtificialAnlys)
- Published: 2026-04-03 08:29
- AIHOT score: 56
- AIHOT link: https://aihot.virxact.com/items/cmnw1ypby00oaslc3pwavtc2v
- Original link: https://x.com/ArtificialAnlys/status/2039862705096659050

## AI Summary

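Microsoft AI's Superintelligence team has released MAI-Transcribe-1, a speech transcription model. It achieves a 3.0% word error rate on the AA-WER metric of the Artificial Analysis Speech to Text leaderboard, ranking fourth behind Mistral Voxtral Small, Google Gemini 3.1 Pro High, and ElevenLabs Scribe v2. It processes audio at roughly 69x real-time, which places it among the fast, high-accuracy models. The model supports 25 languages, including English, French, Arabic, Japanese, and Chinese, and its API is currently available in public preview via Azure Speech on Microsoft Foundry.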

## Body

Microsoft has released MAI-Transcribe-1, a speech transcription model that achieves 3.0% on AA-WER (#4) and runs fast at 69x real-time.

The model was developed by Microsoft AI (MAI)'s Superintelligence team and supports 25 languages including English, French, Arabic, Japanese, and Chinese. The MAI-Transcribe-1 API is currently available in public preview via Azure Speech on Microsoft Foundry.

On the Artificial Analysis Speech to Text (STT) leaderboard, MAI-Transcribe-1 achieves a 3.0% word error rate on AA-WER for speech transcription accuracy, positioning it 4th overall behind Mistral's Voxtral Small (2.9% AA-WER), Google's Gemini 3.1 Pro High (2.9% AA-WER) and ElevenLabs' Scribe v2 (2.3% AA-WER). It also stands out as one of the faster high-accuracy transcription models available, processing audio at ~69x real-time.
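
For readers unfamiliar with the two figures quoted above, here is a minimal Python sketch of what they mean. It assumes the textbook definition of word error rate (word-level edit distance divided by the number of reference words) and reads "69x real-time" as audio duration divided by processing time. The post does not describe how AA-WER is computed, so the text normalization and datasets behind that metric may differ. The function names `word_error_rate` and `processing_seconds` are hypothetical and are not part of any Microsoft or Artificial Analysis API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: (substitutions + deletions + insertions) / reference words.

    Assumption: simple lowercasing and whitespace tokenization. Real
    benchmarks usually apply heavier text normalization first.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words consumed so
    # far and the first j hypothesis words (initially: zero reference words).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def processing_seconds(audio_seconds: float, speed_factor: float) -> float:
    """Time needed to transcribe audio at a given multiple of real-time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_seconds / speed_factor


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(word_error_rate("the quick brown fox", "the quick brown dog"))
    # One hour of audio at the quoted ~69x real-time.
    print(round(processing_seconds(3600, 69), 1))
```

Under these assumptions, a 3.0% word error rate corresponds to roughly three word-level errors per 100 reference words, and one hour of audio at ~69x real-time would take about 52 seconds to transcribe.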

