# MTP Speeds Up Qwen Models by Up to 2.5x on Atomic Chat

- Source: 🚨 AI News | TestingCatalog (@testingcatalog)
- Published: 2026-05-21 16:04
- AIHOT score: 74
- AIHOT link: https://aihot.virxact.com/items/cmpf8bgk5048jsljwbajpqkj4
- Original link: https://x.com/testingcatalog/status/2057371971118154173

## AI Summary

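The new MTP technique drafts several tokens ahead and verifies them in a single pass, making Qwen 3.6 models run up to 2.5x faster in Atomic Chat. The gain is large for dense models (e.g., Qwen 3.6 27B), where throughput rises from 51 to 117 tokens/s, and smaller for MoE models (e.g., Qwen 3.6 35B-A3B), at about 25%. MTP reaches a draft acceptance rate of roughly 80% with no loss of accuracy and needs only about 1 GB of additional VRAM. Users can test the models locally with the open-source Atomic Chat app.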
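As a rough illustration of what a draft acceptance rate of about 80% can mean per verification pass, the sketch below computes the expected number of tokens emitted by one pass. It rests on simplifying assumptions that are not from the source: each drafted token is accepted independently with the same probability, the first rejection discards the rest of the draft, and every pass still yields one token from the main model. The function name and the draft length used in the example are hypothetical.

```python
def expected_tokens_per_pass(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted by one draft-and-verify pass.

    Assumes independent, identical acceptance per drafted token
    (an idealization, not a measured property of any model).
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    # The i-th drafted token survives only if all i drafts so far were
    # accepted (probability accept_rate**i); the i = 0 term is the token
    # the main model always contributes itself.
    return sum(accept_rate ** i for i in range(draft_len + 1))


# Hypothetical draft length of 3 with the ~80% acceptance rate cited above.
print(round(expected_tokens_per_pass(0.8, 3), 3))  # 2.952
```

This counts tokens per pass, not wall-clock speedup: drafting and verification have their own cost, which is one reason the measured gains differ between the dense and MoE models.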

## Body

Qwen 3.6 models are now up to 2.5x faster on Atomic Chat with new MTP speedups.

> MTP drafts several tokens ahead and verifies them in one pass. The speedup depends on the memory moved per pass.
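The draft-then-verify idea in the quote can be sketched with a toy greedy loop. This is a minimal illustration under stated assumptions, not Atomic Chat's or Qwen's implementation: `draft_next` and `target_next` are hypothetical stand-ins for whatever produces the drafts and for the main model, tokens are plain integers, and the per-position verification loop stands in for what a real engine would do in one batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
NextFn = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token], draft_next: NextFn,
                     target_next: NextFn, draft_len: int) -> List[Token]:
    """Draft `draft_len` tokens, then keep the prefix the target agrees with."""
    draft: List[Token] = []
    for _ in range(draft_len):
        draft.append(draft_next(prefix + draft))

    accepted: List[Token] = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok != expected:
            # First mismatch: emit the target's own token and drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every draft was accepted, so the pass also yields one extra target token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(prompt: List[Token], draft_next: NextFn, target_next: NextFn,
             n_tokens: int, draft_len: int = 3) -> Tuple[List[Token], int]:
    """Return the generated sequence and the number of verify passes used."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(out, draft_next, target_next, draft_len))
        passes += 1
    return out[: len(prompt) + n_tokens], passes


# Toy models (hypothetical): the target counts upward modulo 10; the
# drafter agrees except right after a 4, where it guesses wrong.
def toy_target(prefix: List[Token]) -> Token:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[Token]) -> Token:
    return 0 if prefix[-1] == 4 else (prefix[-1] + 1) % 10


tokens, passes = generate([0], toy_draft, toy_target, n_tokens=12)
print(tokens, passes)
```

Because every emitted token is either confirmed or produced by the target, the output matches plain greedy decoding with the target alone; only the number of passes changes.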

Users can run Qwen 3.6 models locally via the open-source Atomic Chat to test them!

### Quoted Tweet

> atomic.chat: MTP speedup Qwen by 2.5x in Atomic Chat
>
> Dense vs MoE models on 2x RTX 5090
>
> Qwen3.6 27B: 51 → 117 tps +137%
>
> Qwen3.6 35B-A3B: 218 → 267 tps +25%
>
> MTP drafts severa...
