# Gemini 3.2 Flash Nears GPT-5.5 Performance at Far Lower Cost

- Source: Chubby♨️ (@kimmonismus)
- Published: 2026-05-14 19:33
- AIHOT score: 58
- AIHOT link: https://aihot.virxact.com/items/cmp5fu7nc0dw5sljx7vu5hopq
- Original link: https://x.com/kimmonismus/status/2054887891222802633

## AI Summary

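The rumored upcoming Gemini 3.2 Flash model reportedly reaches about 92% of GPT-5.5's performance on coding and reasoning tasks, at 15 to 20 times lower inference cost. Its latency is also said to be excellent, with most queries answered in under 200 ms. This is attributed mainly to DeepMind's distillation and sparsification techniques, which compress a frontier model into a "Flash" variant without the large quality drop that usually comes with such compression.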

## Post

Rumors about the new Gemini Flash coming in. And holy, if true then big:

92% of GPT-5.5's coding and reasoning performance, reportedly at 15-20x lower inference cost. And the latency? Sub-200ms for most queries.

That would be nuts. No joke.

### Quoted Tweet

> Bindu Reddy: Gemini 3.2 Flash - Capitalizing on DeepMind's clever distillation techniques... Rumors are that benchmarks show it's hitting 92% of GPT 5.5's performance on cod...