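Google has released a new model, Gemini 3.5 Flash. It scores 55 on the Intelligence Index, up 9 points, surpassing Grok 4.3 and Claude Sonnet 4.6, with especially notable gains in agentic tasks and knowledge accuracy (substantially fewer hallucinations). Output speed exceeds 280 tokens/s, placing it on the leading frontier of speed versus intelligence. However, the model costs 5.5x as much to run as its predecessor, mainly because of higher input token usage and higher pricing. It also achieves the highest score on the multimodal evaluation MMMU-Pro and supports multimodal input, reflecting Google's overall strength.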
Google's new Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on GDPval-AA (real-world agentic tasks), but is 5x the cost of Gemini 3 Flash
@GoogleDeepMind gave us pre-release access to Gemini 3.5 Flash, the latest model in its Flash family, which has traditionally offered faster, lower-cost alternatives to Gemini Pro models. Gemini 3.5 Flash scores 55 on the Artificial Analysis Intelligence Index, up 9 points from Gemini 3 Flash, driven primarily by agentic performance gains and hallucination reduction. It achieves speeds of over 280 output tokens/s, but higher token usage and token pricing make it over 5x more costly to run the Intelligence Index than Gemini 3 Flash, and 75% more costly than Gemini 3.1 Pro. Gemini 3.5 Flash is priced at $1.50/$9 per 1M input/output tokens, versus $0.50/$3 for Gemini 3 Flash, a 3x increase. The rest of the increase was driven by higher token usage when running our benchmarks.
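The cost breakdown above can be checked with a little arithmetic. Because input and output prices both rose by the same factor, the total cost multiplier splits into a price factor times a price-weighted token-usage factor. The sketch below is illustrative only: the function names are hypothetical, and the 5.0 figure is the lower bound implied by "over 5x", not an exact measurement, so the result is a floor on usage growth rather than a reported number.

```python
# Published per-1M-token prices (USD) quoted in the post.
OLD_PRICE = {"input": 0.50, "output": 3.00}  # Gemini 3 Flash
NEW_PRICE = {"input": 1.50, "output": 9.00}  # Gemini 3.5 Flash


def price_multiplier(old, new):
    """Return the new/old price ratio for each token type."""
    return {kind: new[kind] / old[kind] for kind in old}


def implied_usage_multiplier(total_cost_multiplier, uniform_price_multiplier):
    """Price-weighted token-usage growth implied by a total cost multiplier.

    Valid only when every token type's price rose by the same factor,
    so that cost_new / cost_old = price_factor * usage_factor.
    """
    return total_cost_multiplier / uniform_price_multiplier


ratios = price_multiplier(OLD_PRICE, NEW_PRICE)

# Assumption: 5.0 is a lower bound taken from "over 5x" in the post.
usage_floor = implied_usage_multiplier(5.0, ratios["input"])

print(ratios)
print(round(usage_floor, 2))
```

Under these assumptions, both price ratios come out to 3, so a total cost increase of more than 5x implies that price-weighted token usage on the benchmark suite grew by more than roughly 1.67x.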