# Gemini 3.1 Flash-Lite: Built for intelligence at scale

- Source: Google DeepMind: Blog (RSS)
- Published: 2026-03-04 00:35
- AIHOT link: https://aihot.virxact.com/items/cmnwsdqak003bslag0ih9rqxz
- Original link: https://deepmind.google/blog/gemini-3-1-flash-lite-built-for-intelligence-at-scale

## AI summary

Google has released Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in the Gemini 3 series, optimized for large-scale intelligent applications.

## Full text

Mar 03, 2026

Get best-in-class intelligence for your highest-volume workloads.

The Gemini Team

General summary

Gemini 3.1 Flash-Lite is now available in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI. Priced at $0.25/1M input tokens and $1.50/1M output tokens, it's cost-efficient and faster than 2.5 Flash. Use 3.1 Flash-Lite for tasks like translation, content moderation, generating user interfaces, and creating simulations.

Summaries were generated by Google AI. Generative AI is experimental.

Basic explainer

Google made a new AI model called Gemini 3.1 Flash-Lite. It's super fast and cheap to use, so more people can use it. This AI is good at things like translating languages and checking content. Some companies are already using it to solve tough problems because it's both smart and efficient.

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model. Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier.

Starting today, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI.

Cost-efficiency without compromise

Priced at just $0.25/1M input tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models. It outperforms 2.5 Flash with a 2.5X faster Time to First Answer Token and a 45% increase in output speed, according to the Artificial Analysis benchmark, while maintaining similar or better quality. This low latency is needed for high-frequency workflows, making it an ideal model for developers to build responsive, real-time experiences.
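To make the pricing concrete, here is a minimal sketch of a cost estimate using the two preview prices quoted above. The function name, the constant names, and the example workload (one million requests at 400 input and 100 output tokens each) are illustrative assumptions, not figures from the announcement, and the sketch ignores any discounts, caching, or tiering that actual billing might apply.

```python
from decimal import Decimal

# Preview list prices quoted in the article, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in USD for the given token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed workload, for illustration only:
    # 1M requests, each with 400 input tokens and 100 output tokens.
    requests = 1_000_000
    total = estimate_cost_usd(400 * requests, 100 * requests)
    print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $250.00"
```

Because output tokens cost six times as much as input tokens at these prices, the output side dominates the bill in this example even though it accounts for only a fifth of the tokens.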

Video 1: Gemini 3.1 Flash-Lite outperforms 2.5 Flash in speed and quality.

3.1 Flash-Lite achieves an impressive Elo score of 1432 on the Arena.ai Leaderboard and outperforms other models of similar tier across reasoning and multimodal understanding benchmarks, including 86.9% on GPQA Diamond and 76.8% on MMMU Pro, even surpassing larger Gemini models from prior generations like 2.5 Flash.

Adaptive intelligence at scale for developers

Beyond its raw performance, Gemini 3.1 Flash-Lite comes standard with thinking levels in AI Studio and Vertex AI, giving developers the control and flexibility to select how much the model “thinks” for a task, which is critical for managing high-frequency workloads. 3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. And it can also handle more complex workloads where more in-depth reasoning is needed, like generating user interfaces and dashboards, creating simulations or following instructions.

Video 2: 3.1 Flash-Lite instantly fills an e-commerce wireframe with hundreds of products in different categories.

Video 3: 3.1 Flash-Lite can generate dynamic weather dashboards in real time, using live forecasts and historical data.

Video 4: 3.1 Flash-Lite creates a SaaS agent capable of executing versatile, multi-step tasks for a business.

Video 5: 3.1 Flash-Lite can quickly analyze and sort large volumes of content, such as images.

Early-access developers on AI Studio and Vertex AI, and companies like Latitude, Cartwheel and Whering, are already using 3.1 Flash-Lite to solve complex problems at scale. Early testers highlighted 3.1 Flash-Lite’s efficiency and reasoning capabilities, saying it can handle complex inputs with the precision of a larger-tier model, as well as follow instructions and maintain adherence.

We look forward to seeing what you build with 3.1 Flash-Lite and the rest of the Gemini 3 series models.
