Gemini 3.1 Flash-Lite: Built for intelligence at scale

2026-03-17T02:59:31.000Z·★ 100·1 min read
Gemini 3.1 Flash-Lite delivers 2.5X faster first-token speed than 2.5 Flash at $0.25/1M input tokens, with state-of-the-art benchmark scores for its tier.

Google launches Gemini 3.1 Flash-Lite, its fastest and most cost-efficient Gemini 3 series model, priced at $0.25/1M input tokens with 2.5X faster Time to First Token than 2.5 Flash.

Cost-Efficiency Without Compromise

Priced at just $0.25 per million input tokens and $1.50 per million output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models. It outperforms 2.5 Flash with:

Adaptive Intelligence for Developers

3.1 Flash-Lite comes with configurable thinking levels, giving developers control over how much reasoning the model applies per task. This makes it versatile for both simple and complex workloads:

Early adopters including Latitude, Cartwheel, and Whering are already using 3.1 Flash-Lite at scale.

Availability

Available in preview via Gemini API in Google AI Studio and Vertex AI for enterprises.


Source: Google DeepMind Blog

↗ Original source
← Previous: From games to biology and beyond: 10 years of AlphaGo's impactNext: Nano Banana 2: Combining Pro capabilities with lightning-fast speed →
Comments0