Gemini 3.1 Flash-Lite: Built for intelligence at scale
News

The Gemini Team
2026.03.03
by 이호민
#AI · #Gemini · #Google AI · #LLM · #Vertex AI

Key Points

  1. Google has introduced Gemini 3.1 Flash-Lite, a new AI model now available in preview via the Gemini API and Vertex AI, designed for high-volume workloads at significantly lower costs.
  2. The model boasts a 2.5X faster Time to First Answer Token and 45% higher output speed compared to 2.5 Flash, while achieving high quality with an Elo score of 1432 and strong performance on reasoning and multimodal benchmarks.
  3. Gemini 3.1 Flash-Lite is suited for tasks from high-volume translation and content moderation to complex UI generation and simulations, offering developers adaptive intelligence and "thinking levels" control for efficient, real-time applications.

Gemini 3.1 Flash-Lite is presented as the fastest and most cost-efficient model within the Gemini 3 series, engineered for high-volume developer workloads at scale. It is available in preview to developers via the Gemini API in Google AI Studio and to enterprises through Vertex AI.

The core methodology driving Gemini 3.1 Flash-Lite centers on achieving exceptional speed and cost-efficiency without compromising quality, making it suitable for high-frequency, real-time applications. Its pricing is set at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens. Performance benchmarks indicate a substantial improvement over its predecessor, Gemini 2.5 Flash: a 2.5X faster Time to First Answer Token and a 45% increase in output speed, as measured by the Artificial Analysis benchmark. This reduced latency is critical for building responsive, real-time user experiences.
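To make the pricing concrete, per-request cost can be estimated directly from token counts. The sketch below uses the preview rates quoted above; actual billing (caching discounts, tiered rates, etc.) may differ:

```python
# Estimate a Gemini 3.1 Flash-Lite request cost from the preview rates
# quoted above: $0.25 / 1M input tokens, $1.50 / 1M output tokens.
INPUT_RATE_PER_M = 0.25
OUTPUT_RATE_PER_M = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Example: a translation request with a 2,000-token prompt and a
# 500-token response costs a fraction of a cent.
print(f"${estimate_cost(2_000, 500):.6f}")  # $0.001250
```

At these rates, a million such requests would cost roughly $1,250, which is why the model is positioned for high-volume workloads.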

In terms of quality, Gemini 3.1 Flash-Lite attains an Elo score of 1432 on the Arena.ai Leaderboard. It demonstrates strong performance across reasoning and multimodal understanding benchmarks, achieving 86.9% on GPQA Diamond and 76.8% on MMMU Pro. Notably, it surpasses Gemini 2.5 Flash and other larger Gemini models from prior generations in these quality metrics.

A key technical feature, referred to as "thinking levels," is integrated into AI Studio and Vertex AI. This functionality provides developers with granular control over the model's computational effort for a given task, allowing for optimization between speed, cost, and complexity. This adaptive intelligence mechanism is particularly beneficial for managing high-frequency workloads where cost is a primary consideration, while also enabling the model to tackle more complex tasks requiring deeper reasoning.

Its capabilities span a range of applications, from high-volume, cost-sensitive tasks such as translation and content moderation, to more complex assignments. Examples of sophisticated use cases include:

  • User Interface and Dashboard Generation: Dynamically generating user interfaces and dashboards, such as filling e-commerce wireframes with hundreds of products or creating real-time weather dashboards using live forecasts and historical data.
  • Simulation: Creating dynamic simulations on the fly.
  • Multi-step Task Execution: Developing SaaS agents capable of executing versatile, multi-step tasks for business operations.
  • Content Analysis: Rapidly analyzing and sorting large volumes of content, including images.

Early-access developers and companies such as Latitude, Cartwheel, and Whering are already using Gemini 3.1 Flash-Lite, reporting that its efficiency and reasoning capabilities let it handle complex inputs with precision comparable to larger-tier models while maintaining close adherence to instructions.