Gemini Omni Flash Unifies Video Generation at 10 Cents a Second
Google DeepMind has launched Nano Banana 2 Lite for rapid image generation and opened Gemini Omni Flash to developers for unified multimodal video editing.
Google DeepMind has added Nano Banana 2 Lite and Gemini Omni Flash to the Gemini ecosystem. The updates cut image-generation latency and move video generation onto a unified transformer architecture.
High-Velocity Image Generation
Nano Banana 2 Lite operates under the model name gemini-3.1-flash-lite-image. It functions as a direct drop-in replacement for the previous gemini-2.5-flash-image endpoint. The architecture prioritizes execution speed and cost efficiency for production workloads over high-end creative nuance.
Outputs take approximately 4 seconds to generate. This cuts the latency of the original Nano Banana 1 model in half. The model maintains prompt adherence and improves rendering accuracy for complex text inside images.
| Metric | Nano Banana 1 | Nano Banana 2 Lite |
|---|---|---|
| Model Name | gemini-2.5-flash-image | gemini-3.1-flash-lite-image |
| Generation Time | ~8 seconds | ~4 seconds |
| Cost per Image | Not specified | $0.034 |
For developers attempting to reduce LLM API costs in production, the 3.4-cent unit cost changes the economics of automated image generation at scale: at that rate, 10,000 images cost about $340. The endpoint is generally available in Google AI Studio and the Gemini API.
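To make the unit cost concrete, here is a minimal budgeting sketch. The $0.034 figure comes from the table above; the function name and the 10,000-images-per-day workload are illustrative assumptions, and actual billing may differ.

```python
from decimal import Decimal

# Per-image price quoted above for gemini-3.1-flash-lite-image.
PRICE_PER_IMAGE_USD = Decimal("0.034")


def image_batch_cost(num_images: int) -> Decimal:
    """Return the estimated USD cost of generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE_USD * num_images


# Hypothetical workload: 10,000 images per day over a 30-day month.
daily = image_batch_cost(10_000)
monthly = daily * 30
print(f"daily: ${daily:.2f}, monthly: ${monthly:.2f}")
```

Using `Decimal` rather than floats keeps the totals exact when many small per-image charges are summed.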
Unified Architecture for Video
Gemini Omni Flash replaces Veo 3.1 as the primary video generation engine. Previous iterations relied on a split-stack pipeline that passed data between dedicated text, image, and video models. Omni Flash processes all modalities simultaneously in a single forward pass.
The model supports conversational video editing. Developers can pass natural language prompts to swap characters, adjust lighting, alter camera angles, or replace background environments. The multimodal input accepts up to five reference photos to lock visual consistency for specific characters or objects. If your system relies on cross-modal RAG pipelines, you can now feed retrieved images directly into the video generation context.
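If a retrieval step returns more candidates than the five-photo limit allows, something has to choose which ones go into the request. The sketch below shows one way to do that. The `(score, image_id)` tuple shape and the function name are assumptions for illustration, not part of the Gemini API.

```python
MAX_REFERENCE_IMAGES = 5  # limit stated in the article


def select_reference_images(scored_images: list[tuple[float, str]]) -> list[str]:
    """Keep the highest-scoring image IDs, capped at MAX_REFERENCE_IMAGES.

    scored_images holds hypothetical (relevance_score, image_id) pairs
    as a retriever might return them.
    """
    # Sort by score only, highest first; ties keep their original order.
    ranked = sorted(scored_images, key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in ranked[:MAX_REFERENCE_IMAGES]]


# Hypothetical retriever output with seven candidates.
candidates = [
    (0.91, "hero_front.png"),
    (0.42, "crowd.png"),
    (0.88, "hero_side.png"),
    (0.77, "prop_sword.png"),
    (0.35, "street.png"),
    (0.80, "hero_back.png"),
    (0.60, "logo.png"),
]
print(select_reference_images(candidates))
```

Capping on the client side keeps the request within the documented limit regardless of how many results the retriever returns.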
Outputs include natively synchronized audio tracks generated alongside the visual frames. Clips face a hard limit of 10 seconds per generation. Google estimates pricing at $0.10 per second of video output during the current public preview phase. All generated media includes SynthID digital watermarking and C2PA credentials.
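The 10-second cap and the per-second price together determine how a longer sequence has to be budgeted. The following sketch splits a target runtime into clips that fit the cap and estimates the spend at the quoted preview rate. The function names are hypothetical, and it assumes billing is strictly proportional to output seconds, which the article does not confirm.

```python
MAX_CLIP_SECONDS = 10        # per-generation cap stated above
PRICE_CENTS_PER_SECOND = 10  # $0.10 per second, preview estimate stated above


def plan_clips(total_seconds: int) -> list[int]:
    """Split a target runtime into clip lengths of at most MAX_CLIP_SECONDS."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    full_clips, remainder = divmod(total_seconds, MAX_CLIP_SECONDS)
    clips = [MAX_CLIP_SECONDS] * full_clips
    if remainder:
        clips.append(remainder)
    return clips


def estimate_cost_usd(clips: list[int]) -> float:
    """Estimate spend in USD, computed in integer cents to avoid float drift."""
    total_cents = sum(clips) * PRICE_CENTS_PER_SECOND
    return total_cents / 100


# Hypothetical 45-second sequence.
clips = plan_clips(45)
print(clips, estimate_cost_usd(clips))
```

A 45-second sequence therefore needs five separate generations. Keeping visual continuity across those clips is a separate problem that this sketch does not address.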
Enterprise Availability
Both models immediately support integration with the Gemini Enterprise Agent Platform to enable visual workflows for autonomous systems. They also power Google Flow, a new creative production workspace, and serve as background generation tools for YouTube Shorts creators.
If your codebase calls gemini-2.5-flash-image, update your routing logic to gemini-3.1-flash-lite-image to capture the latency improvements. For video generation workloads, audit your pipeline constraints to account for the Omni Flash 10-second clip cap before migrating away from existing Veo configurations.
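One low-risk way to handle the rename is a small lookup table applied at the point where requests are routed, so call sites do not each need editing. This is a sketch under that assumption. The two model names come from the article, while `MODEL_MIGRATIONS` and `resolve_model` are hypothetical names.

```python
# Legacy model name mapped to its replacement, as named in the article.
MODEL_MIGRATIONS = {
    "gemini-2.5-flash-image": "gemini-3.1-flash-lite-image",
}


def resolve_model(requested: str) -> str:
    """Return the replacement for a migrated model name, else the name unchanged."""
    return MODEL_MIGRATIONS.get(requested, requested)


# Legacy callers are redirected; other model names pass through untouched.
print(resolve_model("gemini-2.5-flash-image"))
print(resolve_model("some-other-model"))
```

Keeping the mapping in one place also makes a rollback a one-line change if the new endpoint behaves differently for a given workload.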