Google Ships 9 Gemini Omni Demos Alongside 3.5 Flash
Google has released nine demonstration videos showcasing Gemini Omni's physics-aware video generation and the benchmark results for Gemini 3.5 Flash.
Google followed its I/O 2026 developer conference by releasing nine demonstration videos of Gemini Omni and Gemini 3.5 Flash in action on May 29. The videos turn the event’s announcements into concrete targets for developers building long-horizon applications. Gemini 3.5 Flash is now generally available as the default engine for the Gemini app, while Omni introduces a natively multimodal architecture trained to simulate physical environments.
Gemini 3.5 Flash Pricing and Performance
The stable release of gemini-3.5-flash replaces Gemini 3.1 Pro as the baseline for Google Search’s AI Mode. Google engineered the model specifically for long-horizon task execution and coding. It generates output tokens four times faster than frontier models in its tier.
The capability upgrade comes with a steep cost increase. Developers migrating from the previous Gemini 3 Flash Preview face a 3x price hike, which fundamentally changes the unit economics of high-volume retrieval applications.
| Specification | Value |
|---|---|
| Context Window | 1,048,576 input tokens |
| Max Output | 65,536 tokens |
| Input Price | $1.50 per 1M tokens |
| Output Price | $9.00 per 1M tokens |
| Knowledge Cutoff | January 2025 |
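To make the unit economics concrete, the listed prices can be turned into a per-request estimate. The following is a minimal sketch, not an official calculator: the function names `request_cost` and `monthly_cost` are hypothetical, and the only figures taken from the table are the $1.50 input and $9.00 output prices per 1M tokens. It ignores any caching, batching, or tiered discounts, which the table does not describe.

```python
from decimal import Decimal

# Prices from the specification table, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("1.50")
OUTPUT_PRICE_PER_M = Decimal("9.00")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of a single request at the listed prices."""
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


def monthly_cost(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    days: int = 30,  # assumption: a flat 30-day month
) -> Decimal:
    """Scale the per-request cost to a monthly workload estimate."""
    return request_cost(input_tokens, output_tokens) * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical retrieval workload: 20,000 input and 1,000 output tokens per call.
    print(request_cost(20_000, 1_000))
    print(monthly_cost(100_000, 20_000, 1_000))
```

For the hypothetical workload above, one request costs $0.03 for input plus $0.009 for output, or $0.039 in total. At 100,000 requests per day that comes to $117,000 over 30 days, and most of it is input, which is why context size dominates the bill for retrieval-heavy applications.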
Benchmark results reflect the focus on complex execution workflows. The model scored 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas. Multimodal understanding capabilities reached 84.2% on CharXiv. The larger Gemini 3.5 Pro remains in internal testing and will reach developers in June 2026.
Gemini Omni Architecture
Gemini Omni operates as a world model capable of conversational video generation and editing. Omni processes text, audio, images, and video simultaneously as a single unified input. This native multimodality contrasts with pipeline approaches where inputs are translated sequentially before processing.
The model learns and applies physical laws directly to its outputs. The demonstrations highlight Omni simulating fluid dynamics, kinetic energy, and gravity. Users interact with the model to alter specific variables within a generated scene, such as swapping backgrounds or character clothing through natural language prompts.
All outputs carry a SynthID digital watermark to verify machine generation. Gemini Omni Flash is rolling out sequentially to Google AI Plus, Pro, and Ultra subscribers. It will also serve as the backend for content generation in YouTube Shorts and the YouTube Create application.
Autonomous Agent Infrastructure
The demonstrations highlight Google’s shift toward persistent, background execution. Gemini Spark runs autonomously in the cloud, maintaining state and executing multi-step operations independently of local device connectivity. This architecture requires robust multi-agent systems to orchestrate complex dependencies across platforms.
Google also showcased Antigravity 2.0, a standalone development platform built for autonomous execution. During a demonstration, the platform built an operating system and ported the game “Doom” to the new environment in minutes. Running agents in a decoupled environment keeps their execution sandboxed, similar to strategies used to secure AI agents in production.
If you rely on Gemini Flash models for high-volume data processing, review your token budgets against the new $1.50/$9.00 pricing structure. The shift in capabilities supports extensive coding and orchestration tasks, but the 3x cost increase requires strict context management to keep production workflows profitable.
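One simple form of context management is to cap the number of input tokens sent per request. The sketch below is an illustration under stated assumptions, not a Gemini API feature: `Chunk` and `trim_to_budget` are hypothetical names, the chunks are assumed to arrive sorted by priority, and each chunk's token count is assumed to be computed upstream by your tokenizer. The only values taken from the article are the 1,048,576-token context window and the 65,536-token output limit.

```python
from dataclasses import dataclass

MAX_INPUT_TOKENS = 1_048_576  # context window from the specification table
MAX_OUTPUT_TOKENS = 65_536    # max output from the specification table


@dataclass
class Chunk:
    text: str
    tokens: int  # assumption: counted upstream by whatever tokenizer you use


def trim_to_budget(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Greedily keep chunks, in priority order, that fit within the token budget."""
    if budget > MAX_INPUT_TOKENS:
        raise ValueError("budget exceeds the model's context window")
    kept: list[Chunk] = []
    used = 0
    for chunk in chunks:
        if used + chunk.tokens > budget:
            continue  # too large for the remaining budget; a later chunk may still fit
        kept.append(chunk)
        used += chunk.tokens
    return kept


if __name__ == "__main__":
    retrieved = [Chunk("a", 500), Chunk("b", 700), Chunk("c", 300)]
    selected = trim_to_budget(retrieved, budget=1_000)
    print([c.text for c in selected], sum(c.tokens for c in selected))
```

The greedy pass skips a chunk that does not fit and keeps looking, so a smaller lower-priority chunk can still use the remaining budget. Whether that trade-off suits a given workload depends on how much answer quality degrades when a mid-priority chunk is dropped.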