Google's Gemini Robotics-ER 1.6 Gives Robots Better Brains
DeepMind's Gemini Robotics-ER 1.6 upgrades embodied AI with multi-angle success detection, industrial gauge reading, and superior spatial reasoning.
Google DeepMind announced Gemini Robotics-ER 1.6, an upgraded reasoning engine for physical agents, on April 14, 2026. The model acts as a central brain for robotics hardware, handling multi-step planning and tool orchestration for real-world tasks. If you build autonomous systems, this release changes how your hardware interprets spatial relationships and verifies its own physical actions.
Spatial Reasoning and Visual Analysis
The architecture improves upon Gemini Robotics-ER 1.5 and Gemini 3.0 Flash in core spatial tasks. Hardware running the model demonstrates higher accuracy in pointing, counting, and reasoning about complex spatial relationships. An agent can now reliably identify every discrete object that can fit inside a specific container.
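The model performs this kind of reasoning directly from vision, but the underlying geometric question is easy to state. The following is a minimal sketch of an axis-aligned fit check; the function names and the dictionary-of-dimensions input format are illustrative assumptions, not part of any Gemini API.

```python
def fits_inside(obj_dims, container_dims):
    """Axis-aligned check: sort each dimension triple so the smallest
    object axis is compared against the smallest container axis."""
    return all(o <= c for o, c in zip(sorted(obj_dims), sorted(container_dims)))

def objects_that_fit(objects, container_dims):
    """Return the names of all objects whose bounding boxes fit the container."""
    return [name for name, dims in objects.items() if fits_inside(dims, container_dims)]

# Example: dimensions in centimeters for a few scene objects.
scene = {"mug": (8, 8, 10), "ladder": (30, 30, 200), "deck_of_cards": (6, 9, 2)}
print(objects_that_fit(scene, (20, 20, 20)))
```

A vision model handles the hard part (estimating dimensions and handling rotation or deformable objects); this sketch only captures the final rigid-box comparison.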
DeepMind developed new instrument reading capabilities in collaboration with Boston Dynamics. The model uses agentic vision to read industrial gauges, level indicators, and digital displays. It zooms into specific image regions and uses code execution to estimate scale intervals. Boston Dynamics integrated these features into its AIVI-Learning system. Spot robots now use the model to autonomously patrol facilities, count pallets, and detect liquid spills.
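Once the model has zoomed into a dial and estimated the needle angle plus the scale endpoints, the final reading reduces to linear interpolation. A minimal sketch of that last step, with all parameter names as assumptions (this is the kind of arithmetic the model's code-execution tool would run, not a published API):

```python
def estimate_gauge_value(needle_angle, min_angle, max_angle, min_value, max_value):
    """Linearly interpolate a dial reading from the needle's angular position.

    Angles are in degrees along the gauge's sweep; min_angle/max_angle are the
    positions of the scale's first and last tick marks.
    """
    fraction = (needle_angle - min_angle) / (max_angle - min_angle)
    return min_value + fraction * (max_value - min_value)

# Example: a 0-10 bar pressure gauge sweeping from 45 deg to 225 deg,
# with the needle pointing straight up at 135 deg.
print(estimate_gauge_value(135, 45, 225, 0, 10))  # 5.0
```

Real gauges add complications (nonlinear scales, parallax, occluded needles), which is why the model pairs this arithmetic with agentic zooming rather than a single fixed formula.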
Multi-Perspective Success Detection
Physical environments frequently obstruct a single camera’s line of sight. Gemini Robotics-ER 1.6 introduces multi-perspective success detection to solve this hardware constraint. Robots can analyze video feeds from multiple camera angles simultaneously to verify physical states.
This continuous visual feedback loop allows the agent to autonomously confirm task completion. The system makes logical decisions by calling specialized tools like Vision-Language-Action (VLA) models or Google Search. If a multi-angle visual check fails, the robot decides whether to retry the exact physical step or proceed to a fallback stage.
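The retry-or-fallback control flow described above can be sketched as a small loop. Everything here is an illustrative assumption: the quorum rule, the function names, and the boolean-verdict interface stand in for real calls that would send each camera's frames to the model for verification.

```python
def verify_step(camera_verdicts, quorum=2):
    """A step counts as complete when at least `quorum` camera views confirm it."""
    return sum(camera_verdicts) >= quorum

def execute_step(attempt_step, check_views, max_retries=2):
    """Attempt a physical step, verify it from multiple angles, and retry on
    failure; escalate to a fallback stage once retries are exhausted."""
    for _ in range(max_retries + 1):
        attempt_step()                      # command the robot to act
        if verify_step(check_views()):      # multi-angle visual check
            return "done"
    return "fallback"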
Safety Benchmark Results
Embodied models face unique physical risk profiles compared to text-only systems. DeepMind tested the model against safety risk scenarios based on real-world injury reports. The system demonstrated strict compliance with safety policies during adversarial spatial reasoning tasks.
| Safety Risk Benchmark | Gemini 3.0 Flash | Gemini Robotics-ER 1.6 |
|---|---|---|
| Text Scenarios | Baseline | +6% improvement |
| Video Scenarios | Baseline | +10% improvement |
API Availability and Migration
The model is accessible to developers through Google AI Studio and the Gemini API as gemini-robotics-er-1-6-preview. DeepMind is actively seeking community input to address persistent AI agent edge cases. The engineering team requested submissions of up to 50 model failure photos to refine the training data. Early user testing indicates the model still struggles with niche visual tasks like reading analog clocks or interpreting standard digital screenshots.
You have a brief migration window if your application relies on the previous version. Google will decommission the gemini-robotics-er-1-5-preview endpoint on April 30, 2026.
Update your API calls to the 1.6 preview endpoint before the April 30 cutoff to maintain service availability. You should immediately test your existing multi-step workflows against the new multi-perspective vision capabilities, as the improved spatial reasoning alters how your agent calculates task completion.
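If your client code selects the model by its ID string, the migration can be isolated to one place. This is a minimal sketch: the helper function and constants are assumptions, though the two model ID strings come from the article.

```python
OLD_MODEL = "gemini-robotics-er-1-5-preview"
NEW_MODEL = "gemini-robotics-er-1-6-preview"

def migrate_model_id(model_id):
    """Rewrite any reference to the deprecated 1.5 preview endpoint;
    leave other model IDs untouched."""
    return NEW_MODEL if model_id == OLD_MODEL else model_id

print(migrate_model_id("gemini-robotics-er-1-5-preview"))
```

The returned ID is what you would pass as the model parameter in your Gemini API calls; pair the swap with a regression run of your multi-step workflows, since the new multi-perspective checks can change when the agent declares a task complete.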