Google's Gemini Robotics-ER 1.6 Gives Robots Better Brains
DeepMind's Gemini Robotics-ER 1.6 upgrades embodied AI with multi-angle success detection, industrial gauge reading, and superior spatial reasoning.
Google DeepMind announced Gemini Robotics-ER 1.6, an upgraded reasoning engine for physical agents, on April 14, 2026. The model acts as a central brain for robotics hardware, handling multi-step planning and tool orchestration for real-world tasks. If you build autonomous systems, this release changes how your hardware interprets spatial relationships and verifies its own physical actions.
Spatial Reasoning and Visual Analysis
The architecture improves upon Gemini Robotics-ER 1.5 and Gemini 3.0 Flash in core spatial tasks. Hardware running the model demonstrates higher accuracy in pointing, counting, and reasoning about complex spatial relationships. An agent can now reliably identify every discrete object that can fit inside a specific container.
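The model performs this kind of reasoning directly from vision, but the underlying geometric question is easy to state. The following is a minimal sketch of an axis-aligned fit check; the function names and the dictionary-of-dimensions input format are illustrative assumptions, not part of any Gemini API.

```python
def fits_inside(obj_dims, container_dims):
    """Axis-aligned check: sort each dimension triple so the smallest
    object axis is compared against the smallest container axis."""
    return all(o <= c for o, c in zip(sorted(obj_dims), sorted(container_dims)))

def objects_that_fit(objects, container_dims):
    """Return the names of all objects whose bounding boxes fit the container."""
    return [name for name, dims in objects.items() if fits_inside(dims, container_dims)]

# Example: dimensions in centimeters for a few scene objects.
scene = {"mug": (8, 8, 10), "ladder": (30, 30, 200), "deck_of_cards": (6, 9, 2)}
print(objects_that_fit(scene, (20, 20, 20)))
```

A vision model handles the hard part (estimating dimensions and handling rotation or deformable objects); this sketch only captures the final rigid-box comparison.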
DeepMind developed new instrument reading capabilities in collaboration with Boston Dynamics. The model uses agentic vision to read industrial gauges, level indicators, and digital displays. It zooms into specific image regions and uses code execution to estimate scale intervals. Boston Dynamics integrated these features into its AIVI-Learning system. Spot robots now use the model to autonomously patrol facilities, count pallets, and detect liquid spills.
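Once the model has zoomed into a dial and estimated the needle angle plus the scale endpoints, the final reading reduces to linear interpolation. A minimal sketch of that last step, with all parameter names as assumptions (this is the kind of arithmetic the model's code-execution tool would run, not a published API):

```python
def estimate_gauge_value(needle_angle, min_angle, max_angle, min_value, max_value):
    """Linearly interpolate a dial reading from the needle's angular position.

    Angles are in degrees along the gauge's sweep; min_angle/max_angle are the
    positions of the scale's first and last tick marks.
    """
    fraction = (needle_angle - min_angle) / (max_angle - min_angle)
    return min_value + fraction * (max_value - min_value)

# Example: a 0-10 bar pressure gauge sweeping from 45 deg to 225 deg,
# with the needle pointing straight up at 135 deg.
print(estimate_gauge_value(135, 45, 225, 0, 10))  # 5.0
```

Real gauges add complications (nonlinear scales, parallax, occluded needles), which is why the model pairs this arithmetic with agentic zooming rather than a single fixed formula.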
Multi-Perspective Success Detection
Physical environments frequently obstruct a single camera’s line of sight. Gemini Robotics-ER 1.6 introduces multi-perspective success detection to solve this hardware constraint. Robots can analyze video feeds from multiple camera angles simultaneously to verify physical states.
This continuous visual feedback loop allows the agent to autonomously confirm task completion. The system makes logical decisions by calling specialized tools like Vision-Language-Action (VLA) models or Google Search. If a multi-angle visual check fails, the robot decides whether to retry the exact physical step or proceed to a fallback stage.
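The retry-or-fallback control flow described above can be sketched as a small loop. Everything here is an illustrative assumption: the quorum rule, the function names, and the boolean-verdict interface stand in for real calls that would send each camera's frames to the model for verification.

```python
def verify_step(camera_verdicts, quorum=2):
    """A step counts as complete when at least `quorum` camera views confirm it."""
    return sum(camera_verdicts) >= quorum

def execute_step(attempt_step, check_views, max_retries=2):
    """Attempt a physical step, verify it from multiple angles, and retry on
    failure; escalate to a fallback stage once retries are exhausted."""
    for _ in range(max_retries + 1):
        attempt_step()                      # command the robot to act
        if verify_step(check_views()):      # multi-angle visual check
            return "done"
    return "fallback"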
Safety Benchmark Results
Embodied models face unique physical risk profiles compared to text-only systems. DeepMind tested the model against safety risk scenarios based on real-world injury reports. The system demonstrated strict compliance with safety policies during adversarial spatial reasoning tasks.
| Safety Risk Benchmark | Gemini 3.0 Flash | Gemini Robotics-ER 1.6 |
|---|---|---|
| Text Scenarios | Baseline | +6% improvement |
| Video Scenarios | Baseline | +10% improvement |
API Availability and Migration
The model is accessible to developers through Google AI Studio and the Gemini API as gemini-robotics-er-1-6-preview. DeepMind is actively seeking community input to address persistent AI agent edge cases. The engineering team requested submissions of up to 50 model failure photos to refine the training data. Early user testing indicates the model still struggles with niche visual tasks like reading analog clocks or interpreting standard digital screenshots.
You have a brief migration window if your application relies on the previous version. Google will decommission the gemini-robotics-er-1-5-preview endpoint on April 30, 2026.
Update your API calls to the 1.6 preview endpoint before the April 30 cutoff to maintain service availability. You should immediately test your existing multi-step workflows against the new multi-perspective vision capabilities, as the improved spatial reasoning alters how your agent calculates task completion.
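If your client code selects the model by its ID string, the migration can be isolated to one place. This is a minimal sketch: the helper function and constants are assumptions, though the two model ID strings come from the article.

```python
OLD_MODEL = "gemini-robotics-er-1-5-preview"
NEW_MODEL = "gemini-robotics-er-1-6-preview"

def migrate_model_id(model_id):
    """Rewrite any reference to the deprecated 1.5 preview endpoint;
    leave other model IDs untouched."""
    return NEW_MODEL if model_id == OLD_MODEL else model_id

print(migrate_model_id("gemini-robotics-er-1-5-preview"))
```

The returned ID is what you would pass as the model parameter in your Gemini API calls; pair the swap with a regression run of your multi-step workflows, since the new multi-perspective checks can change when the agent declares a task complete.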