Google Cloud Releases Agents CLI for Agent Platform
Google Cloud has introduced Agents CLI, a command-line tool that gives AI coding assistants a machine-readable interface for agent scaffolding and deployment.
On April 22, 2026, Google Cloud released Agents CLI, a specialized command-line interface integrated into the Gemini Enterprise Agent Platform. The tool is engineered to provide AI coding assistants, including Gemini CLI, Claude Code, and Cursor, with a machine-readable interface to the complete Google Cloud agent stack. For developers managing the Agent Development Lifecycle (ADLC), this allows local development environments to drive production deployments autonomously.
Installation and Bundled Skills
Setup runs through a single command, `uvx google-agents-cli setup`. To mitigate the token waste and context overload caused by assistants reading extensive documentation, Google bundles direct API references as Skills, which developers inject into their local coding environment with `npx skills add google/agents-cli`.
This package gives the AI assistant direct reference material for the Agent Development Kit (ADK). Used correctly, the skills keep the assistant focused on implementation rather than searching for syntax rules.
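Putting the two documented commands together, a typical bootstrap of a local environment might look like this (the command names come from the announcement; the ordering is an assumption):

```shell
# Initialize the Agents CLI interface via uv's ephemeral runner
uvx google-agents-cli setup

# Inject the bundled API-reference Skills into the local coding environment
npx skills add google/agents-cli
```

After this, the coding assistant can resolve ADK references locally instead of fetching documentation into its context window.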
Execution Modes and Deployment Targets
Agents CLI operates in two distinct modes. Agent Mode is optimized strictly for AI assistants to automate processes like unit testing and evaluation. Human Mode provides deterministic execution for human developers requiring manual terminal control.
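How the two modes are selected is not documented in the announcement; the `--mode` flag below is purely a hypothetical illustration of the split, not a confirmed option:

```shell
# Hypothetical: non-interactive run for an AI assistant (testing/evaluation)
agents-cli --mode agent eval

# Hypothetical: deterministic, interactive execution for a human developer
agents-cli --mode human deploy
```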
The CLI automates Infrastructure as Code (IaC) injection and continuous deployment across three primary production targets:
| Deployment Target | Description | Infrastructure Notes |
|---|---|---|
| Agent Runtime | Deploys source code as a tarball to Vertex AI reasoning engines. | Bypasses Dockerfile requirements. |
| Cloud Run | Deploys agents to standard containerized infrastructure. | Requires container configuration. |
| GKE Autopilot | Deploys agents to managed Kubernetes environments. | Built for enterprise-scale workloads. |
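A deployment flow against these targets might be driven as sketched below; the `deploy` subcommand and `--target` flag are assumptions for illustration, with only the three target names taken from the table above:

```shell
# Hypothetical deploy invocations -- flag names are illustrative assumptions
agents-cli deploy --target agent-runtime   # source tarball to Vertex AI, no Dockerfile
agents-cli deploy --target cloud-run       # requires a container configuration
agents-cli deploy --target gke-autopilot   # managed Kubernetes for enterprise workloads
```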
Ecosystem Integration
The release aligns with the broader consolidation of Vertex AI into the Gemini Enterprise Agent Platform. The CLI registers agents natively via `agents-cli publish` and orchestrates Subagents, letting developers modularize specific tasks such as codebase investigation within the framework. Subagents must be structured and configured in Gemini CLI to handle isolated, complex operations before compilation.
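Registration itself uses the documented `agents-cli publish` command; any additional arguments would be project-specific and are omitted here rather than invented:

```shell
# Register the agent natively with the Gemini Enterprise Agent Platform
agents-cli publish
```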
Agents CLI and the ADK ship with native support for the Model Context Protocol, allowing agents to discover system resources dynamically through Agent Cards. While optimized for Google Cloud infrastructure, the framework maintains multi-cloud flexibility. It includes LiteLLM as an escape hatch, permitting agents built via the CLI to execute inference using models like GPT-4o and Claude 3.5 Sonnet.
Governance and Security Controls
Runtime security is handled natively by Agent Gateway and Model Armor. These components enforce prompt injection defenses and data leakage policies automatically upon deployment. Technical analysis of the framework indicates that while the native governance is robust, Agent Gateway currently lacks a customer-pluggable webhook layer for external third-party policy enforcement.
If you build agents with AI coding assistants, migrate your setup routines to Agents CLI and its bundled skills. The integration standardizes your infrastructure pipeline and reduces the context-window overhead needed to map local code to Google Cloud deployment targets.