How to Control Agent Tool Execution via Genkit Middleware
Learn how to use Google's new Genkit Middleware to intercept model calls, implement human-in-the-loop tool approvals, and handle transient API failures.
Google Developers AI recently released Genkit Middleware, a composable framework designed to harden applications by giving developers deterministic control over non-deterministic model outputs. You can use it to intercept the execution loop, handle transient API errors, and enforce strict execution boundaries.
As of the May 14, 2026 release, Genkit Middleware is fully available for TypeScript, Go, and Dart, with Python support currently in preview. The system supports models from major providers including Google, OpenAI, Anthropic, xAI, and DeepSeek.
The Three Hook Layers
Genkit Middleware operates by attaching hooks at three distinct layers of the Genkit generate() tool loop. This separation of concerns allows you to target specific parts of the request lifecycle without building monolithic interceptors.
- generateHook (High-Level): This layer wraps the entire generation loop, encompassing prompting, tool calling, and output parsing. You configure this hook for context injection, message rewriting, and managing conversation-level logic.
- modelHook (API-Level): This layer intercepts every individual model API call. It is the designated insertion point for retries, model fallbacks, request caching, and latency logging.
- toolHook (Execution-Level): This layer intercepts specific tool executions before they resolve. You use this hook to implement human-in-the-loop (HITL) approvals, enforce sandboxing, generate audit logs, and validate per-tool inputs and outputs.
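To make the "onion-style" stacking concrete, here is a minimal, self-contained sketch of how three hooks could wrap a core call, with the outermost hook registered first. The `Hook` and `compose` helpers are illustrative stand-ins for demonstration, not Genkit's actual API:

```typescript
// Hypothetical sketch of onion-style hook stacking; the real
// Genkit Middleware API may differ.
type Handler = (req: string) => Promise<string>;
type Hook = (req: string, next: Handler) => Promise<string>;

// Compose hooks so the first registered hook wraps all the others.
function compose(hooks: Hook[], core: Handler): Handler {
  return hooks.reduceRight<Handler>(
    (next, hook) => (req) => hook(req, next),
    core,
  );
}

const order: string[] = [];

const generateHook: Hook = async (req, next) => {
  order.push("generate:before"); // e.g. context injection
  const res = await next(req);
  order.push("generate:after");  // e.g. output parsing
  return res;
};

const modelHook: Hook = async (req, next) => {
  order.push("model:before");    // e.g. caching, retries
  const res = await next(req);
  order.push("model:after");
  return res;
};

const toolHook: Hook = async (req, next) => {
  order.push("tool:before");     // e.g. approvals, sandboxing
  const res = await next(req);
  order.push("tool:after");
  return res;
};

const run = compose([generateHook, modelHook, toolHook], async (req) => {
  order.push("core");            // the underlying call
  return `handled:${req}`;
});

run("ping").then((res) => {
  console.log(res);              // handled:ping
  console.log(order.join(" > "));
  // generate:before > model:before > tool:before > core
  //   > tool:after > model:after > generate:after
});
```

The layering explains why a request-level concern like retrying belongs in the inner model hook: it can fire repeatedly without re-triggering the outer generate-level logic.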
Pre-Built Middleware Solutions
If you are working in JavaScript or TypeScript, Google provides the @genkit-ai/middleware package. It includes several official interceptors that address common reliability and safety requirements when building AI agents.
| Middleware | Target Layer | Primary Function |
|---|---|---|
| Retry | model | Automatically handles transient model errors (like RESOURCE_EXHAUSTED or UNAVAILABLE) using exponential backoff with jitter. |
| Fallback | model | Switches to a secondary model (e.g., from Gemini 3 Pro to Gemini 3 Flash) if the primary model fails or hits hard quota limits. |
| Tool Approval | tool | Pauses execution to require human intervention before destructive or sensitive tool calls are made. |
| FileSystem | tool | Grants models restricted access to a local directory for file manipulation (list, read, write, search-and-replace) with built-in safety boundaries. |
| Skills | generate | Scans for SKILL.md files to inject specialized system instructions and provides a use_skill tool for on-demand retrieval. |
For configuration syntax, parameter references, and language-specific setup instructions, refer directly to the Genkit Middleware documentation.
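As an illustration of what the Retry middleware does at the model layer, here is a standalone sketch of exponential backoff with full jitter. The `withRetry` helper and the retryable-code set are assumptions made for demonstration, not the package's real configuration surface:

```typescript
// Illustrative retry logic; error codes here stand in for the
// transient statuses the real middleware targets.
const RETRYABLE = new Set(["RESOURCE_EXHAUSTED", "UNAVAILABLE"]);

async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastErr = err;
      const code = (err as Error).message;
      // Non-retryable errors and the final attempt propagate immediately.
      if (!RETRYABLE.has(code) || attempt === maxAttempts - 1) throw err;
      // Full jitter: sleep a random fraction of the exponential cap.
      const cap = baseDelayMs * 2 ** attempt;
      await new Promise((r) => setTimeout(r, Math.random() * cap));
    }
  }
  throw lastErr;
}

// Usage: a fake model call that fails twice, then succeeds.
let calls = 0;
const flaky = async () => {
  calls++;
  if (calls < 3) throw new Error("UNAVAILABLE");
  return "model response";
};

withRetry(flaky).then((res) => console.log(res, `after ${calls} calls`));
// → model response after 3 calls
```

Jitter matters here because many agent instances retrying on the same schedule would otherwise hammer the API in synchronized waves.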
Implementing Human-in-the-Loop Safeguards
Moving an application from a stateless chatbot to a production-ready system capable of executing complex workflows requires strict operational boundaries. The Tool Approval middleware serves this exact purpose.
When you attach the Tool Approval interceptor to a destructive tool—such as a database deletion routine or a financial transaction API—the Genkit execution engine pauses right before the tool executes. The runtime waits for an external approval signal. If the human operator rejects the payload, the middleware throws an exception back to the model, allowing the agent to update its execution plan or halt entirely.
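The approval flow described above can be sketched as a wrapper around a tool function. Everything here (`requireApproval`, `ToolRejectedError`, the auto-denying approver) is hypothetical scaffolding to show the pattern, not the middleware's actual interface:

```typescript
// Hypothetical HITL wrapper: pause before the tool runs, await an
// external verdict, and surface rejection as an error the model sees.
type Tool<I, O> = (input: I) => Promise<O>;
type Approver<I> = (toolName: string, input: I) => Promise<boolean>;

class ToolRejectedError extends Error {
  constructor(toolName: string) {
    super(`Human operator rejected call to "${toolName}"`);
  }
}

function requireApproval<I, O>(
  toolName: string,
  tool: Tool<I, O>,
  approve: Approver<I>,
): Tool<I, O> {
  return async (input) => {
    // Execution pauses here until the approval signal arrives.
    const approved = await approve(toolName, input);
    if (!approved) {
      // The rejection flows back to the model, which can revise
      // its plan or halt entirely.
      throw new ToolRejectedError(toolName);
    }
    return tool(input);
  };
}

// Usage: guard a destructive tool with an auto-denying approver.
const deleteRows: Tool<{ table: string }, string> = async ({ table }) =>
  `deleted all rows in ${table}`;

const guarded = requireApproval("deleteRows", deleteRows, async (name, input) => {
  console.log(`Approve ${name}(${JSON.stringify(input)})?`);
  return false; // a real approver would await operator input
});

guarded({ table: "users" }).catch((err) => console.log(err.message));
// → Human operator rejected call to "deleteRows"
```

Throwing on rejection, rather than silently skipping the tool, is the key design choice: the model receives an explicit signal it can reason about instead of an inexplicably missing result.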
Tracing and Debugging Configuration
Because middleware relies on an “onion-style” stacking order, execution flow can become difficult to trace as you add more interceptors. Genkit provides visibility into this stack via the Genkit Dev UI.
When you register middleware in your application, the Dev UI automatically maps the configured stack. You can inspect the execution trace through each hook layer, verifying exactly where a request was modified, cached, or rejected. This visual tracing is especially useful when evaluating and testing AI agents that rely on nested fallbacks and multi-step retries.
Start by installing the middleware package for your target language and testing the Tool Approval interceptor on your most sensitive tools to establish a baseline of security.