
How to Control Agent Tool Execution via Genkit Middleware

Learn how to use Google's new Genkit Middleware to intercept model calls, implement human-in-the-loop tool approvals, and handle transient API failures.

Google Developers AI recently released Genkit Middleware, a composable framework designed to harden applications by giving developers deterministic control over non-deterministic model outputs. You can use it to intercept the execution loop, handle transient API errors, and enforce strict execution boundaries.

As of the May 14, 2026 release, Genkit Middleware is fully available for TypeScript, Go, and Dart, with Python support currently in preview. The system supports models from major providers including Google, OpenAI, Anthropic, xAI, and DeepSeek.

The Three Hook Layers

Genkit Middleware operates by attaching hooks at three distinct layers of the Genkit generate() tool loop. This separation of concerns allows you to target specific parts of the request lifecycle without building monolithic interceptors.

  • generate Hook (High-Level): This layer wraps the entire generation loop. It encompasses prompting, tool calling, and output parsing. You configure this hook for context injection, message rewriting, and managing conversation-level logic.
  • model Hook (API-Level): This layer intercepts every individual model API call. It is the designated insertion point for retries, model fallbacks, request caching, and latency logging.
  • tool Hook (Execution-Level): This layer intercepts specific tool executions before they resolve. You use this hook to implement human-in-the-loop (HITL) approvals, enforce sandboxing, generate audit logs, and validate per-tool inputs and outputs.
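To make the layering concrete, here is a minimal sketch of what hook signatures at each layer could look like. The type names and shapes are illustrative assumptions for this article, not the actual Genkit Middleware API; consult the official documentation for the real signatures.

```typescript
// Illustrative sketch only: these types approximate the three hook layers
// described above. They are NOT the real Genkit Middleware API.
type Next<Req, Res> = (req: Req) => Promise<Res>;

interface GenerateRequest { messages: string[] }
interface GenerateResponse { text: string }
interface ModelRequest { prompt: string }
interface ModelResponse { text: string }
interface ToolRequest { name: string; input: unknown }
interface ToolResponse { output: unknown }

// generate hook: wraps the whole loop (context injection, message rewriting)
type GenerateHook = (
  req: GenerateRequest,
  next: Next<GenerateRequest, GenerateResponse>,
) => Promise<GenerateResponse>;

// model hook: wraps each model API call (retries, fallbacks, caching)
type ModelHook = (
  req: ModelRequest,
  next: Next<ModelRequest, ModelResponse>,
) => Promise<ModelResponse>;

// tool hook: wraps each tool execution (approvals, sandboxing, audit logs)
type ToolHook = (
  req: ToolRequest,
  next: Next<ToolRequest, ToolResponse>,
) => Promise<ToolResponse>;

// Example: a generate-level hook that injects extra context into the
// conversation before handing control to the rest of the loop.
const injectContext: GenerateHook = (req, next) =>
  next({ messages: ["system: be concise", ...req.messages] });
```

Each hook receives the request plus a `next` continuation, so a hook can rewrite the request, short-circuit entirely, or post-process the response.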

Pre-Built Middleware Solutions

If you are working in JavaScript or TypeScript, Google provides the @genkit-ai/middleware package. It includes several official interceptors that address common reliability and safety requirements when building AI agents.

  • Retry (model hook): Automatically handles transient model errors (such as RESOURCE_EXHAUSTED or UNAVAILABLE) using exponential backoff with jitter.
  • Fallback (model hook): Switches to a secondary model (e.g., from Gemini 3 Pro to Gemini 3 Flash) if the primary model fails or hits hard quota limits.
  • Tool Approval (tool hook): Pauses execution to require human intervention before destructive or sensitive tool calls are made.
  • FileSystem (tool hook): Grants models restricted access to a local directory for file manipulation (list, read, write, search-and-replace) with built-in safety boundaries.
  • Skills (generate hook): Scans for SKILL.md files to inject specialized system instructions and provides a use_skill tool for on-demand retrieval.
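The backoff behavior the Retry middleware provides can be sketched in a few lines. This is a generic, self-contained illustration of exponential backoff with full jitter, not the `@genkit-ai/middleware` implementation; the function name, error-code detection, and defaults are assumptions.

```typescript
// Sketch of retry-with-jitter for transient model errors. Detecting the
// error code via the message string is a simplification for illustration.
const TRANSIENT = new Set(["RESOURCE_EXHAUSTED", "UNAVAILABLE"]);

async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 250,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const code = (err as Error).message;
      // Give up on non-transient errors or once attempts are exhausted.
      if (attempt >= maxAttempts || !TRANSIENT.has(code)) throw err;
      // Full jitter: sleep a random duration up to the exponential cap,
      // so concurrent clients do not retry in lockstep.
      const cap = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, Math.random() * cap));
    }
  }
}
```

Because this logic lives at the model hook layer, it wraps every individual API call rather than the whole conversation, which is exactly where transient quota errors surface.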

For configuration syntax, parameter references, and language-specific setup instructions, refer directly to the Genkit Middleware documentation.

Implementing Human-in-the-Loop Safeguards

Moving an application from a stateless chatbot to a production-ready system capable of executing complex workflows requires strict operational boundaries. The Tool Approval middleware serves this exact purpose.

When you attach the Tool Approval interceptor to a destructive tool—such as a database deletion routine or a financial transaction API—the Genkit execution engine pauses right before the tool executes. The runtime waits for an external approval signal. If the human operator rejects the payload, the middleware throws an exception back to the model, allowing the agent to update its execution plan or halt entirely.
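The pause-and-approve flow can be modeled as a wrapper around the tool's execute function. In this sketch, `requestApproval` stands in for whatever external signal your system uses (a Slack message, a ticket, a CLI prompt); the names and shape are assumptions for illustration, not the Genkit Tool Approval API.

```typescript
// Sketch of a human-in-the-loop gate around a tool execution.
type ApprovalFn = (toolName: string, input: unknown) => Promise<boolean>;

function withApproval<I, O>(
  toolName: string,
  execute: (input: I) => Promise<O>,
  requestApproval: ApprovalFn, // hypothetical external approval signal
) {
  return async (input: I): Promise<O> => {
    // Block until a human approves or rejects the exact payload.
    const approved = await requestApproval(toolName, input);
    if (!approved) {
      // Rejection surfaces as an error the model can observe, so the
      // agent can revise its plan or halt.
      throw new Error(`Tool "${toolName}" rejected by human operator`);
    }
    return execute(input);
  };
}
```

The key design point is that rejection is not silent: throwing back into the loop gives the model a chance to propose an alternative rather than leaving the agent stuck.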

Tracing and Debugging Configuration

Because middleware relies on an “onion-style” stacking order, execution flow can become difficult to trace as you add more interceptors. Genkit provides visibility into this stack via the Genkit Dev UI.

When you register middleware in your application, the Dev UI picks up the configuration and renders the stack. You can inspect the execution trace through each hook layer, verifying exactly where a request was modified, cached, or rejected. This visual tracing is especially useful when evaluating and testing AI agents that rely on nested fallbacks and multi-step retries.
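The onion-style ordering itself is easy to demonstrate with a generic composition function: request-side logic runs outer-to-inner, response-side logic runs inner-to-outer. This is a self-contained sketch of the pattern, not Genkit's internal code.

```typescript
// Generic onion-style middleware composition. Each middleware wraps the
// next, so the first one registered is the outermost layer.
type Middleware = (
  req: string,
  next: (req: string) => Promise<string>,
) => Promise<string>;

function compose(
  middlewares: Middleware[],
  handler: (req: string) => Promise<string>,
): (req: string) => Promise<string> {
  return middlewares.reduceRight<(req: string) => Promise<string>>(
    (next, mw) => (req) => mw(req, next),
    handler,
  );
}

// Two toy layers that record when they run.
const order: string[] = [];
const logging: Middleware = async (req, next) => {
  order.push("logging:before");
  const res = await next(req);
  order.push("logging:after");
  return res;
};
const caching: Middleware = async (req, next) => {
  order.push("caching:before");
  const res = await next(req);
  order.push("caching:after");
  return res;
};

const pipeline = compose([logging, caching], async (req) => `echo:${req}`);
```

Running `pipeline("hi")` enters `logging`, then `caching`, then the handler, and unwinds in reverse, which is precisely the nesting the Dev UI trace view makes visible.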

Start by installing the middleware package for your target language and testing the Tool Approval interceptor on your most sensitive tools to establish a baseline of security.

Get Insanely Good at AI

The book for developers who want to understand how AI actually works. LLMs, prompt engineering, RAG, AI agents, and production systems.