Cloudflare AI Gateway Ships Granular Dollar Spend Limits
The new spend limits feature lets organizations set hard dollar caps across multiple LLM providers and automatically route traffic to cheaper fallback models.
Cloudflare has introduced real-time spend limits for AI Gateway, giving developers the ability to set hard dollar-based budget caps across multiple model providers. The release shifts the platform’s focus from traditional volume-based rate limiting to direct fiscal control over token consumption.
Organizations can now track cumulative spending across OpenAI, Anthropic, and Google from a single control plane. The system calculates the cost of each request in real time based on current model pricing, updating a live analytics dashboard filtered by provider and custom metadata.
Granular Budget Configurations
Administrators can scope budgets using specific dimensions rather than applying a single global cap. Limits can target high-cost models like GPT-5.5 or Claude 4.7, restrict total expenditure with a single provider, or apply to custom attributes like specific development environments and applications.
The feature supports both fixed and rolling time windows. Fixed windows can reset on the first of the month, every Monday, or at midnight. Rolling windows support daily, weekly, or monthly tracking intervals.
Enforcement and Dynamic Routing
When a budget threshold is reached, AI Gateway defaults to blocking further requests. Developers building applications with strict uptime requirements can configure dynamic routes instead.
Dynamic routing automatically shifts traffic to cheaper fallback models once a specific budget limit is hit. This ensures service continuity for users while protecting the organization’s budget. It provides an automated mechanism to reduce LLM API costs in production without writing custom fallback logic into the application layer.
Identity-Driven Controls
The spend limits feature natively integrates with Cloudflare Access to enforce budgets at the individual user level. When an employee authenticates, the system extracts their identity from the JSON Web Token.
This identity is attached as metadata to every AI Gateway request. It allows companies to track exactly who generated specific token costs and automatically halt usage for compromised credentials or runaway scripts. This mitigates the blast radius of leaked API keys and provides an infrastructure layer to secure AI agents operating in corporate environments.
Availability and Billing
The feature is currently in open beta and available to all AI Gateway users across Free, Pro, and Enterprise tiers. Limits are configured via the Cloudflare dashboard or the AI Gateway API. The system supports both Unified Billing and Bring Your Own Key requests for models with public pricing structures.
If you manage multi-provider AI workloads, evaluate migrating your application-side budget logic to the gateway layer to centralize your cost enforcement and automate fallback routing.
Get Insanely Good at AI
The book for developers who want to understand how AI actually works. LLMs, prompt engineering, RAG, AI agents, and production systems.
Keep Reading
How to Provision Google Colab GPUs From the Command Line
Learn how to install the Google Colab CLI, provision high-performance remote GPUs from your local terminal, and execute headless machine learning workflows.
Claude Platform Goes GA on AWS With Native API Parity
Anthropic has launched the Claude Platform on AWS in general availability, granting developers native API parity directly within their AWS environments.
Amazon Bedrock Gains GPT-5.5 and Codex in $50B OpenAI Deal
Following the end of Microsoft's exclusive distribution rights, Amazon Web Services has introduced OpenAI's GPT-5.5 and Codex models to the Bedrock platform.
CVE-2026-42208: Pre-Auth SQLi Actively Exploited in LiteLLM
Threat actors are exploiting a critical pre-authentication SQL injection in the LiteLLM proxy to exfiltrate master API keys and cloud provider credentials.
Cohere Acquires Aleph Alpha in $20B Sovereign AI Merger
Cohere is acquiring German AI firm Aleph Alpha to create a $20 billion transatlantic entity focused on sovereign AI for regulated European enterprises.