AI Engineering · 3 min read

Cloudflare Reinvents Cache to Shield Sites From AI Bots

With AI bot traffic set to surpass human usage by 2027, Cloudflare is deploying a dual-layer cache architecture to protect performance and origin servers.

On April 2, 2026, Cloudflare detailed a fundamental architectural shift in its edge network to manage the exponential growth of agentic traffic. In its research on rethinking cache for the AI era, the company reports that it now handles over 10 billion automated requests per week. For developers managing high-traffic domains or building applications that rely on scraping, the way edge networks serve content is structurally changing.

The Traffic Asymmetry

Automated requests now account for 32% of all traffic across Cloudflare’s infrastructure. Within that segment, AI crawlers generate 80% of self-identified bot traffic. Traditional caching relies on hit rates driven by human behavior, where many users request the same popular assets. Web browsers also use local session management and side-caching to reduce server round-trips.

Machine clients bypass these mechanisms. They execute sequential, parallel scans of rarely visited pages. Because every request hits the CDN or the origin directly, these patterns actively evict popular human-facing content from edge nodes. The load disparity is massive. A human might visit five pages to complete a task, while autonomous AI agents request up to 5,000 sites to execute the same logic.
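The eviction effect described above can be demonstrated with a minimal LRU cache simulation. This is an illustrative sketch, not Cloudflare's actual cache implementation: repeated human-like requests for a small set of popular pages yield a high hit rate, while a single sequential bot scan of unique URLs flushes every popular asset out of the cache.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache, just enough to illustrate eviction behavior."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)  # evict least recently used
            self.store[key] = True

cache = LRUCache(capacity=100)

# Human-like traffic: repeated requests for a small set of popular pages.
for _ in range(10):
    for page in range(50):
        cache.get(f"/popular/{page}")
human_hit_rate = cache.hits / (cache.hits + cache.misses)

# Bot-like traffic: one sequential scan over thousands of unique URLs.
for page in range(5000):
    cache.get(f"/archive/{page}")

# The scan evicted every popular page from the cache.
popular_still_cached = sum(1 for p in range(50) if f"/popular/{p}" in cache.store)
print(human_hit_rate, popular_still_cached)  # 0.9 0
```

After the scan, origin servers must re-serve even the most popular human-facing content, which is exactly the load disparity the article describes.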

This scaling factor has tangible infrastructure costs. Wikimedia recently recorded a 50% surge in multimedia bandwidth driven entirely by bulk scraping. Platforms like Fedora and Diaspora experienced severe performance degradation for human users due to these parallel loads. Cloudflare projects that total AI bot traffic will eclipse human web usage by 2027.

Dual-Layer Cache Architecture

To protect origin servers without breaking agentic workflows, Cloudflare partnered with ETH Zurich to design a multi-tiered cache system. The architecture uses real-time machine learning algorithms to identify automated requests and route them away from standard delivery nodes.

The human tier remains on standard CDN Points of Presence (PoPs). This layer is strictly optimized for responsiveness and high cache hit rates. The AI tier operates as a separate infrastructure layer built for raw capacity. These specific caches tolerate higher latency, which is acceptable for asynchronous training data collection or retrieval-augmented generation pipelines. The network categorizes requests dynamically, routing workloads based on their identified purpose.

Industry analysts at WWT recently noted that this shift toward specialized high-performance architectures is necessary to handle agentic data mobility. Competitors like Bifrost are already attempting to capture this traffic by offering low-latency alternative networks that avoid managed proxy overhead.

New Edge Controls

The architectural split introduces specific infrastructure controls for site operators. Cloudflare implemented a specialized toolkit to manage automated access directly at the edge. The system includes a Pay Per Crawl feature integrated with Stripe, allowing domains to charge AI companies directly for data scraping.

Content delivery is also adapting to machine reading. Operators can deploy Markdown for Agents, serving a stripped-down, reduced-bandwidth version of a site when an automated crawler is detected. Administrators manage these policies through AI Crawl Control, which provides analytics and one-click blocking capabilities. This integrates with the existing AI Gateway for unified monitoring of AI applications and LLM provider rate-limiting.
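Serving a machine-readable representation amounts to content negotiation on the detected client. The sketch below is an assumption about how such a policy could look at the application layer, not Cloudflare's implementation; the crawler token list is illustrative.

```python
# Illustrative crawler tokens; real deployments maintain curated lists.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot")

def select_representation(user_agent: str, html_body: str, markdown_body: str):
    """Hypothetical negotiation: serve a stripped-down Markdown
    representation to self-identified AI crawlers, full HTML to browsers.
    Returns a (content_type, body) pair."""
    if any(tok in user_agent for tok in AI_CRAWLER_TOKENS):
        return ("text/markdown", markdown_body)  # reduced-bandwidth version
    return ("text/html", html_body)

# A crawler gets Markdown; a browser gets the full page.
ctype, body = select_representation(
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "<html><body><h1>Docs</h1>...</body></html>",
    "# Docs\n...",
)
print(ctype)  # text/markdown
```

The bandwidth savings compound at scale: Markdown strips markup, navigation, and styling that machine readers discard anyway.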

If you operate heavily scraped domains or maintain web-crawling infrastructure, traditional cache hit rates will no longer reflect your actual origin load. Audit your server metrics specifically for sequential scans and implement explicit machine-readable endpoints to avoid aggressive throttling at the edge.
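One way to audit logs for sequential scans is to flag clients whose request paths are almost all unique, which is the signature of a crawl rather than human browsing. This is a minimal sketch assuming access logs reduced to (client, path) pairs; the thresholds are illustrative and should be tuned against your own traffic.

```python
from collections import defaultdict

def find_scan_clients(log_entries, min_requests=100, max_repeat_ratio=0.05):
    """Flag clients whose paths are nearly all unique: a high-volume
    client that almost never revisits a URL is likely scanning."""
    paths_by_client = defaultdict(list)
    for client_ip, path in log_entries:
        paths_by_client[client_ip].append(path)

    flagged = []
    for ip, paths in paths_by_client.items():
        if len(paths) < min_requests:
            continue  # too little traffic to judge
        repeat_ratio = 1 - len(set(paths)) / len(paths)
        if repeat_ratio <= max_repeat_ratio:
            flagged.append(ip)
    return flagged

# A scanner requesting 200 unique pages vs. a human revisiting one page.
log = [("10.0.0.1", f"/page/{i}") for i in range(200)]
log += [("192.168.1.5", "/home") for _ in range(30)]
print(find_scan_clients(log))  # ['10.0.0.1']
```

Running this kind of analysis before the edge starts throttling lets you decide which automated clients to monetize, redirect to machine-readable endpoints, or block outright.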
