Chrome Brings Cross-Origin Model Caching to Transformers.js
Hugging Face and Google Chrome are testing a Cross-Origin Storage API in Transformers.js to cache large AI models globally across different web domains.
On June 23, 2026, Hugging Face detailed an experimental integration of the Cross-Origin Storage API into Transformers.js. Co-authored with Google’s Chrome team, the update targets the redundant bandwidth and disk usage caused by standard browser cache isolation. With a cache shared across origins, web-based AI applications can skip multi-megabyte downloads of models and WebAssembly runtimes that another site has already fetched.
The Cache Isolation Bottleneck
Modern web browsers partition HTTP caches by origin to protect user privacy. If two different web applications both use Transformers.js to load the same model, such as Xenova/whisper-tiny.en, the browser is forced to download and store the model weights separately for each site.
This behavior creates severe inefficiencies for developers trying to run AI models locally in the browser. A popular model is stored once for every site that uses it, so it can occupy several times its actual size on the user’s disk. Users also endure redundant network requests for multi-megabyte WebAssembly (Wasm) runtime files every time they visit a new AI-powered website.
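The duplication can be illustrated with a toy model of a partitioned cache. This is only a sketch: the `PartitionedCache` class, the example origins, and the CDN URL are hypothetical, and real browsers use more elaborate partition keys than a single origin string.

```javascript
// Hypothetical model of an origin-partitioned cache: entries are keyed by
// (origin, URL), so the same file fetched by two sites is stored twice.
class PartitionedCache {
  constructor() {
    this.entries = new Map(); // key: `${origin} ${url}` -> size in bytes
    this.bytesDownloaded = 0;
  }

  fetch(origin, url, sizeBytes) {
    const key = `${origin} ${url}`;
    if (!this.entries.has(key)) {
      // Cache miss for this origin, even if another origin already holds the file.
      this.entries.set(key, sizeBytes);
      this.bytesDownloaded += sizeBytes;
    }
    return this.entries.get(key);
  }

  get bytesOnDisk() {
    let total = 0;
    for (const size of this.entries.values()) total += size;
    return total;
  }
}

const cache = new PartitionedCache();
const wasmUrl = "https://cdn.example/ort-wasm-simd-threaded.asyncify.wasm"; // illustrative URL
const wasmSize = 4_733_000; // roughly the 4,733 kB runtime file mentioned in this article

const origins = ["https://app-a.example", "https://app-b.example", "https://app-c.example"];
for (const origin of origins) {
  cache.fetch(origin, wasmUrl, wasmSize);
}

console.log(cache.bytesOnDisk);     // 14199000: three copies of one file
console.log(cache.bytesDownloaded); // 14199000
```

Three sites loading one runtime file produce three downloads and three stored copies, which is the overhead the proposal aims to remove.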
Global Storage via Content Hashing
The proposed Cross-Origin Storage (COS) API introduces a global storage area that operates independently of specific domains. Files are stored and retrieved based on their content hash rather than their URL.
Once a site caches a core dependency like ort-wasm-simd-threaded.asyncify.wasm (4,733 kB), any subsequent site requesting the same hash can pull it directly from COS, which eliminates the network request entirely. Because files are identified by a cryptographic hash such as SHA-256, their content is implicitly verified on write, which prevents applications from loading poisoned or corrupted models.
| Feature | Standard HTTP Cache | Cross-Origin Storage API |
|---|---|---|
| Scope | Partitioned by origin | Global across origins |
| Identification | URL | Content Hash (SHA-256) |
| Verification | Certificate trust | Implicit hash verification |
| Redundancy | High (duplicate downloads) | Low (shared resources) |
Privacy Controls and Implementation
Global caching introduces the risk of cache probing, where a malicious site checks whether specific models are cached in order to infer a user’s browsing history. The COS API mitigates this with an origins field. Developers can restrict a resource to a specific allowlist of origins, or set the field to '*' for universally public models.
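A minimal sketch of how such a visibility check might behave is shown below. The entry shape and the `isVisibleTo` helper are assumptions based only on the description above ('*' or a list of origins); the actual API may represent and enforce this differently.

```javascript
// Hypothetical visibility check. The shape of the `origins` field follows the
// article's description ('*' or a list of origins); the real API may differ.
function isVisibleTo(entry, requestingOrigin) {
  if (entry.origins === "*") return true;
  return Array.isArray(entry.origins) && entry.origins.includes(requestingOrigin);
}

// Placeholder hashes; real entries would carry actual SHA-256 digests.
const publicModel = { hash: "<sha256-of-public-model>", origins: "*" };
const privateModel = {
  hash: "<sha256-of-private-model>",
  origins: ["https://app.example", "https://admin.example"],
};

console.log(isVisibleTo(publicModel, "https://unrelated.example"));  // true
console.log(isVisibleTo(privateModel, "https://app.example"));       // true
console.log(isVisibleTo(privateModel, "https://unrelated.example")); // false
```

Under this model, an origin outside the list would presumably see the same result as if the file had never been cached, which is what makes probing uninformative.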
Hugging Face has implemented this capability in Transformers.js v4.2.0. Developers can activate the global cache check by setting a single library flag:
```javascript
import { env } from "@huggingface/transformers";
env.experimental_useCrossOriginStorage = true;
```
When this flag is enabled, the library checks the COS for existing model weights before falling back to the standard Cache API or initiating a network fetch.
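That lookup order amounts to a fallback chain. The sketch below models it with plain synchronous functions and Maps; `resolveModelFile` and the three stand-in sources are hypothetical and are not Transformers.js internals, where the equivalent steps would be asynchronous.

```javascript
// Hypothetical lookup chain mirroring the order described above.
// Each source is a plain function here; none of these are Transformers.js internals.
function resolveModelFile(key, sources) {
  for (const { name, lookup } of sources) {
    const data = lookup(key);
    if (data !== undefined) return { from: name, data };
  }
  throw new Error(`no source could provide ${key}`);
}

const crossOriginStore = new Map();                          // stand-in for COS
const cacheApi = new Map([["model.onnx", "cached-bytes"]]);  // stand-in for the Cache API
const network = (key) => `downloaded:${key}`;                // stand-in for a network fetch

const sources = [
  { name: "cross-origin-storage", lookup: (k) => crossOriginStore.get(k) },
  { name: "cache-api", lookup: (k) => cacheApi.get(k) },
  { name: "network", lookup: network },
];

console.log(resolveModelFile("model.onnx", sources).from);     // cache-api
console.log(resolveModelFile("tokenizer.json", sources).from); // network

// Once another site has populated the shared store, it wins over both fallbacks.
crossOriginStore.set("model.onnx", "shared-bytes");
console.log(resolveModelFile("model.onnx", sources).from);     // cross-origin-storage
```

Ordering the sources from cheapest to most expensive means the network is only touched when neither the shared store nor the per-origin cache holds the file.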
Browser Ecosystem Context
This experiment builds on Transformers.js v4.0.0, released in April 2026. That major update introduced a WebGPU runtime rewritten in C++ and enabled support for models exceeding 8 billion parameters, such as GPT-OSS 20B. While Hugging Face recently brought Transformers.js v4 to Chrome extensions for isolated local processing, the COS API addresses resource sharing across the open web.
If you build heavy web-based AI applications, you can test this architecture today. The API is currently available in Chrome via an experimental flag and requires a dedicated Cross-Origin Storage Chrome extension for local development environments.
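Since the API sits behind a flag and an extension, feature detection is a sensible guard when experimenting. The sketch below assumes the proposal exposes its entry point as `navigator.crossOriginStorage`; that property name is an assumption here and should be checked against the current proposal. The helper name `supportsCrossOriginStorage` is hypothetical.

```javascript
// Assumption: the proposal exposes its entry point as `navigator.crossOriginStorage`.
// Passing the navigator object in keeps the check testable outside a browser.
function supportsCrossOriginStorage(nav) {
  return nav != null && typeof nav === "object" && "crossOriginStorage" in nav;
}

const nav = typeof navigator !== "undefined" ? navigator : undefined;
console.log(
  supportsCrossOriginStorage(nav)
    ? "COS available"
    : "COS unavailable; using regular caches"
);
```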