Chrome Brings Cross-Origin Model Caching to Transformers.js

On June 23, 2026, Hugging Face detailed an experimental integration of the Cross-Origin Storage API into Transformers.js. Co-authored with Google’s Chrome team, the update targets the redundant bandwidth and disk usage caused by browser cache isolation. With a cache shared across origins, web-based AI applications can skip multi-megabyte downloads of models and WebAssembly runtimes that another site has already fetched.

The Cache Isolation Bottleneck

Modern web browsers partition HTTP caches by origin to protect user privacy. If two different web applications both use Transformers.js to load the same model, such as Xenova/whisper-tiny.en, the browser is forced to download and store the model weights separately for each site.

This behavior creates severe inefficiencies for developers trying to run AI models locally in the browser. A popular model can end up stored several times over on the user’s disk, once for each site that loads it. Users also endure redundant network requests for multi-megabyte WebAssembly (Wasm) runtime files every time they visit a new AI-powered website.
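The duplication described above can be illustrated with a toy model. This is only a sketch of the idea, not how any browser implements its cache: the `PartitionedCache` class, the example origins, and the model URL are all hypothetical.

```javascript
// Toy model: a cache keyed by (origin, URL), mirroring the partitioning described above.
class PartitionedCache {
  constructor() {
    this.entries = new Map();
    this.downloads = 0;
  }

  // Return the cached copy for this origin, downloading it on a miss.
  fetch(origin, url, download) {
    const key = `${origin}|${url}`;
    if (!this.entries.has(key)) {
      this.entries.set(key, download(url));
      this.downloads += 1;
    }
    return this.entries.get(key);
  }
}

// Hypothetical URL and origins, for illustration only.
const modelUrl = 'https://cdn.example/whisper-tiny.en/model.onnx';
const download = (url) => `bytes of ${url}`;

const cache = new PartitionedCache();
cache.fetch('https://site-a.example', modelUrl, download);
cache.fetch('https://site-b.example', modelUrl, download);
console.log(cache.downloads); // 2: the same file was downloaded once per origin
```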

Global Storage via Content Hashing

The proposed Cross-Origin Storage (COS) API introduces a global storage area that operates independent of specific domains. Files are stored and retrieved based on their content hash rather than their URL.

Once a site caches a core dependency like ort-wasm-simd-threaded.asyncify.wasm (4,733 kB), any later site requesting the same hash can read it directly from COS, with no network request at all. Because files are addressed by a cryptographic hash such as SHA-256, content is implicitly verified on write, which prevents applications from loading poisoned or corrupted models.

| Feature | Standard HTTP Cache | Cross-Origin Storage API |
| --- | --- | --- |
| Scope | Partitioned by origin | Global across origins |
| Identification | URL | Content hash (SHA-256) |
| Verification | Certificate trust | Implicit hash verification |
| Redundancy | High (duplicate downloads) | Low (shared resources) |

Privacy Controls and Implementation

Global caching introduces the risk of cache probing, where a malicious site checks whether specific models are cached in order to infer a user’s browsing history. The COS API mitigates this with an origins field: developers can restrict a resource to a specific allowlist of origins, or set the field to '*' for universally public models.
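The visibility rule can be sketched as a simple check. This models the behavior described above and is not the COS API itself: the function `isVisibleTo`, the shape of the `entry` object, and the example origins are assumptions made for illustration.

```javascript
// Illustrative model only, not the Cross-Origin Storage API.
// Assumes `entry.origins` is either the string '*' or an array of origin strings.
function isVisibleTo(entry, requestingOrigin) {
  if (entry.origins === '*') {
    return true; // universally public resource
  }
  return Array.isArray(entry.origins) && entry.origins.includes(requestingOrigin);
}

// Hypothetical entries and origins.
const publicModel = { origins: '*' };
const privateModel = { origins: ['https://app.example', 'https://docs.example'] };

console.log(isVisibleTo(publicModel, 'https://other.example'));  // true
console.log(isVisibleTo(privateModel, 'https://app.example'));   // true
console.log(isVisibleTo(privateModel, 'https://other.example')); // false
```

Under this model, a site outside the allowlist gets the same answer whether or not the restricted resource is cached, which is what defeats probing.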

Hugging Face has implemented this capability in Transformers.js v4.2.0. Developers can activate the global cache check by setting a single library flag:

```javascript
import { env } from '@huggingface/transformers';
env.experimental_useCrossOriginStorage = true; // opt in to the experimental COS lookup
```

When this flag is enabled, the library checks the COS for existing model weights before falling back to the standard Cache API or initiating a network fetch.
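The lookup order described above can be sketched as a generic fallback chain. This is not the library's internal code: `loadWithFallback`, the source names, and the Map-backed stubs are hypothetical stand-ins for the COS, the Cache API, and the network.

```javascript
// Try each source in order and return the first hit, along with where it came from.
// Each source is { name, get }, where get(key) resolves to the value or undefined.
async function loadWithFallback(key, sources) {
  for (const source of sources) {
    const value = await source.get(key);
    if (value !== undefined) {
      return { from: source.name, value };
    }
  }
  throw new Error(`No source could provide ${key}`);
}

// Stub sources backed by Maps (hypothetical stand-ins for the real storage layers).
const fromMap = (name, map) => ({ name, get: async (key) => map.get(key) });
const exampleSources = [
  fromMap('cross-origin-storage', new Map()),
  fromMap('cache-api', new Map([['model.onnx', 'cached bytes']])),
  fromMap('network', new Map([['model.onnx', 'fresh bytes']])),
];

loadWithFallback('model.onnx', exampleSources).then((hit) => console.log(hit.from));
// logs cache-api
```

Ordering the sources from cheapest to most expensive means the network is only touched when both local layers miss.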

Browser Ecosystem Context

This experiment builds on the foundation of Transformers.js v4.0.0, released in April 2026. That major update introduced a WebGPU runtime rewritten in C++ and enabled support for models exceeding 8 billion parameters, such as GPT-OSS 20B. While Hugging Face recently brought Transformers.js v4 to Chrome extensions for isolated local processing, the COS API addresses resource sharing across the open web.

If you build heavy web-based AI applications, you can test this architecture today. The API is currently available in Chrome via an experimental flag and requires a dedicated Cross-Origin Storage Chrome extension for local development environments.
