Hugging Face Reports Chinese Open Models Overtook U.S. on Hub as Qwen and DeepSeek Drive Derivative Boom
Hugging Face's Spring 2026 report says Chinese open models now lead Hub adoption, with Qwen and DeepSeek powering a surge in derivatives.
Hugging Face’s March 17, 2026 ecosystem report, State of Open Source on Hugging Face: Spring 2026, says Chinese open models have overtaken U.S. models in recent Hub adoption, with China accounting for 41% of downloads over the past year by the blog’s framing. For developers, the bigger shift is structural: the Hub is increasingly driven by DeepSeek, Qwen, and a fast-growing layer of quantized, fine-tuned, and repackaged derivatives rather than by the original base-model labs alone.
Hugging Face ties that snapshot to the platform’s 2025 scale: 11 million users, more than 2 million public models, and over 500,000 public datasets. The post also points to heavy concentration: the top 200 models account for 49.6% of all downloads, while roughly half of all models have fewer than 200 downloads.
Download Share and Geographic Shift
The headline claim is that China has moved ahead of the U.S. in recent open-model adoption on the Hub. Hugging Face’s linked analysis is grounded in the paper Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem, which analyzed 851,000 models, 200+ attributes per model, and 2.2 billion downloads from June 2020 to August 2025.
Using the paper’s recent-year methodology, Chinese developers captured 17.1% of recent downloads versus 15.8% for U.S. developers, with International/Online at 23.8%.
| Origin | Recent download share |
|---|---|
| International/Online | 23.8% |
| China | 17.1% |
| USA | 15.8% |
For teams selecting open models, this matters because the most active ecosystems on Hugging Face are no longer clustered around the same set of Western labs that dominated the earlier open-weight cycle.
DeepSeek and Qwen as Ecosystem Drivers
Hugging Face points back to the January 2025 release of DeepSeek R1 as the turning point that accelerated Chinese open releases. The post says Baidu went from zero Hub releases in 2024 to more than 100 in 2025, while ByteDance and Tencent increased releases eight- to nine-fold.
The underlying paper shows how concentrated that shift became among a few developers.
| Developer | Recent download share |
|---|---|
| lmstudio-community | 16.4% |
| deepseek-ai | 9.6% |
| comfy | 5.4% |
| Qwen | 4.6% |
DeepSeek and Qwen matter as model families, but the more durable signal is downstream reuse. Hugging Face says Alibaba has more derivative models than Google and Meta combined, and that the Qwen family accounts for more than 113,000 derivative models. Counting every model tagged Qwen pushes the total past 200,000.
If you build on open models, this shifts evaluation work away from brand-level comparisons and toward lineage-level comparisons. Two artifacts derived from the same upstream family can differ materially in quantization, prompting behavior, hardware fit, and license terms. Your model registry and eval process need to capture that.
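As a minimal sketch of what lineage-level tracking can look like, the registry record below captures the attributes that commonly differ between derivatives of the same upstream family. The field names and repo IDs are illustrative assumptions, not an official Hugging Face schema:

```python
from dataclasses import dataclass, field

# Hypothetical registry record: field names are illustrative,
# not an official Hugging Face schema.
@dataclass
class ModelRecord:
    repo_id: str
    base_model: str        # upstream family the artifact derives from
    quantization: str      # e.g. "Q4_K_M", "fp16", "none"
    license: str
    eval_scores: dict = field(default_factory=dict)

def lineage_diff(a: ModelRecord, b: ModelRecord) -> dict:
    """Compare two artifacts from the same upstream family and
    surface the attributes that actually differ."""
    if a.base_model != b.base_model:
        raise ValueError("records are not from the same lineage")
    fields = ("quantization", "license")
    return {f: (getattr(a, f), getattr(b, f))
            for f in fields if getattr(a, f) != getattr(b, f)}

official = ModelRecord("Qwen/Qwen2.5-7B-Instruct", "Qwen2.5-7B",
                       quantization="none", license="apache-2.0")
community = ModelRecord("lmstudio-community/Qwen2.5-7B-Instruct-GGUF",
                        "Qwen2.5-7B", quantization="Q4_K_M",
                        license="apache-2.0")

print(lineage_diff(official, community))
# → {'quantization': ('none', 'Q4_K_M')}
```

Running evals per record, rather than per brand, makes a quantized community build a first-class candidate next to the publisher's original.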
The Intermediary Layer Is Now Core Infrastructure
The paper describes an emergent developer intermediary layer made up of groups that quantize, merge, fine-tune, package, and redistribute models. Repositories such as lmstudio-community, comfy, and mlx-community are part of that pattern.
This is where deployment strategy changes. If you run local or edge inference, the winning model for your workload may come from a community repackager rather than the original publisher. Quantized variants, adapter merges, and hardware-specific builds now have enough download share to shape the market directly. For teams running local agents or desktop workflows, this connects closely to the practical tradeoffs in How to Run LLMs Locally on Your Machine.
The same dynamic affects retrieval and adaptation choices. If your use case needs domain behavior more than broad generality, the decision often lands between derivative fine-tunes and retrieval pipelines, which is the same boundary covered in Fine-Tuning vs RAG: When to Use Each Approach and What Is RAG? Retrieval-Augmented Generation Explained.
Openness Is Expanding, Transparency Is Falling
The Hugging Face post presents growth in open ecosystems, but the linked paper shows that openness is weakening under a stricter definition. The share of models disclosing training-data information fell from 79.3% in 2022 to 39% in 2025. The paper also says open-weight models surpassed truly open-source models in 2025, using disclosure-based criteria aligned with the OSI framing.
| Transparency metric | 2022 | 2025 |
|---|---|---|
| Models with training-data disclosure | 79.3% | 39.0% |
For developers, this affects procurement, compliance, and reproducibility. If your stack depends on auditable provenance, data governance, or repeatable fine-tuning, “available on the Hub” is no longer enough as a selection filter. You need metadata checks alongside benchmark checks, similar to the broader discipline of How to Evaluate AI Output (LLM-as-Judge Explained), but applied to model sourcing and documentation.
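A metadata gate of that kind can be very small. The sketch below checks a model card's metadata (as a dict) for disclosure fields before a model is admitted to evaluation; the field names follow common Hub model-card conventions, but the required set is an assumption to tune to your own compliance needs:

```python
# Illustrative transparency gate: flags missing disclosure fields in
# model-card metadata before a model enters the eval pipeline.
# REQUIRED_FIELDS is an assumption, not a Hub-mandated standard.
REQUIRED_FIELDS = ("license", "datasets", "base_model")

def transparency_gaps(card_metadata: dict) -> list:
    """Return the disclosure fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not card_metadata.get(f)]

def admit(card_metadata: dict) -> bool:
    """Admit only models whose provenance metadata is complete."""
    return not transparency_gaps(card_metadata)

# A derivative that documents its lineage and training data:
documented = {
    "license": "apache-2.0",
    "base_model": "Qwen/Qwen2.5-7B",
    "datasets": ["example-org/domain-corpus"],  # hypothetical dataset id
}
# A repackage with no training-data or lineage disclosure:
undocumented = {"license": "other"}

print(transparency_gaps(documented))    # → []
print(transparency_gaps(undocumented))  # → ['datasets', 'base_model']
```

In practice you would read this metadata from the card's YAML front matter (or the Hub API) rather than hand-built dicts, but the gate itself stays this simple.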
Model Supply Is Getting Larger and More Specialized
The paper also quantifies the technical shape of this market. Average model size increased 17× from 2020 to 2025. Multimodal generation rose 3.4×, quantization increased 5×, and mixture-of-experts usage increased 7×.
Those numbers point to a more fragmented open-model landscape. Families like Qwen and DeepSeek supply the gravitational pull, but deployment-ready variants are increasingly optimized for narrow hardware and task constraints. If you maintain agent systems or coding workflows, that raises the value of explicit context, routing, and tool controls, which aligns with the engineering patterns in Context Engineering: The Most Important AI Skill in 2026.
If you rely on Hugging Face as your primary open-model discovery layer, update your selection process now: evaluate derivatives as first-class candidates, verify transparency metadata before adoption, and benchmark Chinese model families and community repackages against your existing defaults instead of treating them as edge options.