Wirestock DaaS Platform Lands $23M for Ethical Multimodal Data
Wirestock raised $23 million to expand its data-as-a-service platform, supplying foundation model makers with ethically licensed images, video, and 3D assets.
On May 14, 2026, Wirestock announced a $23 million funding round led by Nava Ventures to scale its multimodal training data operations. Originally built as a tool for creators to distribute stock media, the company has transitioned into a data-as-a-service (DaaS) platform. Wirestock now supplies curated, legally clear training datasets to six of the largest foundation model makers.
The strategic step-up round brings total external funding to $26 million, with participation from Sandberg Bernthal Venture Partners, Formula VC, and I2BF Global Ventures. Wirestock reports a $40 million annualized revenue run rate and says it has paid out $15 million to its network of 700,000 contributors.
Curated Multimodal Datasets
The platform gives labs access to a repository of over 50 million creative assets, 10 million of which (about 20%) are explicitly licensed for artificial intelligence training. As researchers train multimodal models, sourcing of diverse, high-resolution inputs has shifted away from indiscriminate web scraping toward structured, licensed pipelines.
Wirestock segments its training data into three primary categories:
| Asset Type | Content Examples | Primary AI Application |
|---|---|---|
| Image and Video | High-resolution photography, varied aspect ratios, real-world footage | Vision-language models, generative video |
| Design Assets | UI/UX kits, vector graphics, fonts, textures | Generative UI, structural design generation |
| Gaming and 3D | 3D models, animation rigs, spatial data | Physics simulation, spatial reasoning, world models |
API Delivery and Licensing
AI labs access Wirestock through RESTful endpoints rather than static bulk downloads. The API-first delivery system includes built-in metadata filters, so engineers can query assets by resolution, genre, style, and physical characteristics. This structured metadata shortens the data curation phase of pretraining.
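A filtered request of this kind might look like the sketch below. It is illustrative only: the article does not publish Wirestock's endpoint or parameter names, so the base URL `api.example.com`, the `AssetFilter` fields, and the `build_query_url` helper are all assumptions. The code only builds the query string and makes no network call.

```python
from dataclasses import asdict, dataclass
from typing import Optional
from urllib.parse import urlencode

# Hypothetical base URL: the real endpoint is not given in the article.
BASE_URL = "https://api.example.com/v1/assets"


@dataclass
class AssetFilter:
    """Hypothetical metadata filters modeled on those the article lists."""
    asset_type: str                   # e.g. "image", "video", "3d"
    min_width: Optional[int] = None   # resolution filter, in pixels
    min_height: Optional[int] = None
    genre: Optional[str] = None
    style: Optional[str] = None


def build_query_url(filters: AssetFilter, base_url: str = BASE_URL) -> str:
    """Turn the filters that are set into a GET query string."""
    # Drop unset filters so the request carries only explicit constraints.
    params = {k: v for k, v in asdict(filters).items() if v is not None}
    # Sort keys so the same filter always yields the same URL (cache-friendly).
    return f"{base_url}?{urlencode(sorted(params.items()))}"


if __name__ == "__main__":
    uhd_nature = AssetFilter(asset_type="video", min_width=3840,
                             min_height=2160, genre="nature")
    print(build_query_url(uhd_nature))
```

Keeping the filter as a typed object rather than a raw dictionary makes a curation run easier to log and reproduce, which matters when a dataset has to be rebuilt later from the same query.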
Licensing clarity is a core feature of the platform. Datasets are covered by Creative Commons licenses or explicit commercial agreements, which reduces labs' exposure to copyright litigation and to the contamination risks of unverified training sets.
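On the consuming side, a lab could enforce this with a license gate before any asset reaches a training run. The following is a minimal sketch under stated assumptions: the record shape (`id`, `license`), the allowlist contents, and the `commercial-ai-training` tag are hypothetical and do not come from Wirestock's schema. Which licenses belong on the allowlist is a legal decision, not a technical one.

```python
from typing import Any, Iterable, List, Mapping, Set, Tuple

# Hypothetical allowlist. "cc-by-4.0" and "cc0-1.0" follow SPDX-style
# identifiers; "commercial-ai-training" is an invented tag standing in
# for an explicit commercial agreement.
ALLOWED_LICENSES: Set[str] = {"cc-by-4.0", "cc0-1.0", "commercial-ai-training"}

Record = Mapping[str, Any]


def split_by_license(
    records: Iterable[Record],
    allowed: Set[str] = ALLOWED_LICENSES,
) -> Tuple[List[Record], List[Record]]:
    """Separate records into (cleared, rejected) by license tag.

    A record with a missing or empty license is rejected, so unverified
    assets never enter the training set by default.
    """
    cleared: List[Record] = []
    rejected: List[Record] = []
    for record in records:
        # Normalize case and whitespace; treat None or absent as "".
        license_id = (record.get("license") or "").strip().lower()
        if license_id in allowed:
            cleared.append(record)
        else:
            rejected.append(record)
    return cleared, rejected


if __name__ == "__main__":
    manifest = [
        {"id": "a1", "license": "CC-BY-4.0"},
        {"id": "a2", "license": None},
        {"id": "a3", "license": "commercial-ai-training"},
        {"id": "a4"},
    ]
    ok, bad = split_by_license(manifest)
    print([r["id"] for r in ok], [r["id"] for r in bad])
```

Rejecting by default, rather than accepting anything not explicitly banned, is the design choice that matches the article's point: an asset with unknown provenance is treated as a risk.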
Infrastructure Expansion
The new capital will fund the development of enterprise software designed for dataset definition. Labs will be able to collaborate directly with Wirestock on quality control and iterative data collection.
The company is also expanding its content ingestion pipeline to support more complex 3D formats and higher-resolution media. To increase the cultural diversity of the datasets, Wirestock plans to open new creator hubs in Asia, South America, and Africa.
If you build multimodal applications or fine-tune vision models, the shift toward API-driven, legally clear data platforms changes how you procure training data. Commercial data licensing is becoming a line item to plan for in the training budget of any production deployment.