
$8.2M Seed Backs Human Archive's Gig-Worker Robotics Dataset

Human Archive has raised $8.2 million to build a multimodal robotics dataset by paying Indian gig workers $1 per hour to record physical service tasks.

On May 26, 2026, Human Archive, a startup founded by researchers from Berkeley Artificial Intelligence Research (BAIR) and the Stanford Artificial Intelligence Laboratory (SAIL), launched an initiative aimed at the robotics data bottleneck. The company raised $8.2 million in seed funding to build a multimodal dataset for physical AI. Backed by Y Combinator, it draws on India’s gig economy, paying workers to wear sensor suites while they perform everyday service tasks. The round was led by Wing Venture Capital and NVP Capital, with angel investments from figures at OpenAI, Nvidia, Google, and Meta.

Hardware and Data Pipeline

To bridge the sim-to-real gap, in which robots trained in simulation fail in physical environments, Human Archive captures synchronized RGB-D video, audio, and inertial measurement unit (IMU) data. The system relies on custom hardware: camera-equipped caps, wrist cameras, tactile gloves that record force feedback, and full-body motion capture suits.
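The article does not say how Human Archive synchronizes its streams. As a rough illustration of what "synchronized" capture implies downstream, the sketch below pairs each video frame with the nearest IMU sample by timestamp. It assumes all sensors stamp data against a shared clock; the `Sample` type, the `align` function, and the 5 ms skew tolerance are hypothetical and not taken from the company.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Sample:
    """One sensor reading. `t` is seconds on a shared clock (an assumption)."""
    t: float
    value: Tuple[float, ...]


def align(frame_times: Sequence[float],
          samples: Sequence[Sample],
          max_skew: float = 0.005) -> List[Tuple[float, Optional[Sample]]]:
    """Pair each frame time with the nearest sample, or None if too far away.

    `samples` must be sorted by timestamp. `max_skew` (seconds) is a
    hypothetical tolerance, not a figure from the article.
    """
    times = [s.t for s in samples]
    aligned = []
    for ft in frame_times:
        match = None
        if samples:
            i = bisect_left(times, ft)
            # The nearest sample is either just before or just after `ft`.
            candidates = samples[max(i - 1, 0): i + 1]
            best = min(candidates, key=lambda s: abs(s.t - ft))
            if abs(best.t - ft) <= max_skew:
                match = best
        aligned.append((ft, match))
    return aligned


# Example: a 100 Hz IMU burst and three video frame timestamps.
imu = [Sample(t=k * 0.010, value=(0.0, 0.0, 9.8)) for k in range(4)]
pairs = align([0.000, 0.033, 0.066], imu)
```

In this example the third frame gets no match, because the IMU burst ends well before it. Dropping such frames rather than pairing them with stale readings is one possible design choice; interpolating between samples is another.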

The company currently operates more than 1,000 active headset units across India, and this hardware pipeline captures up to 8,000 hours of data per day. Human Archive has signed partnerships to scale its contributor network to 50,000 people. Data volume at this scale matters when training multimodal models for real-world navigation tasks.
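A quick back-of-the-envelope calculation puts these figures in context. The per-unit rate follows directly from the reported numbers. The projection to 50,000 contributors is only an illustration: it assumes one headset per contributor recording at the same daily rate, which the article does not state.

```python
# Figures reported in the article.
ACTIVE_UNITS = 1_000          # "over 1,000 active headset units" (lower bound)
HOURS_PER_DAY = 8_000         # "up to 8,000 hours of data per day" (upper bound)
TARGET_CONTRIBUTORS = 50_000  # planned contributor network


def hours_per_unit(total_hours: float, units: int) -> float:
    """Average recording hours per headset per day."""
    return total_hours / units


def projected_daily_hours(per_unit: float, contributors: int) -> float:
    """ASSUMPTION: one headset per contributor, same per-unit rate."""
    return per_unit * contributors


per_unit = hours_per_unit(HOURS_PER_DAY, ACTIVE_UNITS)
projected = projected_daily_hours(per_unit, TARGET_CONTRIBUTORS)

print(f"{per_unit:.1f} hours per unit per day")
print(f"{projected:,.0f} hours per day at target scale (assumed)")
```

On the reported numbers, each unit records roughly a full working shift of about 8 hours per day. Under the stated assumption, the target network would produce on the order of 400,000 hours per day.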

Operations and Compensation

Human Archive partners with local service sectors like home cleaning, hospitality, and cloud kitchens to record workers washing dishes and sorting objects. Consumers booking services through partnered apps can opt into being recorded in exchange for a service discount.

| Metric | Human Archive Model | India Data Industry Average |
| --- | --- | --- |
| Base Worker Pay | $1.00 / hour (₹83) | $3.00 - $4.80 / hour (₹250 - ₹400) |
| Data Modalities | RGB-D, audio, IMU, tactile | Mostly text, image, bounding boxes |
| Target Environments | Homes, retail, kitchens, industrial | Digital platforms |
| Active Deployments | 1,000+ headset units | N/A |

Regulatory and Industry Pushback

The collection of egocentric data inside private homes has triggered immediate regulatory scrutiny. India’s Ministry of Electronics and Information Technology (MeitY) is examining the company’s consent mechanisms under the Digital Personal Data Protection (DPDP) Act.

Major Indian gig platforms, including Urban Company and Pronto, have explicitly declined to partner with Human Archive over privacy concerns. Urban Company CEO Abhiraj Singh Bhal confirmed they would not enter such agreements. This resistance highlights the friction in scaling ethical multimodal data collection for robotics.

If your team needs physical AI data, the gig economy offers massive scale but brings immediate compliance constraints. Collecting first-person data inside private environments requires explicit, auditable consent structures. Without them, regulators may restrict the resulting datasets as non-compliant.
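What an "auditable consent structure" looks like in practice will depend on legal advice and the applicable rules. As a minimal sketch, one option is an append-only consent log in which each entry is hash-chained to the previous one, so later tampering is detectable. Every name here (`ConsentEvent`, `ConsentLog`, the field names) is hypothetical. Nothing below reflects Human Archive's system or a requirement of the DPDP Act.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ConsentEvent:
    subject_id: str               # pseudonymous ID for a worker or resident
    action: str                   # "grant" or "revoke"
    modalities: Tuple[str, ...]   # e.g. ("rgbd", "audio")
    timestamp: str                # ISO 8601, UTC


class ConsentLog:
    """Append-only log; each digest covers the event plus the previous digest."""

    def __init__(self) -> None:
        self._entries: List[Tuple[ConsentEvent, str]] = []

    @staticmethod
    def _digest(prev: str, event: ConsentEvent) -> str:
        payload = prev + json.dumps(asdict(event), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event: ConsentEvent) -> str:
        prev = self._entries[-1][1] if self._entries else ""
        digest = self._digest(prev, event)
        self._entries.append((event, digest))
        return digest

    def verify(self) -> bool:
        """Recompute the chain; False means an entry was altered or reordered."""
        prev = ""
        for event, digest in self._entries:
            if self._digest(prev, event) != digest:
                return False
            prev = digest
        return True

    def allowed(self, subject_id: str, modality: str) -> bool:
        """The latest grant or revoke for this subject and modality wins."""
        state = False  # default: no consent on record means no recording
        for event, _ in self._entries:
            if event.subject_id == subject_id and modality in event.modalities:
                state = event.action == "grant"
        return state


log = ConsentLog()
log.append(ConsentEvent("resident-001", "grant", ("rgbd", "audio"),
                        "2026-05-26T09:00:00Z"))
log.append(ConsentEvent("resident-001", "revoke", ("audio",),
                        "2026-05-26T09:30:00Z"))
```

After these two events, `log.allowed("resident-001", "rgbd")` is true and `log.allowed("resident-001", "audio")` is false. A recording pipeline could check `allowed` per modality before storing a stream, and an auditor could run `verify` to confirm the history was not edited after the fact.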
