
XDOF Exits Stealth With $70M and 130K-Trajectory Robot Dataset

XDOF raised $70 million to build a three-tier physical data collection pipeline and co-released the massive ABC-130K manipulation dataset with UC Berkeley.

The robotics infrastructure startup XDOF has emerged from stealth, announcing a $70 million funding round to address the primary data bottleneck in physical AI. The company operates as an outsourced data factory, generating the high-fidelity physical interaction data required to train foundation models for robots.

Unlike text-based models that leverage internet-scale datasets, physical AI systems require precise manipulation records that cannot be scraped from the web. XDOF replaces the overhead of in-house robotic fleets with a centralized data collection pipeline. The platform already serves approximately 20 frontier AI labs building multimodal physical systems.

The Three-Tier Teleoperation Pipeline

XDOF captures physical movement data through a three-tier architecture. The first tier uses teleoperation of deployment robots in warehouse-scale environments, recording joint and end-effector manipulation data directly on the target hardware.
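To make the idea of joint and end-effector records concrete, here is a minimal sketch of how one teleoperated trajectory might be represented. The schema is an assumption for illustration only: the class names, field names, and units are hypothetical and are not XDOF's actual format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrajectoryStep:
    """One timestamped sample (hypothetical schema, not XDOF's format)."""
    timestamp_s: float             # seconds since the start of the episode
    joint_positions: List[float]   # one angle per joint, in radians
    ee_pose: List[float]           # end-effector pose: x, y, z, qx, qy, qz, qw
    gripper_open: float            # 0.0 = closed, 1.0 = fully open


@dataclass
class Trajectory:
    """A single teleoperated episode for one task on one robot."""
    task: str
    robot: str
    steps: List[TrajectoryStep] = field(default_factory=list)

    def duration_s(self) -> float:
        """Elapsed time between the first and last sample."""
        if len(self.steps) < 2:
            return 0.0
        return self.steps[-1].timestamp_s - self.steps[0].timestamp_s

    def validate(self, num_joints: int) -> None:
        """Raise ValueError on malformed samples or non-increasing timestamps."""
        previous = None
        for index, step in enumerate(self.steps):
            if len(step.joint_positions) != num_joints:
                raise ValueError(f"step {index}: expected {num_joints} joints")
            if len(step.ee_pose) != 7:
                raise ValueError(f"step {index}: pose must have 7 values")
            if previous is not None and step.timestamp_s <= previous:
                raise ValueError(f"step {index}: timestamps must increase")
            previous = step.timestamp_s
```

Checking timestamps and vector lengths at ingestion time is a cheap way to catch dropped or reordered samples before they reach a training job.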

The second tier uses GELLO, a general, low-cost, and intuitive teleoperation framework. It lets human operators guide robotic arms through complex tasks with low latency. If you deploy agents to robot hardware, you know that high-quality teleoperation logs are critical for behavioral cloning and offline reinforcement learning.
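Behavioral cloning treats teleoperation logs as supervised data: each logged observation is paired with the action the operator took. The sketch below uses a nearest-neighbor lookup as a deliberately simple stand-in for the neural network policies used in practice. The class name and data layout are hypothetical and assume each log entry is an (observation, action) pair of numeric vectors.

```python
import math
from typing import List, Sequence, Tuple

# One logged step: (observation vector, action vector). Hypothetical layout.
Demo = Tuple[Sequence[float], Sequence[float]]


class NearestNeighborPolicy:
    """Imitates the operator by replaying the action of the closest logged observation."""

    def __init__(self, demos: List[Demo]):
        if not demos:
            raise ValueError("need at least one demonstration step")
        # Store immutable copies so later changes to the logs do not alter the policy.
        self.demos = [(tuple(obs), tuple(act)) for obs, act in demos]

    def act(self, observation: Sequence[float]) -> Tuple[float, ...]:
        """Return the logged action whose observation is nearest in Euclidean distance."""
        _, action = min(self.demos, key=lambda d: math.dist(d[0], observation))
        return action


# Usage: two logged steps, then a query close to the first one.
policy = NearestNeighborPolicy([
    ((0.0, 0.0), (1.0,)),
    ((1.0, 1.0), (-1.0,)),
])
print(policy.act((0.1, 0.0)))  # (1.0,)
```

A policy like this can only be as good as its demonstrations, which is why noisy or laggy teleoperation logs directly degrade the resulting behavior.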

The third tier captures human movement through egocentric wearable sensors. This broadens the physical context available to the models, mapping everyday human interactions into formats compatible with robotic training pipelines.

The ABC-130K Dataset

Concurrent with the funding announcement, XDOF co-released an open-source dataset in collaboration with the Berkeley Artificial Intelligence Research (BAIR) Lab at UC Berkeley. The ABC-130K dataset contains 130,000 trajectories of bimanual robot operations.

The release includes over 300 hours of synchronized simulated and real-world data. Tasks range from folding shirts to precise insertion operations such as loading AirPods cases. Training dual-arm systems requires extensive coordination data, and its scarcity has previously limited benchmarks like the RoboChallenge evaluations. ABC-130K supplies that data at a volume intended to support training robust spatial reasoning.
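The two headline figures give a rough sense of episode length. This is a back-of-the-envelope sketch that assumes the 300 hours are spread across all 130,000 trajectories. Because the article says "over 300 hours", the result is a lower bound on the mean.

```python
NUM_TRAJECTORIES = 130_000  # from the dataset description
TOTAL_HOURS = 300           # stated as "over 300 hours", so a lower bound


def mean_trajectory_seconds(total_hours: float, num_trajectories: int) -> float:
    """Average episode length in seconds, assuming hours are spread over all episodes."""
    return total_hours * 3600 / num_trajectories


print(round(mean_trajectory_seconds(TOTAL_HOURS, NUM_TRAJECTORIES), 1))  # 8.3
```

An average of at least roughly eight seconds per trajectory suggests short, single-skill episodes rather than long multi-stage tasks, though the actual distribution is not given.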

Infrastructure Economics and Scale

Thrive Capital led the $70 million round, with participation from Spark Capital, a16z, Lux Capital, and WndrCo. Based in San Mateo, XDOF was founded in October 2024 by Philipp Wu, Yide Shentu, and Nemo Jin. The company currently employs 60 people.

The startup positions itself as a necessary abstraction layer for the AI industry. Frontier labs are shifting focus back to robotics and face massive capital requirements to build and maintain physical testing facilities. By centralizing the collection process, XDOF allows these labs to treat physical training data as a service rather than a real estate and hardware management problem. This mirrors the shift toward hardware-agnostic dexterity models, decoupling the software intelligence from mechanical constraints.

If your team is building physical AI systems, the availability of large-scale open datasets like ABC-130K changes your baseline requirements. You can now pre-train manipulation models on existing high-quality trajectories before investing in custom teleoperation hardware for your specific robot platform.
