Robotics AI Startup XDOF Raises $70 Million to Solve Physical Data Bottleneck
Robotics infrastructure startup XDOF has secured $70 million in funding to address a critical shortage of high-quality data required for artificial intelligence to function in physical environments. The company, backed by investors including Andreessen Horowitz, Thrive Capital, and Spark Capital, simultaneously released “ABC-130K,” an open-source dataset described as the world’s largest for dual-arm robot manipulation.
Did You Know? “GELLO” is the name of a project that CEO Philippe Wang and his research team developed at UC Berkeley: a low-cost teleoperation system for collecting precise robot arm data.
Why Physical AI Requires Specialized Data
While large language models rely on vast amounts of internet-based text, robots require high-precision data capturing how physical objects are grasped, moved, and manipulated. According to the company, existing sources like YouTube or factory footage often lack the spatial detail and movement accuracy necessary for advanced robotics. XDOF argues that the primary bottleneck in the field is not the model architecture or hardware, but the absence of a robust “data feedback loop” that allows robots to learn from physical interactions.
Infrastructure and the Data Pyramid
XDOF is moving beyond simple data collection by building a comprehensive infrastructure that includes data generation, cleaning, labeling, and re-learning systems. CEO Philippe Wang, who founded the company with CTO Fred Shentu and COO Nemo Jin, plans to implement a three-tier “data pyramid” to scale operations. The pyramid places custom teleoperated data at the top, generalized teleoperation data in the middle, and “egocentric” data, which captures everyday human movements, at the base.

Expert Insight: The shift toward professionalized data infrastructure reflects a maturing robotics market. By outsourcing the labor-intensive process of data collection and parameter calibration, AI labs can focus on model development rather than the logistical complexities of physical-world training.
Future Outlook for XDOF
XDOF is scaling its operations: it has more than 60 employees and approximately 20 client firms, including AI research laboratories. To support its expansion, the company intends to hire a global team for teleoperation and data collection and to develop proprietary wearable sensors to improve hand-tracking algorithms. Competition in the “physical AI” space is intensifying, as highlighted by OpenAI’s recent decision to restart its robotics program, and XDOF is positioning itself as a vital service provider for companies struggling to bridge the gap between digital intelligence and physical execution.

Frequently Asked Questions
What is the ABC-130K dataset?
ABC-130K is an open-source dataset released by XDOF containing 130,000 trajectories, 300 hours of simulation data, and 100 hours of evaluation data focused on dual-arm robot manipulation.
How does XDOF collect its data?
The company uses a combination of specialized data pipelines, annotation systems, and teleoperation tools. Its teleoperation technology is rooted in the GELLO project and lets humans remotely control robot arms to generate precise training data.
Who are the primary investors in XDOF?
The company’s $70 million funding round included participation from Thrive Capital, Spark Capital, Andreessen Horowitz, Lux Capital, and WonderCo.