1X's mission is to provide an abundant supply of physical labor via safe, intelligent androids. Our environments are designed for humans, so we design our hardware to take after the human form for maximum generality. To make the best use of this general-purpose hardware, we also pursue the maximally general approach to autonomy: learning motor behaviors end-to-end from vision using neural networks.
We deployed this system on EVE for patrolling tasks in 2023, and are now excited to share some of the new capabilities our androids have learned purely end-to-end from data:
Every behavior you see in the above video is controlled by a single vision-based neural network that emits actions at 10Hz. The network consumes images and outputs actions to control driving, the arms, grippers, torso, and head. The video contains no teleoperation, no computer graphics, no cuts, no video speedups, and no scripted trajectory playback. It's all controlled via neural networks, all autonomous, all at 1X speed.
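To make the architecture concrete, here is a minimal sketch of what a 10Hz vision-to-action control loop could look like. This is not 1X's implementation: the per-subsystem action dimensions, the `policy` stub, and the `control_loop` helper are all hypothetical placeholders; only the 10Hz rate and the list of controlled subsystems come from the text above.

```python
import time

CONTROL_HZ = 10  # the post states the network emits actions at 10 Hz
PERIOD = 1.0 / CONTROL_HZ

# Hypothetical action layout; subsystems from the post, dimensions made up.
ACTION_DIM = {"drive": 2, "arms": 14, "gripper": 2, "torso": 3, "head": 2}

def policy(image):
    """Stand-in for the vision network: maps an image to a flat action vector."""
    # A real policy would run a neural-network forward pass here.
    return [0.0] * sum(ACTION_DIM.values())

def split_action(flat):
    """Unpack a flat action vector into per-subsystem command slices."""
    out, i = {}, 0
    for name, dim in ACTION_DIM.items():
        out[name] = flat[i:i + dim]
        i += dim
    return out

def control_loop(camera, actuate, steps):
    """Fixed-rate loop: grab an image, run the policy, send the action."""
    for _ in range(steps):
        t0 = time.monotonic()
        actuate(split_action(policy(camera())))
        # Sleep off the remainder of the 100 ms control period.
        time.sleep(max(0.0, PERIOD - (time.monotonic() - t0)))
```

The key property this sketch illustrates is that one network output is sliced into commands for every subsystem, rather than running a separate controller per subsystem.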
To train the ML models that generate these behaviors, we have assembled a high-quality, diverse dataset of demonstrations across 30 EVE robots. We use that data to train a “base model” that understands a broad set of physical behaviors, from cleaning and tidying homes to picking up objects and interacting socially with humans and other robots. We then fine-tune that model into a more specific family of capabilities (e.g. one model for general door manipulation and another for warehouse tasks), and fine-tune those models further to align the behavior with a specific task (e.g. opening this specific door). This strategy lets us onboard new skills with just a few minutes of data collection and training on a desktop GPU.
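The staged pipeline above can be sketched as three successive training runs, each initialized from the previous stage's weights and each using a narrower dataset. Everything in this snippet is a toy stand-in under stated assumptions: the one-parameter "model", the `train` update rule, and the example datasets are illustrative only; the three-stage lineage (base → capability → task) is the part taken from the text.

```python
import copy

def train(weights, dataset, steps):
    """Toy stand-in for gradient descent: nudge the single
    weight toward the dataset mean, a fraction per step."""
    w = copy.deepcopy(weights)
    target = sum(dataset) / len(dataset)
    for _ in range(steps):
        w["w"] += 0.5 * (target - w["w"])
    return w

# Hypothetical datasets: broad demonstrations, one capability, one task.
broad_demos = [0.0, 2.0]      # many behaviors, many scenes
door_demos = [2.5, 3.5]       # e.g. general door manipulation
few_shot_demos = [3.2]        # e.g. one specific door, minutes of data

# Stage 1: base model trained on the broad, diverse dataset.
base = train({"w": 0.0}, broad_demos, steps=10)
# Stage 2: capability model fine-tuned from the base weights.
doors = train(base, door_demos, steps=5)
# Stage 3: task model fine-tuned from the capability weights.
this_door = train(doors, few_shot_demos, steps=3)
```

The design point the sketch captures is that each later stage starts from the previous stage's weights and needs far less data and compute, which is why a new task can be onboarded quickly.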
All of the capabilities shown in the video were trained by our android operators. They represent a new generation of "Software 2.0 Engineers" who express robot capabilities through data instead of code. Our ability to teach our robots short mobile manipulation skills is no longer constrained by the number of AI engineers, which gives us a lot of flexibility in what our androids can do for our customers.
If you find this work interesting, we’d like to call attention to two roles that we are hiring for to accelerate our mission toward general-purpose physically embodied intelligence:
Over the last year we’ve built out a data engine for solving general-purpose mobile manipulation tasks in a completely end-to-end manner. We’ve convinced ourselves that it works, so now we're hiring AI researchers in the SF Bay Area to scale it up to 10x as many robots and teleoperators. We're looking for experts in imitation learning, reinforcement learning, and large-scale training, as well as people with experience scaling up deployments of autonomous vehicles. You'll be working in a fast-paced team of generalists that ships features to our fleet on a 24-hour release cycle. The work is a mix of pioneering new learning algorithms and fixing speed bottlenecks in our data flywheel. We are relentless in simplifying algorithms and infrastructure as much as possible.
We're also hiring android operators in both our Oslo and Mountain View offices to collect data, train models with that data, and evaluate those models. Unlike most data collection jobs, our teleoperators are empowered to train their own models to automate their own tasks and think deeply about how data maps to learned robot behavior. If you want to experience what it is like to live in a real-life "Westworld", we'd love for you to apply.
We also have other open roles across mechanical, electrical, and software disciplines that build the foundation that makes it possible to ship all of this cutting-edge ML technology. Follow 1x_tech on X for more updates, and join us in living in the future.