Palo Alto, California

Research Engineer, Scaling

Target start date: Immediately. Relocation provided.

Since its founding in 2015, 1X has been at the forefront of developing advanced humanoid robots designed for household use. Our mission is to create an abundant supply of labor via safe, intelligent humanoids. At 1X, you'll own critical projects, tackle unsolved research problems, deliver great products to customers, and be rewarded based on merit and achievement.

As a Research Engineer, Scaling, you'll build the systems that let every team and every robot go faster: training more often, evaluating more reliably, and deploying better models to our growing fleet. You'll transform prototypes into production-scale infrastructure for learning and inference, enabling larger training runs and maximizing edge compute utilization to make our models more capable.

Tech Stack

  • Linux

  • Python / C++

  • PyTorch / TorchTitan / TensorRT

  • Triton / CUDA

Location

The role is based in Palo Alto, CA. Candidates are expected to work in person at the office.

Responsibilities

  • Own scaling capabilities in distributed training and/or inference, with high agency

  • Ensure that compute is never the bottleneck, i.e., that we always have more compute available than data

  • Enable large-scale (1000+ GPU) training on a billion+ frames of robot data, from fault tolerance to distributed ops to experiment management

  • Optimize high-throughput, datacenter-scale distributed inference for world models: work on the world's fastest diffusion inference engine

  • Improve low-latency on-device inference for a variety of robot policies using quantization, scheduling, distillation, and more

Requirements
