Engineering Manager, ML Training Platform

Remote Full-time
Zoox is on a mission to reimagine transportation and build autonomous robotaxis from the ground up that are safe, reliable, clean, and enjoyable for everyone. We are still in the early stages of deploying our robotaxis on public roads, and it is a great time to join Zoox and have a significant impact in executing this mission. The ML Platform team at Zoox plays a crucial role in enabling innovations in ML and CV to make autonomous driving as seamless as possible.

The Opportunity

Are you excited to manage our ML Training Platform that enables autonomous driving? You will get to work across all ML teams within Zoox (Perception, Prediction, Planner, Simulation, Collision Avoidance, Data Science, etc.) and have the opportunity to significantly push the boundaries of how ML is practiced within Zoox.

This team builds and operates the core part of the ML platform that powers model training at scale. We are responsible for developing and operating ML tools, deep learning frameworks, and distributed model training infrastructure to support foundational models and reinforcement learning. This team also owns the model repository and model lifecycle management tools used by our applied research teams for in-vehicle and off-vehicle ML use cases.

You will lead a team of strong software engineers and act as a force multiplier for our internal customers. This team has a lot of growth opportunities as we expand our robotaxi deployments and venture into new ML domains. If you want to learn more about our stack behind autonomous driving, please look here.

In this role, you will:

• Vision: Develop and execute a strategic vision for our ML training platform, ensuring scalability, reliability, and performance to support large-scale Foundation and RL models.
• Technical acumen: Lead the design, implementation, and operation of a robust and efficient ML training platform to enable the training, experimentation, validation, and monitoring of ML models.
• Hiring: Attract, hire, and inspire a diverse world-class engineering team, fostering a culture of innovation, collaboration, and excellence.
• Partnership: Collaborate closely with cross-functional teams, including ML researchers, software engineers, data engineers, and hardware engineers, to define requirements and align on architectural decisions.
• Mentorship: Enable the engineers on the team to grow their careers by providing the right opportunities along with clear and timely feedback.

Qualifications

• 8+ years of total experience, including 3+ years of engineering management experience.
• Excellent leadership skills with a demonstrated ability to build and manage high-performing engineering teams.
• Experience enabling large-scale, cost-efficient distributed model training and ML compute infrastructure.
• Experience with training frameworks such as PyTorch, Hugging Face, Ray, DeepSpeed, JAX, etc., leveraging GPUs, TPUs, or Trainium.
• Experience building model lifecycle management tools and managing AWS costs for ML needs.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Follow us on LinkedIn

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to [email protected] or your assigned recruiter.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.
