Senior / Staff Software Engineer, ML Datasets & Data Pipelines

Waabi · CA · 115d ago

Hybrid Python

About the role

Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, we're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis. Waabi is backed by and partners with world leaders in AI, automotive, logistics, and deep tech.

With offices in Toronto, San Francisco, Dallas, and Pittsburgh, Waabi is growing quickly and looking for diverse, innovative, and collaborative candidates who want to impact the world in a positive way. To learn more, visit www.waabi.ai.

As a Senior/Staff Software Engineer embedded within our Autonomy & Algorithms team, you will build the scalable ML data pipelines necessary to train and evaluate Waabi’s autonomous driving platform. Working closely with world-renowned scientists and engineers, you will solve complex data challenges to accelerate our launch of fully driverless vehicles.

You will:

- Design and implement data pipelines using real-world driving data and Waabi World (our high-fidelity simulator) to train and evaluate deep learning models.

- Optimize data formats, caching, and dataloading to drive highly efficient ML training and evaluation at scale.

- Improve data sampling and composition for deep data introspection to track model performance and uncover critical edge-case scenarios.

- Champion engineering excellence by writing high-quality, well-structured, and rigorously tested code.

- Help drive project roadmap planning, prioritization, and delivery.

Qualifications:

- BS or MS in Computer Science, Machine Learning, or a related technical field, with 4+ years of industry experience.

- Proficiency in Python and strong software engineering fundamentals, including experience with deep learning frameworks such as PyTorch, TensorFlow, or JAX.

- Hands-on experience building distributed ETL and data processing pipelines.

- Direct experience managing ML pipelines, including dataset management, dataloading, and optimization.

- Strong understanding of cloud job orchestration, monitoring, and instrumentation best practices.

- A collaborative, open-minded approach with a passion for tackling hard problems in autonomous technology and a strong willingness to mentor others.

Bonus Points:

- Experience optimizing large-scale distributed training pipelines and/or highly optimized ML inference pipelines.

- Experience with MapReduce (Apache Hadoop/Spark) or orchestration frameworks (Apache Airflow, Apache Beam, Google Cloud Dataflow, AWS Step Functions).

- Experience solving data challenges specific to autonomous driving.

- Familiarity with linear algebra (projections, transforms) and 3D geometry.

- Experience working with multimodal sensor data (e.g., LiDAR, RADAR, camera).
