Senior / Staff Software Engineer, ML Datasets & Data Pipelines
Waabi • CA
You Will:
- Design and implement data pipelines using real-world driving data and Waabi World (our high-fidelity simulator) to train and evaluate deep learning models.
- Optimize data formats, caching, and dataloading to drive highly efficient ML training and evaluation at scale.
- Improve data sampling and composition, and enable deep data introspection to track model performance and uncover critical edge-case scenarios.
- Champion engineering excellence by writing high-quality, well-structured, and rigorously tested code.
- Help drive project roadmap planning, prioritization, and delivery.
Qualifications:
- BS or MS in Computer Science, Machine Learning, or a related technical field, with 4+ years of industry experience.
- Proficiency in Python and strong software engineering fundamentals, including experience with deep learning frameworks such as PyTorch, TensorFlow, or JAX.
- Hands-on experience building distributed ETL and data processing pipelines.
- Direct experience managing ML pipelines, including dataset management, dataloading, and optimization.
- Strong understanding of cloud job orchestration, monitoring, and instrumentation best practices.
- A collaborative, open-minded approach with a passion for tackling hard problems in autonomous technology and a strong willingness to mentor others.
Bonus Points:
- Experience optimizing large-scale distributed training pipelines and/or highly optimized ML inference pipelines.
- Experience with MapReduce (Apache Hadoop/Spark) or orchestration frameworks (Apache Airflow, Apache Beam, Google Cloud Dataflow, AWS Step Functions).
- Experience solving data challenges specific to autonomous driving.
- Familiarity with linear algebra (projections, transforms) and 3D geometry.
- Experience working with multimodal sensor data (e.g., LiDAR, RADAR, camera).