Halter

Senior Machine Learning Backend/Infrastructure Engineer

Halter • NZ
Python
About the role

Machine learning infrastructure underpins all of our data products and enables R&D on highly complex systems with the potential to unlock untapped value. We are looking for a Senior Machine Learning Infrastructure Engineer who can scale our market-leading behaviour models, grow our team of data scientists, enable the execution of scientific endeavours in a deep-learning-dominant environment, and extend the way we apply machine learning to all areas of the business. We’re looking for people who are hungry to make an impact, are comfortable in a fast-paced environment, and love helping the people around them succeed.

What your day could look like

  • Design and maintain scalable ML pipelines for training, validation, and inference
  • Build and optimize model serving infrastructure with proper monitoring, logging, and alerting
  • Implement MLOps practices including automated testing, deployment, and rollback systems
  • Manage data pipelines and ensure data quality, lineage, and governance
  • Optimize model performance and resource utilization across different environments
  • Collaborate with data scientists to productionize models and research experiments

Technical Requirements

  • Strong proficiency in cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes)
  • Experience with ML frameworks (TensorFlow, PyTorch) and serving systems
  • Knowledge of orchestration tools
  • Proficiency in Python, SQL, and Infrastructure as Code (Terraform)
  • Experience with monitoring and observability tools
  • Understanding of CI/CD pipelines and version control systems

Infrastructure Skills

  • Database management (both SQL and NoSQL) and data warehousing solutions
  • Stream processing and real-time data systems (Kafka, Spark Streaming)
  • Model registry and experiment tracking systems
  • Performance optimization and cost management in cloud environments

What we’re looking for

  • A comprehensive understanding of the fundamentals and system design principles behind building reliable, scalable, and fit-for-purpose machine learning models and infrastructure
  • Strong experience in Python and familiarity with development in a backend tech stack
  • A strong understanding of data engineering best practices and AWS infrastructure
  • Importantly, we are looking for someone who is passionate about what they do and eager to learn