Do you get excited about scaling machine learning systems in production?
Are you ready to work on cutting-edge Computer Vision technology?
Are you tired of working at a big company?
Would you like to work with top professionals in our industry?
If your answers are mostly yes, then you should keep reading. At Nomagic, we're on a mission to teach robots the real world.
We're now looking for a Senior DevOps Engineer to own and scale the infrastructure behind our Computer Vision ML Cloud Service.
Offer essentials:
- Work on cloud-native ML infrastructure at scale
- Salary: 20,000 - 30,000 gross UoP per month
- Equity for every employee
- Relocation package
- No late evening calls - the entire eng team is based in Europe :)
- English-speaking environment
- We work mainly remotely, but you should expect occasional office visits when a task requires it
Here is why we love this job ourselves, and hope you will enjoy it too:
- We get to be creative
- We're still pretty small, so everyone has a direct impact on the final result
- Nothing is set in stone: we can easily change the technology we use if the requirements change
- The CEO and part of the management team are experienced infrastructure engineers who created the foundations of Google Cloud Platform
- We combine world-class research with top-notch engineering and apply both to solve real problems
Some of the problems you may try to solve with us:
- Scaling from a single-digit number of clients to 100+ in the future - designing multi-tenant infrastructure that scales
- Building standardized, repeatable deployment patterns across multiple customer environments
- Optimizing data pipelines for Computer Vision model training and deployment
- Building robust monitoring and alerting systems for ML service health and performance across all environments
- Automating CI/CD pipelines for ML model deployment with safe rollback strategies
- Deploying and managing applications on Kubernetes clusters using Helm charts and ArgoCD for GitOps workflows
- Using Terraform as Infrastructure as Code to provision and manage cloud resources consistently
- Implementing comprehensive observability with Prometheus and Grafana across multiple tenants
- Ensuring high availability, security, and cost optimization of cloud services
- Occasional hardware integration tasks (nice to have, not primary focus)
What skills we'd like you to have:
- 3+ years of experience as a DevOps or Infrastructure Engineer or in a similar role
- 2+ years of experience in software development
- Strong Kubernetes experience - managing production clusters, Helm charts, and GitOps (e.g. ArgoCD)
- Infrastructure as Code expertise with Terraform
- Monitoring and alerting - hands-on experience with Prometheus, Grafana, and building effective alerting strategies
- Deep knowledge of Docker and container orchestration
- Experience with CI/CD pipelines and automated deployments
- Good understanding of networking and security in cloud environments
- Strong proficiency in Python
- Experience with one of the major cloud providers (preferably Google Cloud Platform)
- Experience designing multi-tenant infrastructure is a big plus
- Experience with ML infrastructure or data-intensive applications is a plus
- Hardware experience (servers, edge devices) is nice to have but not required
- Fluent communication in English
What should you expect once you apply?
- 30-minute call with a recruiter
- 45-minute hiring manager interview
- 60-minute technical/coding interview
- Onsite - half a day of interviews and discussions at the office