Everseen

Software Development Engineer III - ML Ops

Everseen • RS
Hybrid
Everseen: A leader in vision AI solutions for the world’s leading retailers.

What you'll do

  • Design and Development: 
  • Collaborate and drive progress with cross-functional teams to design and develop new features and functionalities.  
  • Ensure that the developed solutions meet project objectives and enhance user experience. 

  • Influence and Decision-Making:
  • Have influence over the technology stack and internal technical improvements, contributing to strategic decision-making. 

  • Coding: 
  • Based on requirements and a longer-term product and feature strategy, design and implement reusable, testable, efficient, and elegant code. 
  • Ensure adherence to coding standards and best practices. 

  • Testing: 
  • Create, maintain, and run unit tests for new and existing applications and services. 
  • Aim to deliver defect-free and well-tested solutions.  

  • Data Analysis: 
  • Collect and analyze data from various sources such as log files, application stack traces, and thread dumps.
  • Use this analysis to identify trends, patterns, and potential areas for improvement, then implement changes based on the findings.

  • Continuous Integration and Continuous Deployment (CI/CD): 
  • Create and maintain CI/CD integration using various tools. 
  • Automate the build, test, and deployment processes to ensure efficiency and reliability.  

  • Integration of Third-Party Solutions: 
  • Research and propose third-party software solutions to optimize system performance. 
  • Expand product capabilities by integrating compatible third-party solutions. 
  • Monitor, update, and track third-party solutions' compatibility with the Everseen stack according to internal development guidelines.

  • Monitoring and Troubleshooting: 
  • Monitor production logs to identify and troubleshoot issues promptly. 
  • Ensure seamless operation and timely resolution of any anomalies to maintain system reliability. 

  • Documentation:  
  • Create, review, and maintain high-quality technical documentation to ensure clarity, consistency, and knowledge sharing within the development team.

Collaborate with

  • AI/ML Engineering team
  • Data Engineering team
  • Software Development Engineers
  • DevOps team
  • Product Managers
  • Security & Compliance Teams

Profile and Skills

  • 3-4+ years of work experience in ML infrastructure, MLOps, or Platform Engineering.
  • Bachelor's degree or equivalent in the computer science field is preferred.
  • Excellent communication and collaboration skills.

  • Technical Skills: 
  • Experience in ML infrastructure, MLOps, or Platform Engineering.
  • Strong programming skills, with front-end development experience in React and Angular.
  • Understanding of the ML lifecycle, model versioning, and monitoring.
  • Experience with back-end frameworks on top of NodeJS (NestJS).
  • Hands-on experience with Kubernetes, Docker, and cloud services.
  • Experience with CI/CD tools (e.g., GitLab, Jenkins).
  • Experience with Infrastructure as Code (e.g., Terraform). 

  • Experience with:
  • ML frameworks (e.g., TensorFlow, PyTorch)
  • GPU orchestration (e.g., NVIDIA GPU Operator, MIG)
  • Data engineering tools (e.g., Snowflake, Databricks, BigQuery, Airbyte, Kafka)
  • Familiarity with feature stores and model registries.
  • Exposure to large-scale distributed systems and performance optimisation.
  • Ability to work with Linux systems, including troubleshooting skills such as log investigations, performance testing, and connectivity investigation.
  • Possesses a deep understanding of technical concepts and terminology relevant to Everseen's products and services. 
  • Expert knowledge of modern software architectures, including microservices and distributed systems.
  • In-depth knowledge of Azure Kubernetes Service (AKS) for container orchestration, Azure Blob Storage for data storage, and Elasticsearch for search and analytics.
  • Ability to leverage cloud computing technologies and services for testing and validation purposes. 
  • In-depth knowledge of cloud security, scalability, and performance optimization principles. 
  • Excellent understanding of cloud computing technologies and services, including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). 
  • Broad understanding of the software engineering and architecture space, including knowledge of various programming languages, frameworks, techniques, and industry trends in AI. 

Additional Skills

  • Interest in Learning and Growth Mindset:  
  • Demonstrated interest in learning and a strong desire to expand knowledge in your field.
  • Curiosity to explore new technologies, methodologies, and best practices to enhance skills and capabilities. 
  • Results-oriented attitude, with a drive to achieve objectives efficiently. 

  • Analytical and Problem-Solving Skills:
  • Possesses strong analytical and problem-solving abilities, leveraging data to inform product decisions. This skill is essential for identifying market opportunities, optimising product features, and addressing challenges effectively.