capital

Senior DevOps/SRE Engineer

capital • PL
Python Remote

We are a leading trading platform that is ambitiously expanding to the four corners of the globe. Our top-rated products have won prestigious industry awards for their cutting-edge technology and seamless client experience. We deliver only the best, so we are always in search of the best people to join our ever-growing talented team.

We're looking for a Senior DevOps/SRE Engineer to join our DevOps team and take end-to-end ownership of our cloud and on-premise environments. You will be a key contributor to building scalable, reliable, and secure systems that power our trading platform at a global scale.

This is a hands-on role: you'll architect and operate cloud infrastructure, drive automation and observability excellence, build robust CI/CD pipelines, and help shape the engineering culture around reliability and operational best practices.

Responsibilities:

  • Design, deploy, and maintain scalable cloud infrastructure on AWS, ensuring high availability, performance, and security across all environments.
  • Own and evolve Kubernetes cluster management — including bare-metal deployments — and ensure reliable containerised workloads using Docker and Helm.
  • Build and maintain CI/CD pipelines using GitLab CI, incorporating GitOps principles with FluxCD or ArgoCD to streamline and automate delivery workflows.
  • Define and manage Infrastructure as Code using Terraform, ensuring all infrastructure changes are version-controlled, repeatable, and reviewed.
  • Lead monitoring and observability initiatives: implement and maintain dashboards, alerting, and log pipelines using VictoriaMetrics/Prometheus, Grafana, and the ELK stack.
  • Operate and optimize Apache Kafka ecosystems, including Strimzi, Kafka Connect, and MirrorMaker, to support real-time data pipelines.
  • Drive incident response, root cause analysis, and post-mortem culture to continuously improve system reliability.
  • Collaborate closely with Engineering, Security, and Product teams to embed DevOps best practices across the organisation.
  • Mentor and guide junior engineers, raising the overall engineering bar for infrastructure reliability and automation.

Requirements:

  • 6+ years of hands-on experience in a DevOps or SRE role.
  • Strong knowledge of AWS services, including: VPC, EC2, EKS, S3, ECR, EBS, RDS, ElastiCache, IAM, KMS, Secrets Manager, SSM Parameter Store, CloudWatch, MSK, SNS, SQS, Route 53, Direct Connect, Transit Gateway, and ELB/ALB/NLB.
  • Solid Linux administration skills with a deep understanding of system internals.
  • Deep expertise in Kubernetes, including bare-metal cluster deployment and day-2 operations. Proficiency with Docker and Helm.
  • Hands-on experience with Terraform as a primary Infrastructure as Code tool — writing, reviewing, and maintaining production-grade modules.
  • Proven experience with GitLab CI for building and maintaining CI/CD pipelines; familiarity with GitOps practices using FluxCD or ArgoCD.
  • Strong background in monitoring and observability: VictoriaMetrics or Prometheus, Grafana, and the ELK stack; solid understanding of log collection and processing with Fluent Bit, Fluentd, and Logstash.
  • Experience operating and managing Apache Kafka ecosystems, including Strimzi, Kafka Connect, and MirrorMaker.
  • Experience with Ansible for configuration management; experience with AWX is a plus.
  • Proficiency in scripting and automation with Bash, Python, and Go.
  • Strong communication skills and the ability to collaborate cross-functionally in a fast-paced, regulated environment.
  • English language proficiency.