ShopBack

Senior Site Reliability Engineer

ShopBack • CN
Go • Python

Our Journey

The ShopBack Group is Asia-Pacific’s leading shopping, rewards, and payments platform, serving over 60 million shoppers across 13 markets. In 2025, the Group continued its global growth with its expansion into North America. Driven by the vision to make every day more rewarding, ShopBack is dedicated to saving members money and time, and delivering delight every day. The platform also enables merchants and brands to engage with their members in a cost-effective manner. Founded in 2014, ShopBack now powers over US$5.5 billion in annual sales for over 20,000 online and in-store partners, and has rewarded shoppers with more than US$800 million (over S$1 billion) in Cashback to date. Through its innovative offerings, ShopBack continues to create value for both members and merchants. Notably, its payment solution, ShopBack Pay, offers members a convenient and rewarding payment option at checkout.

About the role
 
At ShopBack, our engineering teams build scalable platforms and use leading-edge technologies to deliver a world-class product. You will join a diverse and talented team of engineers with the ambition to shape the eCommerce landscape. We are seeking team members who strive to solve hard problems, take pride in delivering world-class products, and are strong team players.
 
You want to see your work make a difference every day. You are passionate about the craft and constantly seek better ways to solve tough problems.

Your Adventure Ahead

  • Architect, build, and evolve highly scalable, resilient, and secure platforms that power mission-critical systems
  • Design and implement next-generation CI/CD systems that enable rapid, reliable, and zero-downtime deployments
  • Develop internal platforms and tooling that empower engineering teams and elevate developer productivity and experience
  • Proactively shape capacity strategy, performance engineering, and resilience planning to support future growth
  • Own and advance a region-partitioned, multi-tenant platform architecture, optimizing for performance, availability, and efficiency
  • Collaborate with product and engineering teams to enhance platform capabilities and improve developer experience
  • Leverage AI-driven and agentic workflows to automate infrastructure operations, intelligent remediation, and system optimization
  • Lead system reliability initiatives, using incidents as opportunities to design out failure and improve system design
  • Contribute to a strong engineering culture through thought leadership, experimentation, and continuous improvement
  • Participate in a shared on-call rotation, with a focus on building systems that minimize operational overhead over time

Essentials to Succeed

  • 7+ years of relevant DevOps or SRE experience
  • Proven experience working on public cloud platforms (AWS, GCP, etc.)
  • Proven experience with containerization using Docker and Kubernetes
  • Experience with Infrastructure as Code using Terraform is a plus
  • Experience with application development in Python, JavaScript, or Go is a plus
  • Day-to-day use of AI to improve individual and team workflows

Technologies We Use & Love

  • Cloud: AWS
  • Infra: Kubernetes
  • Programming languages: Node.js / TypeScript / Python / Go
  • Relational database: Postgres
  • Cache: Redis
  • Message queue: Kafka, SQS
  • CI/CD: GitLab, Flux CD
  • Monitoring: Datadog / Prometheus
  • Networking: Istio
  • Big Data: Trino, Spark, S3, etc. 
  • Communication: Slack
  • Project Management: Jira / Confluence
  • Other technologies: Knative Eventing / Serving, Debezium + Kafka Connect, OpenSearch