Senior Site Reliability Engineer

Go, Python

Our Journey

The ShopBack Group is Asia-Pacific’s leading shopping, rewards, and payments platform, serving over 60 million shoppers across 13 markets. In 2025, the Group continued its global growth with its expansion into North America. Driven by the vision to make every day more rewarding, ShopBack is dedicated to saving members money and time, and delivering delight every day. The platform also enables merchants and brands to engage with their members in a cost-effective manner. Founded in 2014, ShopBack now powers over US$5.5 billion in annual sales for over 20,000 online and in-store partners, and has rewarded shoppers with more than US$800 million (over S$1 billion) in Cashback to date. Through its innovative offerings, ShopBack continues to create value for both members and merchants. Notably, its payment solution, ShopBack Pay, offers members a convenient and rewarding payment option at checkout.

About the role

At ShopBack, our engineering teams build scalable platforms and use leading-edge technologies to deliver a world-class product. You will join a diverse and talented team of ambitious engineers who want to make an impact on the eCommerce landscape. We are seeking team members who strive to solve hard problems, take pride in delivering world-class products, and are strong team players.

You are someone who wants to see the impact of your work making a difference every day. You find passion in the craft and are constantly seeking improvement and better ways to solve tough problems.

Your Adventure Ahead

Architect, build, and evolve highly scalable, resilient, and secure platforms that power mission-critical systems

Design and implement next-generation CI/CD systems that enable rapid, reliable, and zero-downtime deployments

Develop internal platforms and tooling that empower engineering teams and elevate developer productivity and experience

Proactively shape capacity strategy, performance engineering, and resilience planning to support future growth

Own and advance a region-partitioned, multi-tenant platform architecture, optimizing for performance, availability, and efficiency

Collaborate with product and engineering teams to enhance platform capabilities and improve developer experience

Leverage AI-driven and agentic workflows to automate infrastructure operations, intelligent remediation, and system optimization

Lead system reliability initiatives, using incidents as opportunities to design out failure and improve system design

Contribute to a strong engineering culture through thought leadership, experimentation, and continuous improvement

Participate in a shared on-call rotation, with a focus on building systems that minimise operational overhead over time

Essentials to Succeed

7+ years of relevant DevOps or SRE experience

Proven experience working on public cloud platforms (AWS, GCP, etc.)

Proven experience with containerization using Docker and Kubernetes

Experience with Infrastructure as Code using Terraform is a plus

Experience with application development in Python, JavaScript, or Go is a plus

Day-to-day use of AI to improve individual and team workflows

Technologies We Use & Love

Cloud: AWS

Infra: Kubernetes

Programming languages: Node.js / TypeScript / Python / Go

Relational database: Postgres

Cache: Redis

Message queue: Kafka, SQS

CI/CD: GitLab, Flux CD

Monitoring: Datadog / Prometheus

Networking: Istio

Big Data: Trino, Spark, S3, etc.

Communication: Slack

Project Management: Jira / Confluence

Other technologies: Knative Eventing / Serving, Debezium + Kafka Connect, OpenSearch
