Sr. Site Reliability Engineer - 10823

Coupa • IN

Java • Python • Hybrid

Coupa makes margins multiply through its community-generated AI and industry-leading total spend management platform for businesses large and small. Coupa AI is informed by trillions of dollars of direct and indirect spend data across a global network of 10M+ buyers and suppliers. We empower you with the ability to predict, prescribe, and automate smarter, more profitable business decisions to improve operating margins.

Why join Coupa?

🔹 Pioneering Technology: At Coupa, we're at the forefront of innovation, leveraging the latest technology to empower our customers with greater efficiency and visibility in their spend.

🔹 Collaborative Culture: We value collaboration and teamwork, and our culture is driven by transparency, openness, and a shared commitment to excellence.

🔹 Global Impact: Join a company where your work has a global, measurable impact on our clients, the business, and each other.

Learn more on the Life at Coupa blog, where our employees share their experiences of working at Coupa.

What You'll Do:

Build and provision enterprise-grade data, messaging, and analytics platforms in the public cloud

Ensure that data, services, and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective

Administer Linux machines, web servers, application servers, and databases, and provide infrastructure support for products and businesses

Own end-to-end availability and performance of mission-critical services and build automation to prevent problem recurrence

Develop tools and automation in Ruby, Python, or similar languages to increase availability and performance

Collaborate with Product and Release Engineering for new product releases and maintenance

Coordinate change management

Participate in incident response and blameless postmortems

Participate in 24×7 on-call rotation for after-hours emergencies

What You Will Bring to Coupa:

Bachelor’s degree and 7+ years of professional experience

3+ years of production support for Elasticsearch/Redis/Kafka (Elasticsearch experience is a must)

3+ years of production system administration and web operations experience

2+ years of programming experience in Ruby, Java, Perl, Python, or equivalent

2+ years of experience with configuration management tools such as Chef, Puppet, Salt, or equivalent

Experience with AWS or a comparable cloud provider

Experience with Infrastructure-as-Code products like Terraform

Experience in massive-scale web operations

Expertise in problem-solving and analyzing globally distributed systems

Excellent written and verbal communication skills
