Senior SRE DevOps Engineer (Remote from Romania)

TypeScript, Python, Remote

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior SRE DevOps Engineer in Romania.

This is a high-impact role at the intersection of software engineering and cloud operations, focused on building and maintaining resilient, large-scale infrastructure for real-time communication systems. You will design, automate, and optimize cloud-native environments that support mission-critical connectivity under strict latency and reliability constraints. The position combines hands-on coding with deep operational ownership, empowering you to shape infrastructure strategy while improving developer productivity. Working in a remote-first, highly technical environment, you’ll collaborate across engineering teams to ensure scalability, security, and performance. If you thrive on solving distributed systems challenges and building production-grade reliability tooling, this role offers both ownership and influence.

Accountabilities:

Designing and implementing SLI/SLO frameworks with error budgets to guide reliability and performance decisions.

Building and maintaining AWS-based production infrastructure using Infrastructure as Code (Terraform, CloudFormation), including ECS, EKS/Kubernetes, and microservices orchestration.

Developing internal tools, automation frameworks, and reliability services in TypeScript, Python, or similar languages to enhance operational efficiency.

Leading incident response processes, conducting root cause analyses, and creating automated runbooks to reduce MTTR.

Architecting and maintaining CI/CD pipelines for backend services, mobile applications, and IoT firmware across cloud and on-prem environments.

Implementing comprehensive observability using OpenTelemetry, distributed tracing, metrics exporters, and alerting systems.

Managing data and messaging services such as PostgreSQL (RDS), Redis/ElastiCache, and SQS, along with networking and access components (ALB/NLB, VPC, IAM).

Enforcing strong security standards, including IAM policies, encryption, secrets management, vulnerability management, and compliance auditing.

Requirements:

The ideal candidate is both a strong software engineer and an experienced platform reliability expert. Key qualifications include:

7+ years of experience in SRE, DevOps, or Platform Engineering roles with daily hands-on coding responsibilities.

Proficiency in at least one backend language (TypeScript/Node.js, Python, or Go) for developing automation tools, internal services, and reliability frameworks.

Deep expertise in AWS services (ECS, EKS, RDS, ElastiCache, SQS, VPC, IAM, CloudWatch).

Strong experience with Infrastructure as Code tools (Terraform, CloudFormation, or Pulumi), including modular design and state management.

Proven experience designing and maintaining CI/CD pipelines in both cloud and on-prem environments.

Solid understanding of container orchestration (Docker, Kubernetes, Helm) and distributed systems patterns such as circuit breakers, retries, and graceful degradation.

Experience operating production databases (PostgreSQL, Redis) and message queues.

Strong security knowledge covering network segmentation, encryption, secrets management, and incident response.

Preferred: experience with real-time communication infrastructure (SIP, RTP, WebRTC), telecom systems, IoT pipelines, or optimization for satellite and other low-bandwidth environments.

Benefits:

Competitive compensation package

Flexible remote work environment with autonomy and ownership

Opportunity to build and scale critical communication infrastructure

Exposure to cutting-edge technologies across cloud, IoT, telecom, and distributed systems

High-impact role with direct influence on reliability and platform architecture

Collaborative, technically advanced engineering culture
