Nord Security

Senior Site Reliability Engineer — Observability Engineer | NordVPN

Nord Security • LT
Python Hybrid
At NordLayer, we’re building cybersecurity that scales with the business.

A toggle-ready platform that helps modern teams thrive—without the security headaches. Trusted by 11,000+ global companies, NordLayer plugs into any tech stack and protects users across borders.

Your impact? Helping businesses stay protected and moving forward with future-ready network security.

NordVPN runs a global edge infrastructure serving millions of users. Knowing what's happening across that infrastructure in real time, at scale, and without drowning in noise is the problem this role exists to solve.

We're looking for a Senior Site Reliability Engineer focused on observability: designing monitoring systems, improving signal quality, reducing alert fatigue, and collaborating with data teams on anomaly detection. You'll own how we understand the health and behavior of our distributed systems.

Main Responsibilities

  • Design, build, and improve monitoring pipelines and observability tooling across globally distributed infrastructure
  • Define and implement service-level monitoring based on golden signals (latency, traffic, errors, saturation)
  • Reduce alert fatigue - build meaningful, actionable alerts that engineers trust
  • Develop and maintain custom exporters, scripts, and integrations for metrics and log collection
  • Collaborate with the data team on anomaly detection and data-driven operational insights
  • Understand service signals - know what to measure, why, and what the numbers actually mean

Core Requirements

  • Distributed systems observability - monitoring architecture, signal design, dashboarding
  • Golden signal thinking - you design monitoring around what matters, not what's easy to measure
  • Alert design - reducing noise, building actionable alerts, managing on-call sanity
  • Python - scripting, custom exporters, automation, data processing
  • Linux administration and debugging
  • Networking fundamentals

Bonus Points For

  • SaltStack
  • Advanced networking - traffic analysis, protocol-level debugging
  • Advanced data knowledge - aggregation strategies, downsampling, cardinality management, retention trade-offs
  • Proven track record of onboarding new systems/services into monitoring from scratch
  • Familiarity with agentic engineering - Claude Code, LLM integrations, MCP workflows

Tools You Will Use

  • Naemon (Nagios) and Gearmand
  • Prometheus-based exporters
  • Telegraf
  • Fluent Bit
  • VictoriaMetrics ecosystem
  • OpenSearch
  • Grafana

Salary Range

  • Gross salary: 5800-7400 EUR/month