Senior Backend Engineer - Service Monitoring & Observability

Go, Rust, Java | Remote

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

Responsibilities

Lead or participate in the design and development of service monitoring modules, including log management, performance metrics collection, and automated anomaly alerting systems.

Build and optimize observability solutions using Prometheus, Grafana, Spring Boot Actuator, and related monitoring toolchains; own deployment, tuning, and stability improvement.

Develop customized monitoring metrics (e.g., JVM memory usage, thread pool health, API response latency) and integrate them into Prometheus/Grafana for visualization and alert rule configuration.

Analyze large-scale monitoring data to identify performance bottlenecks such as database slow queries, latency spikes, or resource contention, and drive end-to-end optimization.

Collaborate with backend, SRE, and platform engineering teams to enhance system reliability, scalability, and real-time monitoring coverage across services.

Contribute to internal tooling, automation frameworks, and best-practice guidelines to elevate observability standards across the engineering organization.

Requirements

Solid backend engineering experience with strong proficiency in Java (Spring Boot, microservices architecture), multithreading, and JVM performance tuning.

Experience with mainstream monitoring ecosystems such as Prometheus, Grafana, and Spring Boot Actuator, including hands-on deployment and configuration in production environments.

Practical experience building custom metrics, dashboards, and alerting rules, and troubleshooting end-to-end system performance issues.

Familiarity with database and middleware performance diagnostics (e.g., slow SQL, Redis/Kafka latency, connection pool tuning).

Good understanding of system-level performance concepts: CPU, memory, I/O, GC, thread pools, network stack, etc.

Experience with Go or Rust is a strong plus.

Strong analytical mindset, ownership, and ability to work in a fast-paced, highly distributed environment.
