Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.
Join our Big Data team and drive exciting AI-powered, data-driven innovation at Binance. Develop platforms that transform massive datasets into actionable insights, enhancing personalized experiences, ensuring regulatory alignment, and supporting global operations. Help shape Binance’s AI and analytics capabilities with impactful and forward-thinking solutions.
This role is 100% remote (work from home).
Responsibilities
- Handle production incidents and conduct post-mortem analysis to improve system stability
- Design, deploy, monitor, and troubleshoot Kafka or Redis clusters in production environments, ensuring optimal performance and reliability
- Work closely with development teams to ensure seamless deployment of applications and systems
- Manage and optimize cloud infrastructure for performance, cost, and reliability
- Develop DevOps platforms such as online load testing and change management systems
- Continuously explore and integrate AI-driven insights into operational processes to improve reliability, reduce noise, and empower engineering teams with intelligent decision-making
Requirements
- 2-8 years of hands-on experience in Kafka or Redis operations in large-scale production environments, with the ability to work with developers to optimize code
- Proficient with at least one public cloud: AWS, AliCloud, or Tencent Cloud
- Proficient in at least one of Python, Go, or Java, and in SQL
- Hands-on experience with containerization and orchestration (Kubernetes)
- Strong experience with CI/CD tools such as GitHub Actions, Ansible, Terraform, etc.
- Proficient in both English and Chinese for efficient cross-team collaboration
Bonus
- Experience leveraging LLMs or AI frameworks (OpenAI, Dify, Agno, LangChain) to enhance automation in DevOps infrastructure operations, including intelligent alert triage, RCA (Root Cause Analysis), and chat-based operations (ChatOps)
- Practical experience building or operating AIOps systems (anomaly detection, alert correlation, automated healing, or RCA)
- Familiarity with LLM-based DevOps automation (e.g., building chat-based ops assistants or AI-driven observability workflows)