Staff Software Engineer

Nium • IN

Nium, the Leader in Real-Time Global Payments

Nium, the leading global infrastructure for real-time cross-border payments, was founded on the mission to deliver the global payments infrastructure of tomorrow, today. With the onset of the global economy, its payments infrastructure is shaping how banks, fintechs, and businesses everywhere collect, convert, and disburse funds instantly across borders. Its payout network supports 100 currencies and spans 190+ countries, 100 of them in real time. Funds can be disbursed to accounts, wallets, and cards and collected locally in 40 markets. Nium's growing card issuance business is already available in 34 countries. Nium holds regulatory licenses and authorizations in more than 40 countries, enabling seamless onboarding, rapid integration, and compliance – independent of geography. The company is co-headquartered in San Francisco and Singapore.

We are looking for an experienced Software Engineer to work as a cross-platform Architect and hands-on technologist for our Fintech Platform teams to take our platform offerings to the next level.

You will also lead our Enterprise Reliability initiatives to design, build and maintain highly available architectures for our Fintech Platforms.

In this role, you will work closely with full-stack teams across the Nium platform and ensure that our systems are built from the ground up to meet security, stability, and observability requirements.

This is a high-impact, high-visibility role: you will ensure that our customers get high-quality software that is highly available, secure, and performant.

Expectations include:

Drive process and architectural changes that can help us maintain 99.99% uptime reliably, as we scale massively across Payment platforms.

Work alongside the AWS Infrastructure team to ensure successful adoption and rollout of GitOps-based EKS deployments across the entire organization

Work cross-functionally with the engineering and QA leads across squads

Develop AI-enabled tools to automate workflows and improve productivity across the organization

Own the Observability charter and develop cutting-edge tools and frameworks for incorporating observability into the development culture

Simplify and modernize our automation frameworks to incorporate AI workflows and integrate more tightly with our GitOps-based CI/CD workflow.

RESPONSIBILITIES

Work with cross-functional teams across product, development, compliance, and security to understand the roadmap and translate it into scalable system design

Define metrics for measuring our service availability and performance

Proactively build and implement tools and services that make developers and tech support more effective at their jobs

Ensure and promote security, high availability/zero downtime and scalability in all organizational and team implementations

Perform critical path analysis and single point of failure (SPOF) analysis

Develop runbooks for thorough service feature documentation and troubleshooting, e.g. to enable faster MTTR when a remittance failure happens

Organize training for Engineers to become part of On-call teams

Build and maintain AI-powered tools for enhancing the internal knowledge base and for applications like root cause analysis, using techniques like RAG

Build AI-powered tools that act as copilots for our Customer Support (CS) agents to offer best-in-class support

Ensure feature implementations are low maintenance using continuous integration and continuous deployment methods with a focus on Automated Test feedback and rollback

Identify existing open-source or proprietary tools that can solve business problems, and evaluate them to inform the build-or-buy decision

Set the vision for Engineering Excellence in our production systems and inspire full-stack teams to implement the right design and monitoring standards

Use your strong knowledge of how infrastructure and application development interact to switch comfortably between architecture and hands-on feature development work as needed

Improve the Production Incident Management system and ensure that actionable insights are derived from live site incident RCA and postmortem sessions

Be a frontline person during Live Site Incidents and stay calm under pressure to drive collaborative resolution

QUALIFICATIONS

B.S. or M.S. in Computer Science and 6+ years of software development experience

Strong software development fundamentals (Data structures, Algorithms, problem-solving, OO design, and systems architecture).

Strong understanding of object-oriented software development

Understanding of large and complex code bases, including API design techniques to help keep them clean and maintainable.

Experience handling live production support for large-scale systems

Proficiency in one statically typed and one dynamically typed language, and good knowledge of frameworks like Spring, Hibernate, and Node.js Express

Knowledge of multithreading, memory management specific to mobile devices, and caching mechanisms

Passion for delivering excellence in software systems, with a can-do attitude

Experience implementing observability platforms using product suites like Sumo Logic, Datadog, New Relic, ELK, or Prometheus

Good to have: experience with infrastructure automation and monitoring tools such as Terraform, Helm, Ansible, Puppet, and Chef
