Site Reliability Engineer

Momentum Financial Services Group · Toronto, Canada · 96d ago

Hybrid/Remote Python

About the role

Who We Are
At Momentum Financial Services Group, we help people move forward by reimagining how money works for those who need it most. With more than 40 years of experience, we’re the team behind Money Mart—Canada’s largest non-bank branch network—and a leader in financial solutions for underserved communities.

From short-term loans to money transfers and prepaid cards, we power the products, technology, and operations that connect millions of customers each year to the money they need, when they need it.

At MFSG, we work together across teams and functions to create something bigger than ourselves: solutions that remove barriers and give people access to money they might not get anywhere else. Whether you’re solving problems, building systems, or shaping strategy, your work fuels real support for real people.

We’ve Got You Covered

Compensation Philosophy – Competitive pay aligned with experience and market standards
Discretionary Annual Bonus – Rewarding both individual and company performance
Comprehensive Benefits – Health and dental coverage with premiums fully paid, plus access to an Employee Assistance Program
Retirement Plans – Helping you plan and save for the future
Hybrid Work Environment – Flexibility to balance remote and in-office collaboration; enjoy our corporate HQ spaces designed for teamwork and creativity
Perks and Rewards – Tuition reimbursement, professional development support, discounts through Perkopolis, and recognition programs that celebrate your impact

The Job: Site Reliability Engineer

The Site Reliability Engineer is responsible for ensuring the availability, performance, and resilience of the organization's digital banking and financial services platforms. This role focuses on automating operational processes, defining and maintaining service-level objectives, and engineering systems that can withstand and recover from failure. You will work closely with engineering, DevOps, QA, cybersecurity, and compliance teams to ensure platform reliability meets both technical and regulatory standards, while minimizing risk to production systems through proactive monitoring, incident response, and continuous improvement of the software delivery lifecycle.

How You’ll Make an Impact: 

Reliability Planning & Governance

Define and maintain service-level objectives (SLOs), error budgets, and reliability targets aligned with business goals and compliance deadlines.
Oversee the end-to-end service lifecycle, from code integration to production deployment, with a focus on stability and risk reduction.
Ensure all changes comply with relevant financial regulations.
Conduct reliability risk and blast-radius assessments before production changes.
Coordinate go/no-go decisions with engineering, QA, compliance, and operations stakeholders.

Execution & Delivery

Own build, test, and deployment pipelines across multiple environments (staging, UAT, production), ensuring changes are safe, repeatable, and observable.
Design and maintain automated CI/CD pipelines and enforce version control policies (e.g., Git Flow) to reduce toil and human error.
Engineer zero-downtime deployments and low-impact change strategies for high-availability systems.
Develop and maintain rollback, failover, and disaster recovery runbooks for production incidents.

Compliance & Security Oversight

Collaborate with Information Security and Compliance teams to validate that infrastructure and deployment practices meet data protection and privacy standards.
Maintain audit-ready documentation of change activity, incident timelines, and remediation records.
Support internal and external audits with detailed operational and change history.

Continuous Improvement

Drive automation, standardization, and observability improvements across the production environment.
Conduct post-incident reviews (blameless post-mortems) to identify systemic failures and prevent recurrence.
Contribute to DevOps and SRE maturity initiatives across engineering teams.

Stakeholder Communication

Act as the central liaison between product, development, and compliance teams on production health and change risk.
Communicate change scope, reliability risks, and incident status clearly to both technical and non-technical stakeholders.
Provide regular reliability reporting, SLO performance metrics, and incident trends to senior management.

What You Bring:

Technical Proficiency:

CI/CD tools.
Cloud platforms (AWS, Azure).
Containers and orchestration (Docker, Kubernetes).
Scripting languages (Python, Bash).
Infrastructure as Code (Terraform, Ansible).
Observability and monitoring tools.

Soft Skills:

Strong cross-functional collaboration and communication across engineering and compliance teams.
Rigorous attention to detail with a proactive approach to risk and failure detection.
Ability to perform under pressure and respond decisively during incidents and regulatory deadlines.

Education + Experience:

Bachelor's degree in Computer Science, Information Technology, or related field.
3-5 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering within financial services or fintech.
Hands-on experience maintaining reliability for real-time transaction systems, mobile banking, or payment gateways.
Familiarity with regulatory compliance requirements and their operational implications for production systems.

Ready to apply your Site Reliability Engineering expertise to make a real impact? Apply today and help shape the future of tech at MFSG.

Committed to Equal Opportunity:

MFSG is committed to accommodating applicants up to the point of undue hardship during the recruitment, assessment and selection process. If you are selected for an interview, please notify MFSG if you require accommodation in respect of the materials or procedures used at any time during this process. If you require accommodation, MFSG will work with you to determine how to meet your needs.

Please note: The salary range for this position is between C$110,000 and C$120,000.

About MFSG – Our Commitment to Responsible Innovation

At MFSG, we are committed to building innovative solutions grounded in ethical, transparent, and responsible use of data and technology. Aligned with the principles outlined in Canada’s proposed Artificial Intelligence and Data Act (AIDA), we take a proactive approach to ensuring that any AI or data-driven systems we use are safe, fair, and accountable.

This posting is for a current position within our organization, offering the opportunity to contribute to meaningful, responsible innovation that supports our employees, clients, and communities.

We prioritize strong data governance, clear communication around how systems work, and safeguards that reduce risks and protect individuals. Our focus is on developing tools and processes that promote equity, reliability, and trust, supported by ongoing monitoring and continuous improvement.

Joining MFSG means contributing to a future-focused organization that values both innovation and integrity, where your work helps shape solutions that responsibly support our employees, clients, and communities.
