Senior AI/ML Engineer - Site Reliability Engineering
Royal Bank of Canada · Toronto, Ontario 🇨🇦
Job Details
Job Description
WHAT IS THE OPPORTUNITY?
Join RBC's Site Reliability Engineering team as a founding member building the bank's first Agentic AI platform for software reliability and resiliency. You'll pioneer intelligent automation systems that autonomously prevent incidents, accelerate response times, and transform how we maintain resilience across enterprise systems. This is a rare opportunity to shape the future of AI-driven reliability at scale. Your innovations will protect millions of daily customer transactions and sign-ins. With a clear technical leadership trajectory, you'll architect cutting-edge solutions at the intersection of AI and infrastructure, setting the standard for autonomous operations in financial services.
WHAT WILL YOU DO?
• Design and implement end-to-end Agentic AI solutions that autonomously detect anomalies, identify root causes, and resolve incidents with minimal human intervention
• Develop intelligent automation frameworks using LangChain and LangGraph to create context-aware agents that learn from incident patterns and continuously improve response strategies
• Build ML-powered monitoring and alerting systems that distinguish signal from noise, dramatically reducing false positives and improving MTTD (Mean Time to Detect) and MTTI (Mean Time to Identify)
• Architect scalable, production-grade solutions on OpenShift and Kubernetes that process real-time system metrics and telemetry data at enterprise scale
• Implement infrastructure-as-code using Ansible and containerization (Docker) to ensure reproducibility, consistency, and rapid deployment across environments
• Partner with incident management and operations teams to translate operational pain points into AI-driven automation opportunities that measurably reduce toil
• Establish and track KPIs focused on reducing MTTR (Mean Time to Resolve), MTTD, and MTTI while improving system reliability
• Lead technical design discussions and contribute to architectural decisions that shape RBC's AI-powered reliability strategy
WHAT DO YOU NEED TO SUCCEED?
Must have:
• Strong ML engineering background with hands-on experience designing, training, and deploying machine learning models in production environments
• Proven expertise in Agentic AI frameworks and tools (LangChain, LangGraph, AutoGen, CrewAI, or similar) and building autonomous, multi-agent systems
• Deep understanding of Model Context Protocol (MCP) for enabling AI agents to interact with external systems and data sources
• Experience building AI agents with tool-calling capabilities, memory management, and reasoning chains
• Proficiency in Python and experience with ML libraries (scikit-learn, TensorFlow, PyTorch, or similar)
• Working knowledge of containerization (Docker), orchestration (Kubernetes/OpenShift), and infrastructure-as-code principles (Ansible, Terraform)
• Demonstrated ability to translate complex technical concepts into business value and collaborate effectively with cross-functional teams
Nice-to-have:
• Prior experience in Site Reliability Engineering, DevOps, or infrastructure monitoring roles
• Familiarity with observability tools (Prometheus, Grafana, ELK stack) and incident management platforms (PagerDuty, ServiceNow)
• Experience with LLMs, prompt engineering, and retrieval-augmented generation (RAG) architectures
• Background in financial services or other highly regulated industries with strict reliability requirements
WHAT'S IN IT FOR YOU?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
• A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
• Leaders who support your development through coaching and managing opportunities
• Ability to make a difference and lasting impact
• Work in a dynamic, collaborative, progressive, and high-performing team
• A world-class training program in financial services
• Flexible work/life balance options
• Opportunities to do challenging work
Job Skills
Docker Kubernetes Architecture, LangChain (Framework), LangGraph, Machine Learning (ML), Python (Programming Language), Red Hat Ansible, Red Hat OpenShift
Additional Job Details
Address: RBC WATERPARK PLACE, 88 QUEENS QUAY W:TORONTO
City: Toronto
Country: Canada
Work hours/week: 37.5
Employment Type: Full time
Platform: TECHNOLOGY AND OPERATIONS
Job Type: Regular
Pay Type: Salaried
Posted Date: 2026-04-27
Application Deadline: 2026-05-29
Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above
Our Employment Opportunities
At RBC, we are guided by living shared values of Client First, Integrity, Collaboration, Respect and Excellence and winning together as One RBC. We believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.
Join our Talent Community
Stay in the know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips, and recruitment events that matter to you.
Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities at jobs.rbc.com.
RBC is presently inviting candidates to apply for this existing vacancy. Applying to this posting allows you to express your interest in this current career opportunity at RBC. Qualified applicants may be contacted to review their resume in more detail.
Verification
75/100 (high)
+ Uses recruiter-style language ("our client"), suggesting a third-party posting
+ Recently posted (4 days ago) -- actively hiring
− Matches a ghost job pattern: "join our talent community"
Agency: Yes
Verified by: system on May 11
Trust Signals
Listing Age: 28 days
Multi-Source: Single source
Repost Count: 0
First Seen: May 11
Last Seen: May 14