This job is with Deutsche Bank, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.
Position Overview
Role Description
You will work closely with application teams to ensure stable, well-monitored applications that are resilient to faults. You will agree on and review Service Level Objectives (SLOs) to achieve high availability for applications based on their criticality. You will maintain Error Budgets for the application teams and block releases during periods of production instability or reduced availability.
You will focus on reducing manual toil, improving operational reliability and driving automation-first practices. This is a hands-on role with a strong focus on implementing SRE practices and reducing toil for Developer Tools.
What We’ll Offer You
As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:
Best-in-class leave policy
Gender-neutral parental leave
100% reimbursement under childcare assistance benefit (gender neutral)
Sponsorship for industry-relevant certifications and education
Employee Assistance Program for you and your family members
Comprehensive Hospitalization Insurance for you and your dependents
Accident and Term Life Insurance
Complimentary health screening for employees aged 35 and above
Your Key Responsibilities
Drive stability, performance and reliability improvements for TDI Engineering applications.
Build monitoring and alerting solutions that flag failures and performance issues across TDI Engineering applications, helping us provide the optimum service level to users.
Provide feedback loops to continually improve application resilience across multiple application teams. Collaborate with product owners and engineering teams to prioritize the reliability and stability of these applications.
Define, measure and maintain SLOs and Error Budgets to ensure availability for end users and to achieve appropriate levels of application stability.
Identify opportunities for automation and self-service capabilities, and implement them to eliminate toil for both the application teams and the SRE team.
Manage outage resolution, own root cause analysis (RCA), conduct blameless postmortems, and agree on actions that reduce the likelihood of repeat failures.
Your Skills And Experience
Bachelor’s degree from an accredited college or university with a concentration in Computer Science or IT-related discipline (or equivalent work experience or diploma).
2+ years of experience in IT in large corporate environments, specifically in controlled production environments.
At least 1 year of demonstrable Site Reliability Engineering experience.
Excellent analytical and problem-solving skills.
Experience implementing observability solutions using industry-standard tools.
Scripting skills (Groovy, shell/Bash or any equivalent) and familiarity with cron scheduling.
Experience with mid-range technologies and platforms, i.e. UNIX/Linux, Oracle Database and Nginx.
Good to have
Understanding of and experience with Developer Tools (Jira, Confluence, Bitbucket, TeamCity, Artifactory, uDeploy) as an enterprise-level administrator managing applications with a large user base.
Knowledge and experience of observability tools such as Grafana and Prometheus.
How We’ll Support You
Training and development to help you excel in your career
Coaching and support from experts in your team
A culture of continuous learning to aid progression
A range of flexible benefits that you can tailor to suit your needs
About Us And Our Teams
Please visit our company website for further information:
https://www.db.com/company/company.htm
We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.
Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.
We welcome applications from all people and promote a positive, fair and inclusive work environment.