At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.
Senior Reliability Engineer:
Senior Reliability Engineer (Senior level)
Description
Reliability Engineering (SRE) is a modern way of delivering IT Solutions by imbibing Software engineering principles in Service Delivery to reduce IT Risk to business, improve business resilience, attain predictability & reliability, optimize cost of IT Infra and Ops
A Reliability Engineer typically has deep software engineering experience encompassing design, build, deploy and manage / maintain an IT solution ensuring resilience, reliability, and performance.
A Reliability Engineer is a bridge between development and operations by applying a software engineering mindset to the development, deployment, and maintenance of applications to maximize system reliability & automation, while improving efficiencies by optimizing resources
Responsibilities
Defining SLA/SLO/SLI for a product / service
Engineering in resilient design and implementation practices into solutions as they go through the product life cycle
Engineering out manual effort (Toil) through the development of automated processes and services (e.g., Automated Management of Systems, CI/CD improvements)
Developing Observability Solutions to track, report, and measure SLA adherence
Help Optimize Cost of IT Infra & Operations - FinOps
Critical Situation management
SOP / Runbook automation, Toil reduction
Data Analytics & System trend analysis
Typical Skills And Background
7+ years of experience in software product engineering principles, processes and systems
Hands-on experience in Java / J2EE, one of web server (Apache Tomcat or IBM HTTP Server), one of the application servers (Tomcat/WebSphere), and any major RDBMS like Oracle
Hands-on experience in at least one CI-CD (Azure DevOps, GitLab CI/CD, Jenkins) and IaC tools (Terraform, AWS CloudFormation, Ansible etc.)
Experience in at least one cloud technology (AWS/Azure/GCP etc. and Docker, Pivotal, Kubernetes, OpenShift etc.) and its reliability tools (Azure AppInsight, CloudWatch, Azure Monitor etc.)
Experience in Linux (RHEL) operating system performance monitoring parameters and their interpretation, commands used for monitoring
Experience in Observability - APM tools (Dynatrace, AppDynamics etc.), metrics / log consolidation (Splunk) and ELK Stack
Defining NFRs and SLA/SLO/SLI agreement for a product / platform / services
Knowledge on queuing models used, thread pools, request servicing processes etc.
Knowledge in Web Services, SOA, ESB (DataPower), RESTFul
Knowledge of application design patterns, J2EE application architectures, Microservices, Spring boot & Cloud native architectures
Experience in performance tuning on Application Servers (Tomcat/WAS)
Experience in trouble shooting Performance / Scalability / Availability issues
Experience in Thread dump, heap dump generation & analysis
Knowledge on Query tuning and database designs & models
Knowledge at least one automation scripting language like Python
Mastery in collaborative software development using Git, Jira, Confluence etc.
AI/ML & Data Analytics knowledge and experience is a desirable
EY | Building a better working world
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.
Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate.
Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.
Searching, interviewing and hiring are all part of the professional life. The TALENTMATE Portal idea is to fill and help professionals doing one of them by bringing together the requisites under One Roof. Whether you're hunting for your Next Job Opportunity or Looking for Potential Employers, we're here to lend you a Helping Hand.
Disclaimer: talentmate.com is only a platform to bring jobseekers & employers together.
Applicants
are
advised to research the bonafides of the prospective employer independently. We do NOT
endorse any
requests for money payments and strictly advice against sharing personal or bank related
information. We
also recommend you visit Security Advice for more information. If you suspect any fraud
or
malpractice,
email us at abuse@talentmate.com.
You have successfully saved for this job. Please check
saved
jobs
list
Applied
You have successfully applied for this job. Please check
applied
jobs list
Do you want to share the
link?
Please click any of the below options to share the job
details.
Report this job
Success
Successfully updated
Success
Successfully updated
Thank you
Reported Successfully.
Copied
This job link has been copied to clipboard!
Apply Job
Upload your Profile Picture
Accepted Formats: jpg, png
Upto 2MB in size
Your application for DET-TT-Resilience And Reliability Engineer
has been successfully submitted!
To increase your chances of getting shortlisted, we recommend completing your profile.
Employers prioritize candidates with full profiles, and a completed profile could set you apart in the
selection process.
Why complete your profile?
Higher Visibility: Complete profiles are more likely to be viewed by employers.
Better Match: Showcase your skills and experience to improve your fit.
Stand Out: Highlight your full potential to make a stronger impression.
Complete your profile now to give your application the best chance!