Salary: INR 2500000-3000000 / year (based on experience)
Expected Notice Period: 30 Days
Shift: (GMT+05:30) Asia/Kolkata (IST)
Opportunity Type: Remote
Placement Type: Full Time Permanent position (Payroll and Compliance to be managed by: Lyzr)
(*Note: This is a requirement for one of Uplers' clients - Lyzr)
What do you need for this opportunity?
Must have skills required:
FinOps, AWS, Python, PowerShell, Bash, DevOps, SRE, System Engineering
Lyzr is Looking for:
At Lyzr AI, this role sits at the heart of platform reliability and scale. You will own the availability, security, and performance of mission-critical AI systems powering our customers, ensuring they run flawlessly at all times. Acting as the final escalation point, you’ll blend deep technical expertise with SRE principles to build resilient, automated, and cost-efficient cloud infrastructure.
Roles & Responsibilities
System Ownership & Reliability
End-to-End Ownership: Own the health and lifecycle of production systems, ensuring high availability (HA) and meeting strict Service Level Objectives (SLOs).
Deep-Dive Debugging: Troubleshoot and resolve complex issues across infrastructure, application code, and networking layers. You will be the escalation point for hard-to-solve production incidents.
Incident Management: Lead Root Cause Analysis (RCA) processes for outages, driving permanent fixes and architectural changes to prevent recurrence.
Operational Excellence & Security
Disaster Recovery (DR): Design and manage DR strategies; conduct periodic failover drills to ensure business continuity.
Security & Compliance: Oversee OS patching, vulnerability scanning, and adherence to industry compliance standards (SOC2/HIPAA/ISO). Maintain strict IAM policies and security groups.
Observability: Build and maintain comprehensive monitoring, logging, and alerting frameworks (CloudWatch, Prometheus, Datadog) to ensure early detection of anomalies.
Maintenance: Define and maintain backup/restore processes and routine maintenance windows with minimal downtime.
IaC & Tooling: Develop automation tools and manage infrastructure using Terraform or CloudFormation, along with scripting in Python, Go, or Bash.
Self-Healing Systems: Implement auto-remediation workflows where systems can detect and resolve common issues (e.g., restarting failed services, rotating bad nodes) without human intervention.
Performance Tuning: Optimize application runtime parameters, database queries, and system kernel settings for maximum throughput.
Cloud & Cost Optimization (FinOps)
AWS Management: Architect and manage extensive AWS services—EC2, EKS/ECS, RDS, S3, Lambda, VPC, and Route53.
Cost Optimization: Rightsize instances, manage Reserved/Spot instances, and identify idle resources to reduce waste.
Capacity Planning: Collaborate with engineering teams to forecast infrastructure needs, ensuring we scale to meet demand without over-provisioning.
Technical Qualifications
Must-Have Skills
Experience: 2-5 years in SRE, DevOps, or Systems Engineering roles with a strong focus on AWS.
Cloud Proficiency: Expert-level knowledge of AWS core services and architecture standards.
Scripting: Strong proficiency in Python or Shell/Bash for automation.
Cost Tools: Experience with AWS Cost Explorer, Trusted Advisor, or third-party tools (e.g., CloudHealth) to drive financial efficiency.
Monitoring: Hands-on experience with tools like Grafana, Prometheus, ELK Stack, or Splunk.
Preferred Qualifications
Experience in Hybrid Cloud environments (AWS + On-Prem/Data Center).
Knowledge of container orchestration (Kubernetes/EKS).
Understanding of database administration and replication (PostgreSQL, MySQL, or DynamoDB).
Interview Process
R1: Technical Round
R2: Culture + Technical Round
How to apply for this opportunity?
Step 1: Click on Apply! and register or log in on our portal.
Step 2: Complete the screening form and upload your updated resume.
Step 3: Increase your chances of getting shortlisted and meet the client for the interview!
About Uplers:
Our goal is to make hiring reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant contractual onsite opportunities and progress in their career. We will support any grievances or challenges you may face during the engagement.
(Note: There are many more opportunities apart from this on the portal. Depending on the assessments you clear, you can apply for them as well).
So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!