Job Description

About The Opportunity

We are a fast-paced Cloud Infrastructure & Managed Services organization working in enterprise cloud operations and platform reliability. We deliver 24/7 production support, incident management, and automation for large-scale AWS environments serving customers across India. This on-site role in India focuses on stabilizing production systems, reducing operational toil, and driving reliability improvements.

Primary title (standardized): AWS Production Support Engineer

Role & Responsibilities

  • Monitor, triage, and resolve production incidents in AWS; own the incident lifecycle from detection to closure and escalate to engineering teams promptly.
  • Troubleshoot infrastructure and application issues across EC2, RDS, VPC, ELB, S3, and Lambda; coordinate cross-functional remediation actions.
  • Maintain and improve infrastructure-as-code templates and pipelines using CloudFormation and Terraform to ensure safe, repeatable deployments.
  • Author and update runbooks, automation scripts (Bash/Python), and operational playbooks to reduce manual intervention.
  • Operate and tune observability and alerting stacks (CloudWatch, Prometheus, Grafana, ELK); create dashboards and actionable alerts to minimize noise and reduce MTTR.
  • Participate in on-call rotations, conduct RCA sessions, document findings, and drive corrective actions to improve system reliability and capacity planning.

Skills & Qualifications

Must-Have

  • AWS EC2
  • AWS S3
  • AWS CloudWatch
  • AWS IAM
  • AWS Lambda
  • AWS RDS
  • Terraform
  • CloudFormation

Preferred

  • Docker
  • Kubernetes
  • Prometheus

Additional Qualifications

  • Proven experience in production support or site-reliability/DevOps roles for AWS-hosted applications (on-site availability in India required).
  • Strong Linux administration and scripting ability (Bash or Python) for automation and troubleshooting.
  • Experience with CI/CD tooling (e.g., Jenkins/GitLab) and familiarity with logging/observability best practices.

Benefits & Culture Highlights

  • Hands-on ownership of critical production systems with clear impact on customer experience.
  • Learning-focused environment with opportunities to upskill in IaC, monitoring, and reliability engineering.
  • Competitive compensation, on-site collaboration, and a structured on-call rota.

Skills: Terraform, Prometheus, Docker, AWS Production Analyst, Kubernetes, AWS Lambda


Job Details

Role Level: Mid-Level
Work Type: Full-Time
Country: India
City: Hyderabad, Telangana
Company Website: www.viraajhrsolutions.com
Job Function: Information Technology (IT)
Company Industry/Sector: Software Development
