Job Description

Overview

This role is responsible for the overall stability of production applications, including the reliability, availability, scalability, and efficiency of our production systems and platforms. The Operations Engineer will collaborate with cross-functional teams, including Software Engineering, Service Reliability, Infrastructure, and Business Operations, to streamline processes, manage day-to-day operations, monitor system health, and quickly resolve incidents.

The ideal candidate is skilled in problem-solving, process automation, and root cause analysis, with a passion for operational excellence and continuous improvement.

Responsibilities

  • Monitor production systems, applications, and infrastructure to ensure high availability and performance.
  • Troubleshoot and resolve operational issues, providing timely escalation and communication to stakeholders.
  • Perform root cause analysis (RCA) and drive permanent fixes to recurring problems.
  • Manage RTS & TTS, configuration changes, and production rollouts with minimal impact.
  • Develop and maintain runbooks, standard operating procedures, and technical documentation.
  • Automate operational workflows, monitoring, and reporting using scripts and tools.
  • Collaborate with engineering teams to design for reliability, scalability, and operability.
  • Support incident response, disaster recovery, and business continuity processes.
  • Drive continuous improvement initiatives around system monitoring, alerting, and incident response.
  • Ensure compliance with IT controls, security policies, and audit requirements.

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience in operations engineering, site reliability engineering, or systems administration.
  • Strong knowledge of Linux/Unix and/or Windows server environments.
  • Experience with monitoring and alerting tools (e.g., Grafana, Datadog, Splunk, Nagios).
  • Proficiency in at least one scripting/programming language (e.g., Python, Bash, PowerShell).
  • Familiarity with CI/CD pipelines, deployment automation, and configuration management (e.g., Jenkins).
  • Understanding of networking fundamentals (DNS, TCP/IP, load balancing, firewalls).
  • Hands-on experience with cloud platforms (AWS, Azure, GCP).


Job Details

Role Level: Mid-Level
Work Type: Full-Time
Country: India
City: Hyderabad, Telangana
Company Website: http://www.pepsico.com
Job Function: Information Technology (IT)
Company Industry/Sector: Food and Beverage Services and Manufacturing
