Job Description

  • Execute performance tuning activities for model serving infrastructure to maintain optimal latency and throughput.
  • Conduct post-deployment validation checks to ensure model prediction stability, API responsiveness, and overall service quality.
  • Support the enhancement of operational pipelines, including CI/CD workflows, configuration templates, and automated monitoring scripts.
  • Participate in service reliability reviews to improve platform uptime, incident response processes, and operational readiness.
  • Coordinate closely with DevOps and Platform Engineering to address infrastructure-level concerns related to model hosting and deployment.
  • Assist in the rollout of platform-level improvements, including model registry enhancements, container optimization, and new monitoring tools.

Minimum Qualifications

Key Requirements (must have):

  • Machine Learning Operations (MLOps)
  • Cloud background (AWS, GCP, Azure, Alibaba, etc.)
  • Understanding of SQL (data pipelines) and Data Engineering
  • Containerization (Docker and Kubernetes)

Other Qualifications

  • At least 2 years of hands-on experience in a production environment covering MLOps, Data Engineering, or Software Engineering.
  • Demonstrated ability to meet or exceed strict Service Level Agreements (SLAs), especially those related to system uptime, stability, incident response, and resolution.
  • Experience supporting cloud-hosted ML systems in distributed, high-availability environments.

Knowledge

  • Strong understanding of model deployment workflows, including model versioning, serving, rollout strategies, and post-deployment validation.
  • Knowledge of cloud platforms (e.g., AWS) and their native ML services used for hosting, monitoring, and managing model endpoints.
  • Familiarity with containerization (Docker) and orchestration (Kubernetes) for scalable ML serving infrastructure.
  • Understanding of performance monitoring concepts, including latency tracking, model health indicators, and drift signals.
  • Knowledge of CI/CD processes, configuration templates, and automated operational workflows specific to ML systems.

Skills

  • Proven expertise in MLOps, specifically managing model deployment, proactive monitoring, incident resolution, and performance tuning.
  • Ability to write and maintain automation scripts, validation utilities, and operational workflows to support ML pipelines.
  • Ability to collaborate effectively with DevOps, Data Science, and Platform Engineering teams to improve model reliability and system stability.
  • Skilled in applying structured software development methodologies (e.g., Agile/Scrum) to support platform enhancements and iterative delivery.
  • Strong analytical, troubleshooting, and root-cause diagnosis skills in production environments.


Job Details

Role Level: Mid-Level
Work Type: Full-Time
Country: Philippines
City: Taguig, National Capital Region
Company Website: http://www.yondu.com
Job Function: Data Science & AI
Company Industry/Sector: Information Technology and Services
