Job Description

We are seeking an experienced Solution Architect specializing in GPU/TPU Kernel Optimization to design and optimize high-performance kernels for cutting-edge Machine Learning operations. In this role, you will redefine performance boundaries across massive training runs and high-speed inference workloads while shaping the developer infrastructure that powers next-generation AI systems.

 

Responsibilities

  • Design and optimize high-performance kernels (using languages like Pallas, Mosaic and Triton) targeting Tensor Processing Unit (TPU) and Graphics Processing Unit (GPU) architectures for critical Machine Learning (ML) operations, redefining what's possible from massive training runs to high-speed inference
  • Architect infrastructure such as benchmarking suites, autotuning frameworks, performance analysis tools, regression testing and documentation
  • Transform how the developer community interacts with increasingly critical custom kernels in key Open-Source Software (OSS) libraries
  • Track the latest advancements in hardware architectures, compiler technologies and AI models to identify new opportunities for performance optimization through custom kernels
  • Engage with ML researchers, framework developers (JAX, PyTorch) and compiler engineers (XLA, Accelerated Linear Algebra) to drive adoption of custom kernels
  • Identify new requirements and address bottlenecks by providing appropriate solutions

Requirements

  • 12-18 years of experience in software development
  • Expertise in optimizing TPU/GPU code using low-level kernel languages like Pallas, Compute Unified Device Architecture (CUDA) or Triton
  • Knowledge of ML frameworks (JAX/PyTorch) and common operations such as attention and Mixture of Experts (MoE), as well as model optimization and low-precision formats
  • Understanding of modern accelerators (e.g., data movement, pipelining, heterogeneous compute and scale-out)
  • Understanding of compiler principles (optimization, code generation) and toolchains such as MLIR and OpenXLA
  • Demonstrated experience building developer infrastructure, including Open-Source Software (OSS) libraries, flexible high-performance APIs and easy-to-consume documentation to empower the community
  • Excellent investigative and problem-solving capabilities, with strong communication skills for working across cross-functional teams


Job Details

Role Level: Associate
Work Type: Full-Time
Country: India
City: Gurgaon, Haryana
Company Website: http://www.epam.com
Job Function: Software Development
Company Industry/Sector: Software Development and IT Services and IT Consulting

