AryaXAI stands at the forefront of AI innovation, revolutionizing AI for mission-critical businesses by building explainable, safe, and aligned systems that scale responsibly. Our mission is to create AI tools that empower researchers, engineers, and organizations to unlock AI's full potential while maintaining transparency and safety.
Our team thrives on a shared passion for cutting-edge innovation, collaboration, and a relentless drive for excellence. At AryaXAI, everyone contributes hands-on to our mission in a flat organizational structure that values curiosity, initiative, and exceptional performance.
Requirements
Core Languages: CUDA, Python
Frameworks: CUTLASS, pybind11 (or similar), model serving frameworks
Tools: Nsight, JAX/XLA bindings
Focus Areas: GPU Kernel Optimization, Deep Learning Inference & Training
Role Overview
We are looking for a highly skilled AI Researcher - GPU Kernel Developer to join our team and push the boundaries of high-performance AI computation. In this role, you will design, develop, and optimize GPU kernels that power state-of-the-art AI models. Your work will directly influence the performance and scalability of our AI systems.
Key Responsibilities
Develop and refine low-level CUDA kernel optimizations for deep learning inference and training.
Profile, debug, and optimize single- and multi-GPU operations using tools like Nsight.
Deeply understand and exploit GPU memory hierarchies and computational capabilities.
Implement cutting-edge methods from research papers into CUDA kernels.
Collaborate on designing innovative solutions to achieve peak GPU performance.
Ideal Candidate Profile
Core Experiences
We are looking for candidates with a proven track record of excellence in GPU programming and AI system optimization. You should bring:
Expertise in designing high-performance GEMM CUDA kernels using Tensor Cores or CUDA cores, leveraging tools like CuTe or CUTLASS.
Proficiency in extending or writing custom attention and deep learning kernels from scratch.
Confidence in writing both forward and backward kernels while managing floating-point precision errors.
Strong optimization skills for both memory-bound and compute-bound operations.
Advanced knowledge of GPU architecture, including register pressure, shared-memory usage, and GPU utilization.
Preferred Skills
Familiarity with profiling tools (e.g., Nsight) to identify bottlenecks and improve performance.
Experience integrating custom kernels with frameworks like JAX/XLA through tools like pybind11.
Awareness of the latest advancements in GPU optimization techniques for AI workloads.
Why Join AryaXAI?
Mission-Driven Impact: Work on challenges that shape the future of responsible AI.
Technical Excellence: Collaborate with a team of passionate and experienced professionals.
Growth Opportunities: Contribute across domains and expand your expertise in GPU kernel development and AI research.
Flexible Work Environment: Choose between remote work or relocation support to one of our key offices.
Interview Process
Application Review: We review your CV and a statement of exceptional work.
Initial Interview (15 Minutes): A technical team member will evaluate your basic skills and fit for the role.
Main Process
Coding Assessment: Solve programming challenges in your preferred language.
Systems Problem-Solving: A live, hands-on session to showcase practical expertise.
Project Deep-Dive: Present your most notable project to our team.
Team Meet & Greet: Engage with the broader AryaXAI team.
Note: Our interviews are designed to conclude within one week to streamline your onboarding process.