Research Intern, Reinforcement Learning (RL)

Talentmate

India

2nd May 2026

2605-7825-17

Job Description

🚀 Build the next generation of Agentic AI with us
Our platform combines conversation intelligence, multimodal understanding, and agentic AI systems to power both human agents and autonomous AI agents across the entire customer experience lifecycle.

A core part of this vision is our investment in custom Small Language Models (SLMs)—purpose-built for CX workflows—paired with reinforcement learning systems that continuously improve decision-making in real-world environments.

We’re looking for a Research Intern (Reinforcement Learning) to join us in shaping this future.

What You’ll Do

Design and build reinforcement learning environments that model real-world customer interaction workflows
Design RL agents that learn from these environments using real-world interaction data, rewards, and feedback loops
Define reward models and feedback loops using real-world signals (outcomes and human feedback)
Enable learning from production data by structuring interaction traces into training-ready datasets for offline and online learning
Experiment with multi-agent systems and simulation frameworks for complex coordination and decision-making
Collaborate with engineering and product teams to deploy, evaluate, and iterate on learning systems in production at scale

What We’re Looking For

Currently pursuing (or recently completed) a degree in Computer Science, AI, Machine Learning, or a related field
Strong understanding of reinforcement learning fundamentals
Familiarity with RL environments and training libraries such as Verl and Tinker
Strong foundation in probability, maths, and optimization
Passion for building real-world AI systems

Nice to have

Experience with RLHF, LLM/SLM fine-tuning, or model alignment
Exposure to agent-based systems or multi-agent RL
Prior research, projects, or publications in RL or applied ML
Experience working with large-scale or production datasets

Why Level AI

Work on production-grade Agentic AI systems used by leading enterprises
Build alongside a team with deep expertise from Amazon, Google, and Meta
Be part of a fast-growing Series C AI company
Direct exposure to 0→1 AI innovation in CX and decisioning systems

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.

Job Details

Role Level: Not Applicable
Work Type: Full-Time
Country: India
City: Noida, Uttar Pradesh
Company Website: https://thelevel.ai/
Job Function: Data Science & AI
Company Industry/Sector: Transportation, Logistics, Supply Chain and Storage
