Sr Data Engineer Tech Lead

Talentmate

India

8th December 2025

2512-3054-162

Job Description

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

As a Senior Data Engineer, You Will

demonstrate expert skills in ETL/ELT, data integration, MLOps, and SQL, as well as intermediate to advanced skills in Python, PySpark, AI/ML, and data visualization.
demonstrate the ability to review, optimize, and document data pipelines, mapping, cleansing, and visual design across various tools and platforms, and to mentor data/visualization engineers in these areas.
break down moderately complex problems into implementable work that increases business impact
support other team members, help them succeed, and actively share learnings with them
drive and enforce team process improvements, ensuring others understand the benefits and tradeoffs
actively promote new and innovative ideas across multiple teams and capabilities

Key Responsibilities

Hands-On Development (75%)

Build and maintain scalable data platforms and infrastructure on AWS
Implement end-to-end data pipelines for batch and real-time data processing
Build robust ETL/ELT workflows to ingest, transform, and load data from diverse sources
Implement data lake/lakehouse architectures using AWS S3, Glue, Athena, and Lake Formation
Design and optimize data warehouse solutions (Redshift, Snowflake) for analytics and reporting
Establish data quality frameworks and automated monitoring systems
Write production-quality Python code for data processing, transformation, and automation
Build scalable data pipelines using Apache Airflow, AWS Step Functions, or similar orchestration tools
Develop streaming data solutions using Kinesis, Kafka, or AWS MSK
Optimize SQL queries and database performance for large-scale datasets
Implement data validation, cleansing, and quality checks
Build APIs and microservices for data access and integration
Create monitoring, alerting, and observability solutions for data pipelines
Debug and resolve data pipeline failures and performance bottlenecks

Technical Leadership & Collaboration (25%)

Mentor junior and mid-level data engineers through code reviews and technical guidance
Establish best practices for data engineering, testing, and deployment
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements
Work with ML engineers to build data pipelines supporting machine learning workflows
Partner with platform/infrastructure teams on cloud architecture and cost optimization
Lead technical design discussions and architectural reviews
Document data architectures, pipelines, and processes
Evangelize data engineering best practices across the organization

Required Qualifications

Technical Expertise

10+ years of professional experience in data engineering or related roles
Expert-level proficiency in Python for data engineering:

Data processing libraries: Pandas, PySpark, Dask, Polars
API development: FastAPI, Flask
Testing: Pytest, unittest

Strong AWS expertise with hands-on experience in:

Data Storage: S3, RDS/Aurora, DynamoDB, Redshift
Data Processing: Glue (ETL jobs, crawlers, Data Catalog), EMR, Athena
Streaming: Kinesis (Data Streams, Firehose, Analytics), MSK (Managed Kafka)
Orchestration: Step Functions, EventBridge, Lambda
Analytics: QuickSight, Athena, Redshift Spectrum
Data Lake: Lake Formation, Glue Data Catalog
Infrastructure: CloudFormation, CDK, IAM, VPC, CloudWatch

Workflow Orchestration:

Apache Airflow (strong preference)

Big Data Technologies:

Apache Spark (PySpark) for distributed data processing
Experience with EMR, Databricks, or similar platforms
Understanding of distributed computing concepts
Parquet, Avro, ORC file formats

Architecture & Design

Solid understanding and implementation knowledge of data modelling (dimensional modelling, star/snowflake schemas)
Experience with both batch and streaming data processing patterns
Knowledge of data lake, data warehouse, and lakehouse architectures
Understanding of data partitioning, bucketing, and optimization strategies
Expertise in designing for data quality, lineage, and governance

DevOps & Best Practices

Strong experience with CI/CD pipelines for data engineering (GitHub Actions, GitLab CI, Jenkins)
Infrastructure as Code using Terraform, CloudFormation, or AWS CDK
Containerization with Docker; experience with ECS/Fargate/Kubernetes is a plus
Git version control and branching strategies
Monitoring and observability tools: CloudWatch, Grafana
Data pipeline testing strategies and frameworks

Preferred Qualifications

Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or related field (or equivalent experience)
Experience in regulated industries (healthcare/pharma, finance, government) with compliance requirements
Hands-on experience with:

Additional AWS services: Glue DataBrew, AppFlow, Data Pipeline, Lambda, SageMaker
Streaming platforms: Apache Kafka, Confluent, AWS MSK
Data quality tools: Great Expectations, dbt, Monte Carlo, Bigeye
Data cataloging: AWS Glue Data Catalog, Alation, Collibra
Alternative clouds: GCP (BigQuery, Dataflow), Azure (Synapse, Data Factory)
Data orchestration: dbt for transformation workflows

Experience with clinical data, life sciences, or statistical computing domains (CDISC standards, clinical trials data)
Knowledge of data mesh or data fabric architectures
Experience building data platforms for ML/AI workloads
Familiarity with data governance and metadata management frameworks

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.

#WeAreLilly

Job Details

Role Level: Mid-Level
Work Type: Full-Time
Country: India
City: Bengaluru, Karnataka
Company Website: http://www.lilly.com/
Job Function: Information Technology (IT)
Company Industry/Sector: Pharmaceutical Manufacturing

About the Company

Searching, interviewing, and hiring are all part of professional life. The TALENTMATE portal aims to help professionals with each of these by bringing everything they need together under one roof. Whether you're hunting for your next job opportunity or looking for potential employers, we're here to lend you a helping hand.

Disclaimer: talentmate.com is only a platform to bring jobseekers and employers together. Applicants are advised to research the bona fides of the prospective employer independently. We do NOT endorse any requests for payment and strictly advise against sharing personal or bank-related information. We also recommend you visit Security Advice for more information. If you suspect any fraud or malpractice, email us at abuse@talentmate.com.