Bangalore, Karnātaka, India
1 day ago
Sr. Data Engineer/Tech Lead

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

As a Senior Data Engineer, you will:

- demonstrate expert skills in ETL/ELT, data integration, ML Ops, and SQL, as well as intermediate to advanced skills in Python, PySpark, AI/ML, and data visualization
- demonstrate the ability to review, optimize, document, and mentor data/visualization engineers on data pipelines, mapping, cleansing, and visual design using various tools and platforms
- break down moderately complex problems and implement solutions that increase business impact
- support other team members, help them succeed, and actively share learnings with them
- drive and enforce team process improvements, ensuring others understand the benefits and tradeoffs
- actively promote new and innovative ideas across multiple teams and capabilities

Key Responsibilities

Hands-On Development (75%)

- Build and maintain scalable data platforms and infrastructure on AWS
- Implement end-to-end data pipelines for batch and real-time data processing
- Build robust ETL/ELT workflows to ingest, transform, and load data from diverse sources
- Implement data lake/Lakehouse architectures using AWS S3, Glue, Athena, and Lake Formation
- Design and optimize data warehouse solutions (Redshift, Snowflake) for analytics and reporting
- Establish data quality frameworks and automated monitoring systems
- Write production-quality Python code for data processing, transformation, and automation
- Build scalable data pipelines using Apache Airflow, AWS Step Functions, or similar orchestration tools
- Develop streaming data solutions using Kinesis, Kafka, or AWS MSK
- Optimize SQL queries and database performance for large-scale datasets
- Implement data validation, cleansing, and quality checks
- Build APIs and microservices for data access and integration
- Create monitoring, alerting, and observability solutions for data pipelines
- Debug and resolve data pipeline failures and performance bottlenecks

Technical Leadership & Collaboration (25%)

- Mentor junior and mid-level data engineers through code reviews and technical guidance
- Establish best practices for data engineering, testing, and deployment
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements
- Work with ML engineers to build data pipelines supporting machine learning workflows
- Partner with platform/infrastructure teams on cloud architecture and cost optimization
- Lead technical design discussions and architectural reviews
- Document data architectures, pipelines, and processes
- Evangelize data engineering best practices across the organization

Required Qualifications

Technical Expertise

- 10+ years of professional experience in data engineering or related roles
- Expert-level proficiency in Python for data engineering:
  - Data processing libraries: Pandas, PySpark, Dask, Polars
  - API development: FastAPI, Flask
  - Testing: Pytest, unittest
- Strong AWS expertise with hands-on experience in:
  - Data Storage: S3, RDS/Aurora, DynamoDB, Redshift
  - Data Processing: Glue (ETL jobs, crawlers, Data Catalog), EMR, Athena
  - Streaming: Kinesis (Data Streams, Firehose, Analytics), MSK (Managed Kafka)
  - Orchestration: Step Functions, EventBridge, Lambda
  - Analytics: QuickSight, Athena, Redshift Spectrum
  - Data Lake: Lake Formation, Glue Data Catalog
  - Infrastructure: CloudFormation, CDK, IAM, VPC, CloudWatch
- Workflow Orchestration:
  - Apache Airflow (strong preference)
- Big Data Technologies:
  - Apache Spark (PySpark) for distributed data processing
  - Experience with EMR, Databricks, or similar platforms
  - Understanding of distributed computing concepts
  - Parquet, Avro, ORC file formats

Architecture & Design

- Solid understanding and implementation knowledge of data modelling (dimensional modelling, star/snowflake schemas)
- Experience with both batch and streaming data processing patterns
- Knowledge of data lake, data warehouse, and lake-house architectures
- Understanding of data partitioning, bucketing, and optimization strategies
- Expertise in designing for data quality, lineage, and governance

DevOps & Best Practices

- Strong experience with CI/CD pipelines for data engineering (GitHub Actions, GitLab CI, Jenkins)
- Infrastructure as Code using Terraform, CloudFormation, or AWS CDK
- Containerization with Docker; experience with ECS/Fargate/Kubernetes is a plus
- Git version control and branching strategies
- Monitoring and observability tools: CloudWatch, Grafana
- Data pipeline testing strategies and frameworks

Preferred Qualifications

- Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or related field (or equivalent experience)
- Experience in regulated industries (healthcare/pharma, finance, government) with compliance requirements
- Hands-on experience with:
  - Additional AWS services: Glue DataBrew, AppFlow, Data Pipeline, Lambda, SageMaker
  - Streaming platforms: Apache Kafka, Confluent, AWS MSK
  - Data quality tools: Great Expectations, dbt, Monte Carlo, Bigeye
  - Data cataloging: AWS Glue Data Catalog, Alation, Collibra
  - Alternative clouds: GCP (BigQuery, Dataflow), Azure (Synapse, Data Factory)
  - Data orchestration: dbt for transformation workflows
- Experience with clinical data, life sciences, or statistical computing domains (CDISC standards, clinical trials data)
- Knowledge of data mesh or data fabric architectures
- Experience building data platforms for ML/AI workloads
- Familiarity with data governance and metadata management frameworks

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.

#WeAreLilly