We are looking for a Lead Data Engineer to join our Med Tech Content Engineering Team in India. This is an amazing opportunity to work in the Life Sciences and Healthcare domain using big data technologies. We would love to speak with you if you have skills in Python and Spark and have experience building big data platforms.
About You – experience, education, skills, and accomplishments
Bachelor’s Degree or equivalent in computer science, software engineering, or a related field
5+ years of relevant experience.
Strong hands-on experience with Python, PySpark, AWS, AWS Glue, EMR, and Delta Lake.
Good knowledge of ETL, including the ability to read and write efficient, robust code, follow or implement best practices and coding standards, design and implement common ETL strategies (e.g., change data capture (CDC), slowly changing dimensions (SCD)), and create reusable, maintainable jobs.
Solid background in database systems (such as Postgres, Oracle, Snowflake/Databricks) along with strong knowledge of PL/SQL and SQL.
Experience handling large volumes of data and building data pipelines.
Good knowledge of Agile and other SDLC methodologies.
Exposure to data warehouse / BI projects in the healthcare domain.
Strong oral and written communication skills.
It would be great if you also had . . .
Familiarity with Airflow, Snowflake, and Databricks.
Experience building big data platforms.
An understanding of healthcare data.
What will you be doing in this role?
As a member of the Data Engineering Team, you'll step into a key role on an expanding team to build our data platforms, data pipelines, and data transformation capabilities.
Define and implement our cloud data platform strategy, make a meaningful impact for our customers, and work in our high-energy, innovative, fast-paced Agile culture.
Drive rapid prototyping and development with Product and Technical teams in building and scaling high-value medical data capabilities.
Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using Apache tools (Airflow, Spark), SQL, Python, ETL, and AWS big data technologies.
Create and support batch and real-time data pipelines, with ongoing data monitoring and validation, built on AWS, Snowflake, and Apache technologies for medical data from many different sources.
Conduct functional and non-functional testing, writing test scenarios and test scripts.
Evaluate existing applications to update and add new features to meet business requirements.
Product you will be developing
You will be contributing to the HIDA platform, which provides healthcare distributors, manufacturers, and partners with critical insights into sales data, market trends, and product distribution. The product helps clients streamline operations, optimize decision-making, and improve overall business outcomes by providing data-driven analytics and reporting capabilities.
About the Team
The HIDA Technology team is a high-performing group of 8 engineers based in Bangalore, working closely with global stakeholders. The team specializes in UI development (Angular), backend services (Python), and advanced analytics powered by Snowflake. We collaborate with product managers, data engineers, and business analysts to deliver impactful solutions. Our tech stack includes Angular, Python (Flask/Django), Snowflake, PostgreSQL, Airflow, and cloud (AWS/Azure). We foster a collaborative, agile, and innovation-driven culture with opportunities to grow skills across the stack.
Hours of Work
Standard working hours aligned with India Standard Time (IST). This is a full-time, permanent position. Flexibility may be required to collaborate with global stakeholders in different time zones.
At Clarivate, we are committed to providing equal employment opportunities for all qualified persons with respect to hiring, compensation, promotion, training, and other terms, conditions, and privileges of employment. We comply with applicable laws and regulations governing non-discrimination in all locations.