Design and build data structures on Azure Data Lake, and data processing using PySpark and Hive, to provide efficient reporting and analytics capabilities.
Lead the design of databases and data pipelines using existing and emerging technologies to help improve decision making
Manage project timelines and provide timely updates on significant issues or developments
Bring structure to large quantities of data to make analysis possible and extract meaning from the data
Experience:
8-10 years of overall experience, with at least 4-6 years leading data engineering projects as a Technical Lead and Senior Developer using the PySpark framework
Implementation experience on at least 2-3 full end-to-end data engineering projects
Prior experience in the Healthcare, Pharma, or Life Sciences industry is a plus
Good understanding of life sciences data sources (IQVIA/IMS, sales data) is preferred
Experience with big data platforms and tools (e.g., PySpark)
Sound knowledge of Data Lake concepts, preferably ADLS
Up to date with the latest tools and techniques in data science, and proficient in SQL and Python
Good communication and presentation skills, and the ability to work in a matrix environment