The Data Scientist position is within the US Informatics Architecture and Emerging Technologies team, supporting the Patient Engagement Network. The person in this role will work closely with partners across the US Commercial, Medical and Government Organization and within IT. They will support exciting projects and proofs of concept in the US Commercial, Medical and Government Affairs portfolio and will be involved in data analysis, visualization, advanced analytics, machine learning and artificial intelligence, as needed. The ideal candidate will have a passion for innovation and data, have strong analytical skills and enjoy solving complex problems.
Responsibilities
- Identify solution needs & recommend data solutions: Ask the right questions, understand the solution needs for commercial and medical/government affairs, and ideate and recommend fit-for-purpose data and advanced analytics solutions (e.g., insights based on sentiment from employees, customers or patients; predictive analytics based on evolving market patterns; pre-trained NLP models like ULMFiT, ELMo and BERT for text mining). Develop processes and tools to monitor and analyze model performance and data accuracy.
- Dive into data: Develop a comprehensive and deep understanding of the data we work with, and foster learning with colleagues using analytical tools and applications to broaden data accessibility and advance our proficiency and efficiency in understanding and using the data appropriately (e.g., identifying patterns in adverse events reported for certain drugs, or predicting study enrollment and study success using advanced technologies like machine learning and text mining).
- Be an expert in applying methods: Stay current with and adopt emergent analytical methodologies, tools and applications to ensure fit-for-purpose and impactful approaches (e.g., build and deploy ML models to identify potential adverse events and/or medically relevant information; create statistical models to predict study enrollment and proactively identify sites that may not meet enrollment guidelines).
- Collaborate & shape: Collaborate with and contribute to functional, cross-functional, enterprise-wide or external data science communities, networks, initiatives or goals on knowledge-sharing, methodologies, innovations, technology, processes, etc. to enable broader and more effective use of data and analytics to support the business.
- Solve problems through innovation: Perform sentiment analysis based on feedback data from sources like text, video & voice using deep learning models. Recognize, identify, and raise awareness of data anomalies. Uncover patterns in the data from which predictive models, impactful insights or solutions can be developed.
Qualifications
- Master's degree or higher, or currently pursuing a graduate degree, in Data Science, Computer Science or equivalent
- Strong statistical and mathematical knowledge
- Working knowledge of Python, Cloudera and Tableau. Experience with conventional machine learning techniques and coding packages.
- SQL proficiency and experience working with relational databases. Experience with Hadoop, Snowflake or Teradata is a plus.
- Technical competency in analytical tools such as Python/Scala/R, as well as related statistical and machine learning packages.
- Experience with deep learning development is a plus (e.g. IBM Watson, TensorFlow, Theano, etc.)
- Possess a continuous improvement process (CIP) mentality, always looking for ways to improve predictive models.
- Experience in establishing an ML data pipeline and in deploying and maintaining models in the cloud (AWS and Azure are a plus) and in on-premises data centers.
- Experience with data science best practices. Familiarity with version control via Git or similar. GitLab is a plus.
- Experience with CI/CD processes. Knowledge of Terraform is a plus.
- Good presentation skills.