We are looking for a hardworking, aspirational and innovative engineering leader for the Lead Data Scientist position in our AI engineering and innovation team. The Lead Data Scientist will play a diverse and far-reaching role across organizations, providing leadership and influencing the adoption of technical solutions, strategies and design patterns across multiple teams and partners within Kimberly-Clark.
This team is mainly responsible for prototyping, development, interpretation and proving out business value. We are looking for a highly motivated and qualified Data Scientist to help drive our AI Innovation initiatives. This role is ideal for candidates with strong hands-on skills in algorithm selection, feature identification and optimization, and model validation and efficacy.
Responsibilities:
- Understand the business problem, analyse the data, and define the success criteria
- Work with the engineering and architecture teams on data identification, collection, harmonization, and cleansing for data analysis and preparation
- Analyse and identify appropriate algorithms for the defined problem statement
- Analyse additional data inputs and methods that could improve model results, and look for further opportunities to do so
- Build models that are interpretable, explainable and sustainable at scale, and that meet the business needs.
- Build visualizations and demonstrate the results of the model to the stakeholders and leadership team
- Must be conversant with Agile methodologies and tools and have a track record of delivering products in a production environment.
- Lead the design of prototypes in our AI factory, partnering with product teams, AI strategists, and other stakeholders throughout the AI development life cycle.
- Lead and transform data science prototypes
- Mentor a diverse team of junior engineers in machine learning techniques, tools and concepts, providing guidance and leadership.
- Explore and recommend new tools and processes that can be leveraged across the data preparation pipeline to improve capabilities and efficiency.
- Ensure that our development and deployment are tightly integrated with each other to maximize the deployment user experience.
- Curate all code and binary artifact repositories (containers, compiled code).
- Work with AI strategists, DevOps, data engineers and domain SMEs to understand how data availability and quality affect model performance.
- Evaluate open source and proprietary technologies and present recommendations to automate machine learning workflows, model training and versioned experimentation, digital feedback and monitoring.
- Develop and disseminate innovative techniques, processes and tools that can be leveraged across the AI product development lifecycle.
Qualifications:
- Experience building ML models in a modern cloud-based architecture
- 12+ years of overall experience, including 5+ years of demonstrated experience in developing highly scalable, reliable, and resilient multi-tenanted ML algorithms for large-scale use cases in Sales and Marketing, Revenue Management, Supply Chain and other business areas.
- 5+ years of demonstrated experience in developing ML pipelines on various frameworks on AWS, Azure, or similar cloud platforms.
- Proficient and experienced in Python and SQL for data analysis and exploration
- Experience in cloud-based solutioning and in managing enterprise-grade, end-to-end machine learning solutions with automated pipelines for data processing, feature engineering, training, evaluation, deployment, integration and monitoring.
- Hands-on production experience with Docker, Kubernetes, cloud infrastructure such as Azure, AWS and GCP, and machine learning tools such as Azure Machine Learning, Amazon SageMaker, MLflow and Kubeflow.
- Experience in building end-to-end machine learning architectures.
- Strong knowledge of one of the following: machine learning design principles, MLOps best practices or Big Data architectures.
- Experience across the end-to-end AI life cycle, including data science, AI, machine learning, predictive modelling, Natural Language Processing (NLP), deep learning, advanced analytics and statistical modelling, using Python, SQL and Azure/AWS/GCP
- Experience with model monitoring, explainability, model management, version tracking and storage, and AI governance
- Experience with model deployment, Docker, ML pipelines and Azure Machine Learning
- Knowledge of SQL/NoSQL databases, microservices, REST APIs and Docker
- Strong knowledge of source code management, configuration management, CI/CD, security and performance.
- Ability to look ahead to identify opportunities and thrive in a culture of innovation
- Self-starter who can see the big picture and prioritize work to make the largest impact on the business and on the customer's vision and requirements
- Experience in building, testing, and deploying code to run on an Azure cloud data lake
- Ability to lead, nurture and mentor others in the team.
- A can-do attitude in anticipating and resolving problems to help the team achieve its goals.
- Must have experience in Agile development methods