- Strong SQL querying skills
- Strong Python coding experience
- Building automated data pipelines
- Performing data analysis and data exploration
- Working in an agile delivery environment
- Be involved in all stages of data science product development, from definition and design through development and delivery
- Deliver data pipelines to store data in a way that is accessible, secure, and sustainable
- Develop proofs of concept and evaluate design options to deliver ingestion, search, metadata, and scheduling pipelines
Skill set:
Python, Apache Hadoop, Spark, SQL
Required:
- Strong SQL and Python skills, preferably with 2+ years of experience in both
- Experience building automated data pipelines
- Experience performing data analysis and data exploration
- Experience working in an agile delivery environment
- Strong critical thinking, communication, and problem-solving skills
Preferred:
- Experience with big data frameworks (e.g. Hadoop and Spark)
- Experience with cloud-based platforms (e.g. Azure, GCP, AWS)
- Experience working in a multi-developer environment, using version control (e.g. Git)
- Experience orchestrating pipelines using tools (e.g. Airflow, Azure Data Factory)