Develops and maintains scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.
Collaborates with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision-making across the organization.
Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for the key stakeholders and business processes that depend on it. Writes unit and integration tests, contributes to the engineering wiki, and documents work.
AWS Cloud experience
Performs the data analysis required to troubleshoot and help resolve data-related issues.
Works closely with a team of frontend and backend engineers, product managers, and analysts.
Defines company data assets (data models) and the Spark, Spark SQL, and Hive SQL jobs that populate them.
Designs data integrations and a data quality framework.
Evaluates open-source and vendor tools for data lineage.
Works closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.
7+ years of Python or Java development experience
7+ years of SQL experience (NoSQL experience is a plus)
7+ years of experience with schema design and dimensional data modeling
Ability to manage and communicate data warehouse plans to internal clients
Experience designing, building, and maintaining data processing systems
Experience working with either a MapReduce or an MPP system at any size or scale