Assemble and manage large, complex sets of data that meet functional and non-functional business requirements
Design and evolve data models and data schema based on business and engineering needs
Build and support ETL pipelines using tools like Hive, Airflow, Flyte, and SQL technologies
Implement systems to track data quality and consistency
Build analytical tools that provide insight into key performance metrics
Work with stakeholders across science, engineering, data infrastructure, product, and our leadership, driving the resolution of data-related technical problems
Skill Set:
3+ years of experience in software engineering, ideally with a focus on Data Engineering/Architecture
Ability to work with complex production-quality software
Technical expertise in data infrastructure, cloud computing, storage systems, and distributed computing frameworks at multi-petabyte scale
Knowledge of modern data/computing infrastructure and frameworks. Today we use S3, DynamoDB, Kafka, ElasticSearch, Spark, Flyte, Stackdriver, and SageMaker. We do not expect you to know them all, but we would like you to be familiar with some
Openness to new or different ideas, and the ability to evaluate multiple approaches and choose the best one based on fundamental qualities and supporting data
Ability to communicate highly technical problems while working alongside our cross-functional team
Ability to communicate in English in various forms, e.g. technical documents, meetings, and presentations