Build data products and processes alongside the core engineering and technology team
Collaborate with senior data scientists to curate, wrangle, and prepare data for use in their advanced analytical models
Integrate data from a variety of sources, ensuring that it adheres to data quality and accessibility standards
Modify and improve data engineering processes to handle increasingly large, complex, and varied data sources and pipelines
Use Hadoop architecture and HDFS commands to design and optimize data queries at scale
Evaluate and experiment with Client data engineering tools, and advise information technology leads and partners about new capabilities to determine optimal solutions for particular technical problems or designated use cases
Big data engineering skills:
5+ years of hands-on experience in modern object-oriented programming languages (Java, Scala, Python), including the ability to code in more than one language
5+ years of hands-on experience applying principles, best practices, and trade-offs of schema design to different database systems, including relational (Oracle, MSSQL, Postgres, MySQL) and NoSQL (HBase, Cassandra, MongoDB)
2+ years of hands-on experience implementing batch and real-time data integration frameworks and/or applications in private or public cloud environments (AWS, Azure, GCP, etc.) using various technologies (Hadoop, Spark, Impala, etc.), including assessing performance, debugging, and fine-tuning those systems
Deep understanding of the latest data science and data engineering methods and processes to develop impactful and reusable patterns and abstractions from enterprise-level data assets
3+ years of hands-on experience in all phases of data modelling from conceptualization to database optimization
Demonstrated ability to perform the engineering necessary to acquire, ingest, cleanse, integrate, and structure massive volumes of data from multiple sources and systems into enterprise analytics platforms
Proven ability to design and optimize queries to build scalable, modular, efficient data pipelines
Ability to work across structured, semi-structured, and unstructured data, extracting information and identifying linkages across disparate data sets
Proven experience delivering production-ready data engineering solutions, including requirements definition, architecture selection, prototype development, debugging, unit-testing, deployment, support, and maintenance
Ability to work with a variety of data engineering tools and technologies; vendor-agnostic candidates preferred