Build and deliver high-impact, high-complexity projects in collaboration with cross-functional teams, both business and technology
Develop automated methods for ingesting large datasets into an enterprise-scale analytical system using Sqoop, Spark and Kafka
Build processes supporting optimal data transformation, data structures, metadata and data flow pipelines to consume data from source systems
Assemble large, complex data sets that meet business requirements
Identify technical implementation options and issues
Support and maintain production data workflows, perform root cause analysis and debug issues
Mentor consumers of the data as well as analytics teams
Provide technical coaching to other engineers and partners across the organization
Partner and communicate cross-functionally across the enterprise
Provide expertise in the specification, development, implementation and support of analytical solutions (e.g., Datameer, Hive, Python, Redshift, Tableau)
Explain technical solutions and issues in non-technical, understandable terms
Interact with business teams to gather, interpret and understand their business needs and create design specifications
Foster the continuous evolution of best practices within the development team to ensure data quality, standardization and consistency
Continuously seek Data Lake platform and data ingestion improvement opportunities that ensure optimal use of the environment and improve stability
Required Skills:
5+ years of professional experience
4+ years of hands-on experience with Big Data, Analytics and AWS Cloud Computing technologies
4+ years of hands-on experience with ETL and ELT data modeling
Experience with Big Data tools (e.g., Hadoop, Sqoop, Spark, Kafka, and NoSQL databases such as HBase and Cassandra), as well as API development and consumption
Proficient in query languages such as SQL and HiveQL
Experience working with traditional data warehouses and mapping them into a Hive warehouse on Big Data technologies
Experience with setting data modeling standards in Hive
Experience with streaming stacks (e.g., NiFi, PySpark)
Experience with data preparation and manipulation tools (e.g., Datameer)
Knowledge of SOA, IaaS, and Cloud Computing technologies, particularly in the AWS environment
Knowledge of setting standards around data dictionary and tagging data assets within Data Lake for business consumption
Proficient in one or more programming languages (e.g., Python, Java, Groovy) and frameworks such as Spark
Experience with data visualization tools (e.g., Tableau)
Experience with Jenkins, Git, and Airflow
Experience with continuous integration, testing, and deployment
Experience with agile software development methodologies (e.g., Scrum, Kanban)
Ability to work within a dynamic programmatic environment with evolving requirements and capability goals
Strong customer service and communication skills
Ability to collaborate and share knowledge with a diverse team of individuals across technology and business
Self-motivated and capable of working with little or no supervision