- 5+ years of professional experience
- 4+ years of hands-on experience with Big Data, Analytics, and AWS Cloud Computing technologies
- 4+ years of hands-on experience with ETL and ELT data modeling
- Experience with Big Data tools (e.g., Hadoop, Sqoop, Spark, Kafka, and NoSQL databases such as HBase and Cassandra) and with API development and consumption
- Proficient in query languages such as SQL and HiveQL
- Experience working with traditional data warehouses and correlating their models into a Hive warehouse on Big Data technologies
- Experience setting data modeling standards in Hive
- Experience with streaming stacks (e.g., NiFi, PySpark)
- Experience with data preparation and manipulation tools (e.g., Datameer)
- Knowledge of SOA, IaaS, and Cloud Computing technologies, particularly in the AWS environment
- Knowledge of setting standards for the data dictionary and for tagging data assets within the Data Lake for business consumption
- Proficient in one or more programming languages and frameworks (e.g., Python, Spark, Java, Groovy)
- Experience with data visualization tools (e.g., Tableau)
- Experience with Jenkins, Git, and Airflow
- Experience with continuous software integration, testing, and deployment
- Experience with agile software development paradigms (e.g., Scrum, Kanban)
- Ability to work within a dynamic, programmatic environment with evolving requirements and capability goals
- Strong customer service and communication skills
- Ability to collaborate and share knowledge with a diverse team of individuals across Technology and Business
- Self-motivated and capable of working with little or no supervision
Primary Responsibilities:
- Build and deliver high-impact, high-complexity projects in collaboration with cross-functional teams, both business and technology
- Develop automated methods for ingesting large datasets into an enterprise-scale analytical system using Sqoop, Spark, and Kafka
- Build processes supporting optimal data transformation, data structures, metadata management, and data flow pipelines that consume data from source systems
- Assemble large, complex data sets that meet business requirements
- Identify technical implementation options and issues
- Support and maintain production data workflows; perform root cause analysis and debug issues
- Mentor data consumers and analytics teams
- Provide technical coaching to other engineers and partners across the organization
- Partner and communicate cross-functionally across the enterprise
- Provide expertise for the specification, development, implementation, and support of analytical solutions (e.g., Datameer, Hive, Python, Redshift, Tableau)
- Explain technical solutions and issues in non-technical, understandable terms
- Interact with business teams to gather, interpret, and understand their business needs, and create design specifications
- Foster the continuous evolution of best practices within the development team to ensure data quality, standardization and consistency
- Continuously seek opportunities to improve the Data Lake platform and data ingestion, ensuring optimal use of the environment and improved stability