Big Data, Cloudera, Hive, Sqoop, Spark, Impala
- Experience working on Big Data platforms such as Cloudera or Hortonworks
- Hands-on experience with tools such as Sqoop, Impala, Hive, and Spark
- Experience in building data warehouse and data lake applications
- Knowledge of DevOps and Agile methodologies
- Knowledge of data governance (data quality, data catalog) is desirable
- Good communication skills
- Analyze requirements and build source-to-target data pipelines
- Design and build data pipelines for ingesting data using Sqoop
- Design tables using Hive
- Design views using Hive and Impala
- Build transformation logic using Spark and Hive
- Assist in building the data catalog