Overall 10+ years of experience, including 5+ years of Big Data administration/development experience
Installing, deploying, and maintaining Hadoop clusters; adding and removing nodes using cluster monitoring tools such as Ganglia, Nagios, or Cloudera Manager
Responsible for capacity planning and for estimating the requirements to scale the Hadoop cluster up or down
Implementing various security layers to protect systems and data: Kerberos, Active Directory (AD), encryption, RBAC, ACLs, etc.
Configuring high availability for the NameNode and other Hadoop components
Build backup solutions; perform backup and recovery tasks as required
Monitor, troubleshoot, and tune platform resources, connectivity, Hadoop jobs, and dependent infrastructure for high performance and availability
Design and implement Hive, HDFS, Kafka, and Spark solutions; ability to design and implement end-to-end solutions
Build libraries, user-defined functions, and frameworks around Hadoop
Exposure to cloud platforms such as AWS, Azure, or equivalent is desired
Develop user-defined functions to provide custom Hive, HDFS, Kafka, and Spark capabilities
Define and build data acquisition and consumption strategies
Strong understanding of Hadoop internals
Experience with open-source NoSQL technologies such as HBase and Cassandra is an advantage
Experience with messaging and complex event processing systems such as Kafka and Storm is an advantage
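The NameNode high-availability bullet above typically means an active/standby NameNode pair sharing edits through a JournalNode quorum. A minimal hdfs-site.xml sketch is shown below; the nameservice id `mycluster` and the `example.com` hostnames are placeholders, not values from any particular cluster.

```xml
<!-- hdfs-site.xml: two NameNodes (nn1, nn2) behind one logical nameservice -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host.example.com:8020</value>
</property>
<property>
  <!-- JournalNode quorum through which the active NameNode publishes edits -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
<property>
  <!-- Clients use this provider to locate whichever NameNode is active -->
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <!-- Automatic failover additionally requires ZKFC and a ZooKeeper quorum -->
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```

Automatic failover also needs `ha.zookeeper.quorum` in core-site.xml and a ZKFailoverController running alongside each NameNode.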
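The capacity-planning responsibility above can be illustrated with a back-of-the-envelope sizing sketch. This is a minimal example, not a prescribed method: the defaults (replication factor 3, 2:1 compression, 25% headroom for shuffle/temp space, 48 TB usable per DataNode) are assumptions and would be replaced with the cluster's real numbers.

```python
import math

def datanodes_needed(daily_ingest_tb, retention_days, replication=3,
                     compression_ratio=2.0, headroom=0.25, node_capacity_tb=48):
    """Rough DataNode count for a target retention window (all defaults are assumptions)."""
    # Logical data retained over the window, after compression
    logical_tb = daily_ingest_tb * retention_days / compression_ratio
    # Physical footprint once HDFS replication is applied
    physical_tb = logical_tb * replication
    # Reserve headroom for temp/shuffle space and near-term growth
    required_tb = physical_tb * (1 + headroom)
    # Round up to whole nodes
    return math.ceil(required_tb / node_capacity_tb)
```

For example, 2 TB/day retained for a year under these assumptions works out to 29 DataNodes; scaling down is the same calculation with a shorter retention window or lower ingest rate.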
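One lightweight way to deliver the custom Hive capabilities mentioned above is a streaming script plugged in via Hive's `TRANSFORM ... USING` clause, which pipes rows to the script as tab-separated lines on stdin. The sketch below masks an e-mail column; the script name `mask_email.py` and the `users` table are hypothetical.

```python
import sys

def mask_email(email):
    """Mask the local part of an e-mail address, keeping the domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        return email  # no '@': pass the value through unchanged
    return local[0] + "***" + sep + domain

def run(stream_in=sys.stdin, stream_out=sys.stdout):
    # Hive's TRANSFORM streams each row as one tab-separated line on stdin
    for line in stream_in:
        user_id, email = line.rstrip("\n").split("\t")
        stream_out.write(user_id + "\t" + mask_email(email) + "\n")
```

Deployed as a script (with a `run()` call under an `if __name__ == "__main__":` guard), it would be invoked from Hive roughly as: `ADD FILE mask_email.py; SELECT TRANSFORM(user_id, email) USING 'python mask_email.py' AS (user_id, masked_email) FROM users;`. A compiled Java UDF registered with `CREATE FUNCTION` is the heavier-weight alternative for the same job.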