- Set up all the configuration files: core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml. However, when working with a popular Hadoop distribution such as Hortonworks, Cloudera or MapR, the configuration files are set up on startup and the Hadoop admin need not configure them manually.
- Performing all phases of software engineering, including requirements analysis, application design, code development and testing.
- Responsible for capacity planning and estimating the requirements for lowering or increasing the capacity of the Hadoop cluster.
- Responsible for deciding the size of the Hadoop cluster based on the data to be stored in HDFS.
- Ensures that the Hadoop cluster is up and running all the time.
- Monitoring the cluster connectivity and performance.
- Manage and review Hadoop log files.
- Backup and recovery tasks
- Resource and security management
- Troubleshooting application errors and ensuring that they do not occur again.
- Deploys and maintains Hadoop clusters, adds or removes nodes using cluster monitoring tools (Cloudera Manager), configures NameNode high availability, and keeps track of all running Hadoop jobs.
- Implement, manage and administer the overall Hadoop infrastructure.
- Takes care of the day-to-day running of Hadoop clusters
Required knowledge and experience:
- 7+ years of overall experience in Information Technology and Systems (Sr. or Lead)
- Excellent knowledge of development in Hadoop, Spark and Cloudera
- 5+ years of practical experience on enterprise platforms
- Experience with multiple large-scale enterprise Hadoop environment builds and operations, including design, capacity planning, cluster setup, security, performance tuning and monitoring.
- Experience with the full Cloudera CDH distribution to install, configure and monitor all services in the CDH stack.
- Strong understanding of core Big Data Cloudera Hadoop services such as HDFS, MapReduce, Kafka, Spark and Spark Streaming, Hive, Impala, HBase, Kudu, Sqoop, and Oozie.
- Strong understanding of the Python, Java, and Scala languages.
- Expertise in typical system administration and programming skills such as storage capacity management, debugging, and performance tuning.
- Proficient in operating systems (Linux), servers and shell scripting (e.g. Bash, ksh).
Skills Needed:
- Excellent leadership skills.
- Must have initiative and be a self-starter.
- Problem-solving skills and the technical understanding to determine the causes of operating errors and make decisions accordingly.
- Excellent communication skills, with the assertiveness to convey information effectively.
- Effective time management.
Studies:
- Bachelor's degree in the field of Information Systems, Computer Science, Data Analytics, Physical Science, or related field.
- Hadoop certifications (desirable)