Required Skills

Hadoop

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 6th Feb 2025

Job Details

-Expertise and knowledge: Cloudera Data Platform, Oozie, Hive, Spark, Spark Streaming and Presto

Data Pipeline Development:

-Design, develop, and implement scalable data pipelines using Cloudera tools like Hadoop, Spark, Hive, Impala, and HDFS.

-Write and optimize ETL processes to extract, transform, and load data into data lakes or warehouses.

Big Data Application Development:

-Develop applications to process large datasets efficiently using frameworks such as Apache Spark and MapReduce.

-Build solutions for batch and real-time data processing.

Cluster Management:

-Work with Cloudera Manager for cluster setup, configuration, monitoring, and performance optimization.

-Ensure high availability and scalability of Cloudera clusters.

-System dimensioning (compute resources, storage, and networks).

-System reconfiguration in case of hardware extension and/or replacement.

-OS and Cloudera software upgrades.

-Cloudera software vulnerability and patch management.

-Access and permission management.

-Installation of any other Cloudera application if needed.

Data Storage and Management:

-Design and implement data storage strategies using HDFS, HBase, and other Cloudera-supported tools.

-Optimize data storage and retrieval processes to improve performance.

Performance Tuning:

-Monitor and optimize the performance of Hadoop and Spark jobs.

-Troubleshoot and resolve performance bottlenecks in data pipelines.

-Assist in designing scalable architectures for high-volume data.

-Ensure end-to-end pipeline stability for existing and future use cases.

-Tune the performance of Spark workflows.

Integration and Collaboration:

-Integrate Cloudera solutions with external systems, databases, and APIs.

-Collaborate with data scientists, analysts, and other teams to understand requirements and deliver data solutions.

Company Information