Tagcor

Required Skills

Hadoop Hbase Admin

Work Authorization

US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit

Preferred Employment

Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire

Employment Type

Consulting/Contract

education qualification

UG :- - Not Required
PG :- - Not Required

Other Information

No of position :- ( 1 )
Post :- 21st Oct 2025

JOB DETAIL

• Lead complex technology initiatives towards Hadoop platform stability, automation initiatives.

• Platform Incident and Problem management

• Be part of 24/7 platform support team to perform platform incident management and triaging team.

• Assess the impact of the platform issue and prioritize the triage by partnering with platform tenants, platform development and Hadoop vendor teams.

• Facilitate and perform root cause analysis of platform incidents towards determining permanent resolution involving Tenants and Hadoop vendor.

• Communicate with Stakeholders, leadership team on the triaging progress status.

• To troubleshoot tenant reported failures by deep diving onto client logs to identify the root cause.

• Use Autosys as scheduler to schedule, execute platform routine activities.

• Hadoop Cluster Monitoring

• Monitor Hadoop cluster health and performance.

• Automate new monitoring opportunities leveraging available monitoring and ing tools such as Prometheus, Thousand Eyes, Splunk etc.

• Perform troubleshooting on the actionable s and tuning to ensure optimal cluster performance.

• Hadoop Cluster Configuration and tuning

• Periodically tune and configure Hadoop ecosystem components such as HDFS, YARN, Hive, HBase, Spark, HPOS etc.

• Schedule the Change request and track the approval process for implementation.

• Platform Backup and Recovery:

• Ensure Hadoop platform is resilient with data backup and recovery strategies in place.

• Perform emergency or scheduled failover of platform to Disaster recovery site and fall back.

• Upgrades and Patch Management:

• Plan and execute upgrades of Hadoop ecosystem components and patches.

• Ensure compatibility and smooth integration of new features and enhancements.

• Work with Operating System administrators and patching automation build team to schedule and execute patching.

• Perform Cluster health validation post patching and communicate with stakeholders.

• Identify automation and process improvement opportunities related to patching to implement it.

• Documentation and Reporting:

• Maintain comprehensive documentation of Hadoop cluster configurations & troubleshooting processes, and procedures.

• Develop and Schedule to Generate reports on cluster usage, performance metrics, and capacity utilization.

• Collaboration and Support:

• Ability to work both independently and in collaboration with Platform development and Platform tenant teams to troubleshoot and fix Hadoop platform issues, maintain production stability to enterprise-wide applications.

• Collaborate and consult with key technical experts, senior technology team, and Hadoop vendor to resolve complex technical issues and achieve goals.

Mandatory Skills and Qualifications:

• Proven Experience in technology incident and problem management role.

• Proven Experience in using monitoring tools like Prometheus, Splunk, SiteScope & thousand eyes etc.,

• Proven experience as a Unix Scripting

• Proven experience in Hadoop administration or in a similar role.

• Strong knowledge of Hadoop ecosystem and its components (Hive, HBase, Spark, Hue ).

• Proven Experience is using Autosys or similar job scheduler.

• Excellent problem-solving and troubleshooting skills.

• Excellent communication and people skills for collaborating with cross-functional teams.