Data Ingestion of multiple SORs (Systems of Record) related to the Master and Reference Data domain into the Hadoop Data Lake – Raw and Sanitized
Understanding of open-source ETL frameworks like Apache Beam or Java/Python-based ETL tools is a plus.
Automate open-source solutions for processing within Hive
Stream or near-real-time processing by consuming data from Kafka topics or Solace queues
Data ingestion and conformance to an internal, hybrid or public cloud
Work collaboratively with Data Governance and Metadata partners and incorporate the Data Lineage and Metadata requirements using Data Ingestion and Conformance
Provide process analysis and engineering on an as-needed basis for the process to be automated
Develop and deploy code in DEV, SIT, UAT and PROD environments
Provide fixes to issues/bugs reported from SIT, UAT and PROD environments