Primary responsibilities include:
- Build big data pipelines in collaboration with product owners, data architects, data analysts, and testers in an agile environment
- Analyze data to identify gaps in data ingestion requirements and provide robust, optimal ingestion solutions that meet those requirements
- Build comprehensive data pipeline solutions that meet the requirements and adhere to quality and development standards
- Articulate data pipeline solutions to management and team members, documenting them as required
- Collaborate with cross-functional team members as necessary and deliver quality code on time
- Execute unit tests on data, validate expected results, and ensure quality and accuracy
- Coordinate with business users and BI developers for User Acceptance Testing
- Work with the Operations team to deploy code to production
- Follow change management procedures and strictly adhere to compliance and regulatory requirements
- Work with the team to ensure completion within stipulated timelines
- Help and guide onsite and offshore teams as and when required
Required Skills/Experience:
- Hands-on experience building batch and streaming data solutions on Cloudera big data platforms in an agile environment
- Experience with Hadoop ecosystem (Cloudera Hadoop, Hive, Sqoop, Hue, HQL, Oozie, PySpark)
- Familiarity with RDBMS such as Teradata, Oracle, MS-SQL
- Shell scripting & Python
- Team player who collaborates well in a cross-functional team environment
- Self-starter with the ability to work independently with minimal guidance
- Performance tuning and optimization of code
- Experience working with large teams and in client-facing roles
Preferred Skills / Experience:
- Experience in the financial services industry
- Spark Streaming, APIs
- Cloud environment experience
- Familiarity with Business Intelligence/Reporting tools such as Tableau
- Experience handling structured, semi-structured, and unstructured data in a Hadoop environment
- Familiarity with scheduling tools such as CA7
- Version control using Git/Bitbucket
- Familiarity with building CI/CD pipelines
- Familiarity with data tokenization methods