Big Data, Spark, Hadoop, AWS EMR, Athena, Data Pipeline, PySpark, Java with OO programming skills
- Experience working and communicating with business stakeholders and architects
- Industry experience developing big data/ETL data warehouse solutions and building cloud-native data pipelines
- Experience in Python, PySpark, Scala, Java, and SQL; strong object-oriented and functional programming experience in Python
- Experience working with REST and SOAP-based APIs to extract data for data pipelines
- Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, and Sqoop
- Experience working in a public cloud environment; AWS experience is mandatory
- Ability to implement solutions with AWS VPC, EC2, AWS Data Pipeline, AWS CloudFormation, Auto Scaling, Amazon S3, EMR, Hive, Athena, and other AWS products
- Experience working with real-time data streams and the Kafka platform
- Working knowledge of workflow orchestration tools such as Apache Airflow, including designing and deploying DAGs
- Hands-on experience with performance and scalability tuning
- Professional experience in Agile/Scrum application development using JIRA