Big Data, Spark, Hadoop, AWS EMR, Athena, Data Pipeline, PySpark, Java with OO programming skills
- Experience working and communicating with business stakeholders and architects
- Industry experience developing big data/ETL data warehouse solutions and building cloud-native data pipelines
- Experience in Python, PySpark, Scala, Java, and SQL; strong object-oriented and functional programming experience in Python
- Experience working with REST and SOAP-based APIs to extract data for data pipelines
- Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, and Sqoop
- Experience working in a public cloud environment; AWS experience is mandatory
- Ability to implement solutions with AWS VPC, EC2, AWS Data Pipeline, AWS CloudFormation, Auto Scaling, Amazon S3, EMR, Hive, Athena, and other AWS products
- Experience working with real-time data streams and the Kafka platform
- Working knowledge of workflow orchestration tools such as Apache Airflow, including designing and deploying DAGs
- Hands-on experience with performance and scalability tuning
- Professional experience in Agile/Scrum application development using JIRA