- US Citizen
- Green Card
- EAD (OPT/CPT/GC/H4)
- H1B Work Permit
- Corp-Corp
- W2-Permanent
- W2-Contract
- Contract to Hire
- UG: Not Required
- PG: Not Required
- No. of positions: 1
- Posted: 1st Jul 2024
Responsibilities
- Design, develop, and maintain PySpark jobs for data ingestion, transformation, and analysis
- Work with large and complex datasets to extract, clean, and transform data for various analytical purposes
- Optimize PySpark applications for performance and scalability
- Develop unit tests for Spark transformations and helper methods
- Work with data scientists and business analysts to understand data requirements and translate them into technical solutions
- Configure and manage PySpark environments on distributed computing clusters (e.g., YARN, Mesos)
- Experience with cloud platforms (AWS, GCP, Azure) a plus
- Experience with CI/CD pipelines for deploying Spark applications a plus
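Below is a minimal sketch (not part of the posting) of the kind of PySpark ingestion/transformation job and unit-testable helper the responsibilities describe; the column names, file paths, and the local[*] master are illustrative assumptions.

```python
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def clean_and_aggregate(df: DataFrame) -> DataFrame:
    """Drop malformed rows and compute total spend per customer.

    Kept as a standalone helper so it can be unit tested without a cluster.
    Column names are assumptions for illustration only.
    """
    return (
        df.dropna(subset=["customer_id", "amount"])
          .withColumn("amount", F.col("amount").cast("double"))
          .groupBy("customer_id")
          .agg(F.sum("amount").alias("total_spend"))
    )


if __name__ == "__main__":
    # Local master for demonstration; a real job would run on YARN or a cloud cluster.
    spark = SparkSession.builder.master("local[*]").appName("example-etl").getOrCreate()

    # Ingest: read a CSV with a header row (placeholder path).
    raw = spark.read.option("header", True).csv("/tmp/transactions.csv")

    # Transform and persist as Parquet for downstream analysis (placeholder path).
    clean_and_aggregate(raw).write.mode("overwrite").parquet("/tmp/total_spend")

    spark.stop()
```

Because the helper takes and returns a DataFrame, it can be exercised in a unit test with a local SparkSession and a small in-memory DataFrame, which lines up with the testing responsibility above.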
Qualifications
- 3+ years of experience in Big Data
- Strong proficiency in Python and PySpark
- Solid understanding of distributed computing concepts (MapReduce, Spark)
- Experience with Apache Spark ecosystem (SQL, Streaming, MLlib)
- Excellent problem-solving and analytical skills
- Strong communication and collaboration skills