- US Citizen
- Green Card
- EAD (OPT/CPT/GC/H4)
- H1B Work Permit
- Corp-Corp
- W2-Permanent
- W2-Contract
- Contract to Hire
- UG: Not Required
- PG: Not Required
- No. of positions: 1
- Posted: 1st Jul 2024
Responsibilities
- Design, develop, and maintain PySpark jobs for data ingestion, transformation, and analysis
- Work with large and complex datasets to extract, clean, and transform data for various analytical purposes
- Optimize PySpark applications for performance and scalability
- Develop unit tests for Spark transformations and helper methods
- Work with data scientists and business analysts to understand data requirements and translate them into technical solutions
- Configure and manage PySpark environments on distributed computing clusters (e.g., YARN, Mesos)
- Experience with cloud platforms (AWS, GCP, Azure) a plus
- Experience with CI/CD pipelines for deploying Spark applications a plus
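Below is a minimal sketch (not part of the posting) of the kind of PySpark ingestion/transformation job and unit-testable helper the responsibilities describe; the column names, file paths, and the local[*] master are illustrative assumptions.

```python
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def clean_and_aggregate(df: DataFrame) -> DataFrame:
    """Drop malformed rows and compute total spend per customer.

    Kept as a standalone helper so it can be unit tested without a cluster.
    Column names are assumptions for illustration only.
    """
    return (
        df.dropna(subset=["customer_id", "amount"])
          .withColumn("amount", F.col("amount").cast("double"))
          .groupBy("customer_id")
          .agg(F.sum("amount").alias("total_spend"))
    )


if __name__ == "__main__":
    # Local master for demonstration; a real job would run on YARN or a cloud cluster.
    spark = SparkSession.builder.master("local[*]").appName("example-etl").getOrCreate()

    # Ingest: read a CSV with a header row (placeholder path).
    raw = spark.read.option("header", True).csv("/tmp/transactions.csv")

    # Transform and persist as Parquet for downstream analysis (placeholder path).
    clean_and_aggregate(raw).write.mode("overwrite").parquet("/tmp/total_spend")

    spark.stop()
```

Because the helper takes and returns a DataFrame, it can be exercised in a unit test with a local SparkSession and a small in-memory DataFrame, which lines up with the testing responsibility above.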
Qualifications
- 3+ years of experience in Big Data
- Strong proficiency in Python and PySpark
- Solid understanding of distributed computing concepts (MapReduce, Spark)
- Experience with Apache Spark ecosystem (SQL, Streaming, MLlib)
- Excellent problem-solving and analytical skills
- Strong communication and collaboration skills