Must Have:
- Apache Spark and Scala (data processing technologies)
Responsibilities:
- Design, develop, and maintain scalable and efficient data processing pipelines using Apache Spark and Scala (a minimal sketch follows this list).
- Collaborate with cross-functional teams to understand data requirements and implement robust solutions.
- Optimize and tune Spark applications for performance and efficiency.
- Implement data quality and validation processes to ensure accuracy and reliability.
- Work closely with data scientists and analysts to support their data processing needs.
- Stay updated on emerging technologies in the Big Data ecosystem and recommend improvements to existing systems.
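As a flavor of the day-to-day work, here is a minimal sketch of such a pipeline in Scala, assuming a batch job over CSV input; the paths, column names, and validation rules are hypothetical, not part of the role description:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object OrdersPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-pipeline")
      .getOrCreate()

    // Read raw input (hypothetical path and layout).
    val raw = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3://example-bucket/raw/orders/")

    // Data quality gate: keep rows with a non-null id and a positive amount;
    // route everything else to a quarantine location for inspection.
    val valid   = raw.filter(F.col("order_id").isNotNull && F.col("amount") > 0)
    val invalid = raw.exceptAll(valid)
    invalid.write.mode("overwrite").parquet("s3://example-bucket/quarantine/orders/")

    // Transform: daily revenue per customer.
    val daily = valid
      .groupBy(F.col("customer_id"), F.to_date(F.col("order_ts")).as("order_date"))
      .agg(F.sum("amount").as("revenue"))

    daily.write.mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-bucket/curated/daily_revenue/")

    spark.stop()
  }
}
```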
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Proven experience in designing and implementing data processing solutions using Apache Spark and Scala.
- Strong programming skills in Scala and proficiency with Spark RDDs, DataFrames, and Spark SQL (illustrated in the sketch after this list).
- Experience with big data technologies such as Hadoop, Hive, and HDFS.
- Familiarity with cloud platforms like AWS, Azure, or GCP.
- Solid understanding of data modeling and ETL processes.
- Excellent problem-solving and communication skills.
- Ability to work in a fast-paced, collaborative environment.
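To make the Spark proficiency items concrete, the sketch below expresses the same word count with the RDD, DataFrame, and Spark SQL APIs; the input data and names are illustrative:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split}

object ApiComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("api-comparison")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val lines = Seq("spark and scala", "spark sql", "scala")

    // RDD API: low-level functional transformations.
    val rddCounts = spark.sparkContext.parallelize(lines)
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    // DataFrame API: declarative, optimized by the Catalyst planner.
    val df = lines.toDF("line")
    val dfCounts = df
      .select(explode(split(col("line"), "\\s+")).as("word"))
      .groupBy("word")
      .count()

    // Spark SQL: the same query in SQL over a temp view.
    df.createOrReplaceTempView("lines")
    val sqlCounts = spark.sql(
      """SELECT word, COUNT(*) AS count
        |FROM (SELECT explode(split(line, '\\s+')) AS word FROM lines)
        |GROUP BY word""".stripMargin)

    rddCounts.collect().foreach(println)
    dfCounts.show()
    sqlCounts.show()

    spark.stop()
  }
}
```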
Preferred Qualifications:
- Master’s degree in Computer Science or a related field.
- Certification in Apache Spark or related technologies.
- Experience with real-time data processing frameworks such as Apache Flink or Kafka (see the sketch after this list).
- Knowledge of machine learning frameworks such as TensorFlow or PyTorch.
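For the real-time item above, a minimal sketch of consuming a Kafka topic with Spark Structured Streaming, assuming the spark-sql-kafka-0-10 connector is on the classpath; the broker address and topic name are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, window}

object KafkaStreamDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-stream-demo")
      .getOrCreate()

    // Subscribe to a Kafka topic (hypothetical broker and topic).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(value AS STRING) AS value", "timestamp")

    // Count events per 1-minute window, tolerating 5 minutes of late data.
    val counts = events
      .withWatermark("timestamp", "5 minutes")
      .groupBy(window(col("timestamp"), "1 minute"))
      .count()

    val query = counts.writeStream
      .outputMode("update")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```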