- Hands-on experience with Spark-based ETL design, implementation, and maintenance.
- 5+ years of experience implementing Apache Spark applications in Java.
- Experience ingesting and managing data in AWS data stores such as S3 and RDS, and with AWS ETL tooling such as Glue.
- Experience with batch ingestion using big data tools such as Spark with Scala.
- Interface with other technology teams to extract, transform, and load data from relational data sources such as PostgreSQL, using SQL and AWS big data technologies.
- Experience deploying applications using Docker and/or Kubernetes.
- Monitor and control all phases of the development process (analysis, design, construction, testing, and implementation) and provide user and operational support on applications to business users.
- Collaborate with other teams to recognize and help adopt best practices in reporting and analysis: data integrity/quality, pipeline analysis, validation, and documentation.
- Experience with web application and API development, and with JSON/XML data formats.
- Strong programming skills in Java and good knowledge of Docker and Kubernetes.
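To give a concrete sense of the batch ETL work described above, here is a minimal extract-transform-load sketch in plain Java. It is illustrative only: the record format, class name, and field layout are assumptions, and a production pipeline here would use Spark against S3 or PostgreSQL rather than in-memory lists.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MiniEtl {
    // Extract: raw CSV-like records (in production: rows read from S3 or RDS via JDBC).
    static List<String> extract() {
        return List.of(
            "2024-01-01,orders,3",
            "2024-01-01,orders,2",
            "2024-01-02,refunds,1");
    }

    // Transform: parse each record, drop malformed rows, and aggregate counts per event type.
    static Map<String, Integer> transform(List<String> rows) {
        return rows.stream()
            .map(r -> r.split(","))
            .filter(f -> f.length == 3)
            .collect(Collectors.groupingBy(
                f -> f[1],
                Collectors.summingInt(f -> Integer.parseInt(f[2]))));
    }

    public static void main(String[] args) {
        // Load: print the aggregates; a real job would write back to a data store.
        Map<String, Integer> totals = transform(extract());
        System.out.println(totals);
    }
}
```

The same extract/transform/load split maps directly onto a Spark job, where `extract` becomes a `DataFrameReader` call and `transform` a chain of Dataset operations.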