4+ years of proven experience developing and deploying data pipelines, preferably in the cloud; Azure and Snowflake experience is a plus.
2+ years of proven expertise in building pipelines for real-time and near-real-time integration.
Proven experience with Spark SQL, Spark Streaming, and the core Spark APIs for building data pipelines.
2+ years of experience with at least one programming language such as Python, Java, or Scala.
Hands-on experience productionizing and deploying Big Data platforms and applications, including relational/SQL databases, distributed columnar and NoSQL data stores (MongoDB or Cassandra), graph databases, time-series databases, HDFS, HBase, MapReduce, NiFi, Spark Streaming, Kafka, Sqoop, Hive, Oozie, and Avro.
Knowledge of Databricks and Delta tables is a plus.
Extensive experience with data transformations for retail business use cases is a plus.
Knowledge of exception handling, automated reprocessing, and reconciliation.
Passion for data quality and the ability to build data quality capabilities into deliverables.
Prior use of Big Data components and the ability to assess and justify their fit for a business case.
Experience working with a variety of data sources: flat files, XML, JSON, Avro files, and databases.
Proficiency in techniques for slowly changing dimensions.
Knowledge of Jenkins for continuous integration and end-to-end automation of application builds and deployments.
Ability to integrate into a project team environment and contribute to project planning activities.
Experience developing implementation plans and schedules and preparing job documentation according to business requirements.
Ability to lead ambiguous, complex situations to clear, measurable plans.
Proven ability to work with people across the organization, manage cross-functional relationships, and communicate with leadership across multiple organizations.