Tagcor

Data Engineer

Apache Beam / Apache Flume / Kafka / Azure Databricks, Python, Integration, Data Engineer, Azure

8+ years of proven experience developing and deploying data pipelines, preferably in the cloud; Azure and Snowflake experience is a plus.
2+ years of proven expertise in creating pipelines for real-time and near-real-time integration.
Proven experience with Spark SQL, Spark Streaming, and the core Spark API for building data pipelines.
2+ years of experience with at least one of the following technologies: Apache Beam / Apache Flume / Kafka / Databricks.
Experience in Azure
Hands-on experience in productionizing and deploying Big Data platforms and applications. Hands-on experience working with relational/SQL databases, distributed columnar data stores/NoSQL databases (MongoDB or Cassandra), graph databases, time-series databases, HDFS, HBase, MapReduce, NiFi, Spark Streaming, Kafka, Sqoop, Hive, Oozie, Avro, and more.
Databricks and Delta table knowledge is a plus.
Extensive experience in data transformations for retail business use cases is a plus.
Knowledge of exception handling, automated reprocessing, and reconciliation.
Passion for data quality, with the ability to build data quality capabilities into deliverables.
Prior use of Big Data components and the ability to rationalize and align their fit for a business case.
Experience working with a variety of data sources: flat files, XML, JSON, Avro files, and databases.
Proficiency in techniques for slowly changing dimensions.
Knowledge of Jenkins for continuous integration and end-to-end automation of application builds and deployments.