US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit
Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire
Consulting/Contract
UG: Not required
PG: Not required
No. of positions: 1
Posted: 8 Feb 2024
Work closely with Engineering stakeholders to build and upgrade data pipelines for MICP (Multi-Item-Checkout). The candidate will also deprecate legacy data pipelines and replace them with new ones. This includes identifying the downstream pipelines that consume legacy data, modifying fields in the certified data where needed, and validating data consistency after the migration.
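The post-migration consistency check described above could be sketched along these lines. This is a minimal pure-Scala illustration, not the team's actual tooling: the keys and field names are hypothetical, and in practice the legacy and migrated extracts would come from Spark/Hive tables rather than in-memory maps.

```scala
// Hypothetical sketch: compare a legacy pipeline's output with its
// replacement's output after migration. Rows are keyed by an id, and
// each row is a map of certified field names to values.
object MigrationCheck {
  type Row = Map[String, String]

  // Returns (ids missing from the new output, ids only in the new
  // output, ids present in both whose field values disagree).
  def diff(legacy: Map[String, Row],
           migrated: Map[String, Row]): (Set[String], Set[String], Set[String]) = {
    val missingInNew = legacy.keySet.diff(migrated.keySet)
    val extraInNew   = migrated.keySet.diff(legacy.keySet)
    val mismatched   = legacy.keySet.intersect(migrated.keySet)
      .filter(id => legacy(id) != migrated(id))
    (missingInNew, extraInNew, mismatched)
  }

  def main(args: Array[String]): Unit = {
    val legacy = Map(
      "o1" -> Map("total" -> "10.00"),
      "o2" -> Map("total" -> "25.50"))
    val migrated = Map(
      "o1" -> Map("total" -> "10.00"),
      "o3" -> Map("total" -> "7.25"))
    val (missing, extra, bad) = diff(legacy, migrated)
    println(s"missing in new: $missing, extra in new: $extra, mismatched: $bad")
  }
}
```

At production scale the same three-way comparison would typically be expressed as anti-joins and an inner join with a field-equality filter over the two tables, rather than materializing maps in memory.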
Minimum Requirements:
• 5 years of experience building scalable Spark data pipelines (preferably in Scala)
• 3-5 years of experience with high-level programming languages such as Java, Scala, or Python
• Proficiency in Spark/MapReduce development and expertise with data processing (ETL) technologies
• Good understanding of distributed storage and compute (S3, Hive, Spark)
• Experience using an ETL framework (e.g., Airflow, Flume, Oozie) to build and deploy production-quality ETL pipelines
• Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and drive effective product solutions
• Working knowledge of relational databases and expertise in SQL query authoring on large datasets
• Experience with big data technologies such as Hadoop, Spark, and Hive
• Experience with Git and Jira (or other source-control and task-management tools)
• Good communication skills that allow smooth collaboration with stakeholders