Required Skills

Big Data Engineer

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 11th Apr 2023

JOB DETAIL

We are hiring a Sr. Big Data Engineer who is proficient in SQL, PySpark (Databricks), and Airflow.

In this data engineering role, you will help enhance and maintain the Instant Ink Business Intelligence system. You will drive your work to completion with hands-on development responsibilities and partner with Data Engineering leaders to implement data engineering pipelines that provide trusted and reliable data to customers.

Responsibilities

  • Design and implement distributed data processing pipelines using Spark, Python, SQL, and other tools and languages prevalent in the Big Data/Lakehouse ecosystem.

  • Analyze designs and determine the coding, programming, and integration activities required based on general objectives.

  • Review and evaluate designs and project activities for compliance with architecture, security, and quality guidelines and standards.

  • Write and execute complete testing plans, protocols, and documentation for the assigned portion of the data system or component; identify defects and create solutions for issues with code and its integration into the data system architecture.

  • Collaborate and communicate with the project team regarding project progress and issue resolution.

  • Work with the data engineering team on all phases of larger, more complex development projects and engage with external users on business and technical requirements.

  • Collaborate with peers, engineers, data scientists, and the project team.

  • Typically interact with high-level Individual Contributors, Managers, and Program Teams on a daily or weekly basis.

What You Bring

  • Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or equivalent.

  • 6+ years of relevant experience, with detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytics tools.

  • 3+ years of experience with cloud-based data warehouses such as Redshift, Snowflake, etc.

  • 3+ years of experience with distributed Big Data ecosystems (Hadoop, Spark, Hive, and Delta Lake).

  • 3+ years of experience with workflow orchestration tools such as Airflow.

  • 3+ years of experience with distributed Big Data platforms such as Databricks, AWS EMR, AWS Glue, etc.

  • Experience leveraging monitoring tools/frameworks such as Splunk, Grafana, CloudWatch, etc.

  • Experience with container management frameworks such as Docker, Kubernetes, ECR, etc.

  • 3+ years of experience working with multiple Big Data file formats (Parquet, Avro, Delta Lake).

  • Experience with CI/CD processes and tools such as Jenkins, Codeway, etc., and source control tools such as GitHub.

  • Strong experience in coding languages such as Python, Scala, and Java.

Knowledge and Skills

  • Fluent in relational database systems and writing complex SQL.

  • Fluent in complex, distributed, and massively parallel systems.

  • Strong analytical and problem-solving skills, with the ability to represent complex algorithms in software.

  • Strong understanding of database technologies and management systems.

  • Strong understanding of data structures and algorithms.

  • Knowledge of database architecture testing methodology, including execution of test plans, debugging, and testing scripts and tools.

Nice to Have

  • Experience with transformation tools such as dbt.

Company Information