Required Skills

SQL Queries, Data Engineering, ETL/ELT, Delta Lake, Azure Data Factory

Work Authorization

  • Citizen

Preferred Employment

  • Full Time

Employment Type

  • Direct Hire

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 22nd Jul 2022

JOB DETAIL

We are hiring a Data Engineer 2 with the skills and expertise below.

You will be responsible for

  • Research, design, and prototyping of highly scalable ETL/ELT pipelines for Machine Learning/Deep Learning models for Classification, Recognition, Attribute Extraction, and Clustering.

  • Shaping data for Machine Learning pipelines.
  • Identifying and implementing data storage and extraction techniques that provide both cost and performance advantages.

  • Developing and maintaining batch and/or streaming data pipelines that move terabytes of data daily.

  • Partitioning data for optimized storage and retrieval.
  • Maintaining security and privacy of data.
  • Harnessing third-party tools and developing internal tooling to monitor and debug pipelines.

  • Building and maintaining machine learning pipelines.
  • Answering key data questions for customers.

Experience and Expertise

  • 5+ years of experience in the relevant areas.
  • Writing and debugging complex SQL queries and managing schemas and indices for terabyte-scale databases.

  • Experience with ETL/ELT and workflow orchestration tools such as Azure Data Factory, Apache Airflow, Apache Beam, etc.

  • Using common data storage formats such as Delta Lake, Parquet, Avro, etc.
  • Transforming and staging data for machine learning in cloud environments:
      • Serverless ETL/ELT and workflow orchestration (e.g. Azure Data Factory, GCP Dataflow)
      • Cloud Storage
      • Databricks
      • Serverless Query Engines (e.g. Azure Synapse, Google BigQuery)
      • Machine Learning services (e.g. Azure ML, AWS SageMaker)
      • Or equivalent platforms/services in competing cloud environments such as AWS or GCP

  • Automating cloud services using REST APIs.
  • Developing in Python.

Good to have

  • Designing and developing in Kubernetes environments.
  • Experience in hybrid Windows/Linux development environments.
  • Implementing data functionality using Azure Spark.

Education

  • Master's or Bachelor's degree in Computer Science, Computer Applications, Business Analytics, or another equivalent program.

Company Information