Required Skills

Python, AWS, ETL

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H-1B Work Permit

Preferred Employment

  • Corp-Corp

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 7th Jan 2021

JOB DETAIL

Client: TCS

Title: Data Engineer with Python, AWS, ETL

Location: Tempe, AZ


Job Description

  • The candidate must have 8-12 years of overall experience in ETL, data warehousing, data lakes, data quality, etc.
  • Minimum of 2 years of experience implementing AWS technologies such as S3, Glue, RDS, PySpark, Python, Confluent Kafka, etc.
  • Minimum of 2 years of experience implementing ETL technologies such as Informatica, BODS, SSIS, etc.
  • Expertise in scripting technologies such as Python, Spark/PySpark, and Linux shell
  • Batch solutions (AWS Glue / AWS Data Pipeline); a minimal batch job sketch follows this list
  • Distributed compute solutions (Spark, EMR)
  • Analyze data through standard SQL (Athena); see the Athena sketch after this list
  • Function-based solutions (AWS Lambda)
  • Distributed storage (Redshift, S3)
  • Real-time solutions (Kafka, AWS Kinesis)
  • Experience with, and understanding of, Agile, Scrum, and CI/CD
  • Experience with Redshift query optimization, conversion, and execution
  • Design and build ETL jobs to support the customer's data lake and enterprise data warehouse
  • Comprehensive understanding of ETL concepts and cross-environment data transfers
  • Write Extract-Transform-Load (ETL) jobs and Spark/Hadoop jobs to calculate business metrics
  • AWS Data Pipeline knowledge to develop ETL for data movement to Redshift; experience mapping source-to-target rules and fields
  • Experience migrating data from on-premises/traditional big data systems, relational databases, PostgreSQL, data lakes, and data warehouses to AWS-native services
  • Working knowledge of the AWS ecosystem and data analytics services such as queuing, notifications, Lambda, AWS Batch, etc.
  • Good knowledge of the AWS environment and services, with an understanding of S3 storage
  • Hands-on experience with data import/export mechanisms between Redshift and S3; see the COPY sketch after this list
  • Hands-on experience writing PostgreSQL-style procedures, functions, and views in a Redshift database
  • Hands-on experience with Amazon Redshift architecture
  • Amazon Redshift experience creating database objects
  • Strong at writing complex queries with nested joins and derived tables
  • Good knowledge of Greenplum/PostgreSQL databases
  • Able to perform technical root cause analysis and outline corrective actions for given problems
  • Proficient SQL and performance-tuning skills for Redshift databases
  • Should be flexible to overlap US/India business hours
  • Fluent in complex, distributed, and massively parallel systems
  • Ability to build data pipelines across multiple systems is a key requirement
  • AWS Lake Formation and EMR knowledge is a plus
  • AWS-certified applicants preferred
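
The bullets above describe batch ETL and metric-calculation work with PySpark on Glue/EMR. Below is a minimal sketch of that kind of job under assumed inputs: the bucket names, paths, and column names (orders, status, order_ts, amount) are hypothetical, not part of the posting.

    # Minimal PySpark batch job: read raw order data from S3, compute a daily
    # revenue metric, and write the result back to S3 as partitioned Parquet.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily-revenue-etl").getOrCreate()

    # Extract: raw CSV order files landed in the data lake (hypothetical bucket/path)
    orders = spark.read.csv("s3://example-raw-bucket/orders/", header=True, inferSchema=True)

    # Transform: keep completed orders and aggregate revenue per day
    daily_revenue = (
        orders
        .filter(F.col("status") == "COMPLETED")
        .groupBy(F.to_date("order_ts").alias("order_date"))
        .agg(F.sum("amount").alias("revenue"))
    )

    # Load: write partitioned Parquet for downstream consumers (Athena, Redshift Spectrum)
    daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-curated-bucket/metrics/daily_revenue/"
    )

    spark.stop()

On Glue specifically, the same transformation logic would typically run inside a Glue job wrapper; the PySpark code itself stays essentially the same.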
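
The Redshift bullets mention importing S3 data into Redshift and writing PostgreSQL-style SQL against it. A common mechanism is the Redshift COPY command; the sketch below issues it through psycopg2 (Redshift speaks the PostgreSQL wire protocol). The cluster endpoint, credentials, table, S3 path, and IAM role are hypothetical placeholders.

    # Bulk-load Parquet files from S3 into a Redshift table with COPY.
    import psycopg2

    conn = psycopg2.connect(
        host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",  # hypothetical endpoint
        port=5439,
        dbname="analytics",
        user="etl_user",
        password="...",  # in practice, fetch from AWS Secrets Manager
    )

    copy_sql = """
        COPY analytics.daily_revenue
        FROM 's3://example-curated-bucket/metrics/daily_revenue/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy-role'
        FORMAT AS PARQUET;
    """

    with conn, conn.cursor() as cur:
        cur.execute(copy_sql)  # the with-block commits on success

    conn.close()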
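
For the "analyze data through standard SQL (Athena)" requirement, a minimal boto3 sketch is shown below: it starts a query, polls until completion, and prints the rows. The database, table, and results bucket are hypothetical.

    # Run an Athena query against the curated metrics table and print the results.
    import time

    import boto3

    athena = boto3.client("athena", region_name="us-west-2")

    resp = athena.start_query_execution(
        QueryString="SELECT order_date, revenue FROM daily_revenue ORDER BY order_date DESC LIMIT 7",
        QueryExecutionContext={"Database": "analytics"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    query_id = resp["QueryExecutionId"]

    # Poll until the query reaches a terminal state
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state == "SUCCEEDED":
        for row in athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]:
            print([col.get("VarCharValue") for col in row["Data"]])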

Company Information