Required Skills

Big Data, Python, PySpark, Apache Spark, NumPy, pandas, AWS EMR

Work Authorization

  • US Citizen

  • Green Card

Preferred Employment

  • Corp-Corp

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 6th Jan 2021

JOB DETAIL

Position    : Big Data Engineer

Location   : McLean, Virginia

Duration   : 12+ Months

MOI          : Phone + Skype

Visa           : GC/USC

Required Experience:

Must Have:

  • Python, PySpark, Apache Spark, NumPy, pandas
  • AWS EMR is a must

Nice to Have:

  • Snowflake

JOB DESCRIPTION:

 

  • Expertise in Python for developing data engineering (ETL) applications using Spark, pandas, NumPy, pyarrow/fastparquet, PySpark, pytest, and behave.
  • Strong experience developing ETL pipelines on Apache Spark; experience with Hadoop, Databricks, etc. is good to have.
  • Must have worked with AWS services such as EMR, S3, EC2, Athena, and ECS; knowledge of IAM, Step Functions, and Lambda is a plus.
  • Must have advanced working knowledge of SQL, including understanding and writing complex queries and tuning query performance.
  • Experience with analytical stores such as Snowflake would be a plus.
  • Experience working on data pipelines that consume a wide variety of data formats (e.g., Parquet, Avro, JSON, and CSV) from AWS S3.
  • Experience creating CI/CD pipelines using Jenkins.
  • Experience with the GitHub version control system.
  • Experience working on machine learning models will be a plus.
  • Experience with stream-processing systems such as Spark Streaming, Storm, etc. will be a plus.
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large, disconnected datasets.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Prior Capital One experience is a big plus.

Company Information