Required Skills

Monte Carlo Data, Metaplane, Soda, Bigeye

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 28th Dec 2023

JOB DETAIL

  • Architect, build, and maintain scalable, reliable data pipelines with robust data quality checks built in, ready for consumption by the analytics and BI layers.
  • Design, develop, and implement low-latency, high-availability, performant data applications, and recommend and implement innovative engineering solutions.
  • Design, develop, test, and debug code in Python, SQL, PySpark, and Bash.
  • Design and implement a data quality framework and apply it to critical data pipelines to make the data layer robust and trustworthy for downstream consumers.
  • Design and develop an orchestration layer for data pipelines written in SQL, Python, and PySpark.
  • Apply and provide guidance on software engineering techniques such as design patterns, code refactoring, framework design, code reusability, code versioning, performance optimization, and continuous integration and delivery (CI/CD) to make the data analytics team robust and efficient.
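
The orchestration duties above would in practice use a framework such as Apache Airflow, but the core idea can be sketched in plain Python: tasks form a DAG and run in dependency order. This is an illustrative sketch only; the function and task names are hypothetical, not from the posting.

```python
from graphlib import TopologicalSorter  # stdlib DAG ordering, Python 3.9+

def run_pipeline(tasks, dependencies):
    """Run callables in dependency order.

    tasks: {task_name: zero-argument callable}
    dependencies: {task_name: set of upstream task names}
    Returns {task_name: result} after all tasks complete.
    """
    # static_order() yields each task only after all of its upstreams.
    order = TopologicalSorter(dependencies).static_order()
    results = {}
    for name in order:
        results[name] = tasks[name]()
    return results

# Example usage: a two-step extract -> transform pipeline.
results = run_pipeline(
    {"extract": lambda: [1, 2, 3], "transform": lambda: "clean"},
    {"transform": {"extract"}},
)
```

A real orchestration layer adds scheduling, retries, and logging on top of this ordering step; frameworks like Airflow or Prefect provide those out of the box.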

Requirements:

  • 5+ years of experience with a Bachelor's or Master's degree in Computer Science, Engineering, Applied Mathematics, or a related field.
  • Extensive hands-on development experience in Python, SQL, and Bash.
  • Extensive experience in performance optimization of data pipelines.
  • Extensive hands-on experience with cloud data warehouse and data lake platforms such as Databricks, Redshift, or Snowflake.
  • Familiarity with building and deploying scalable data solutions using Python, SQL, and PySpark.
  • Extensive experience in developing an end-to-end orchestration layer for data pipelines using frameworks such as Apache Airflow, Prefect, or Databricks Workflows.
  • Familiar with:
    • RESTful web services (REST APIs) for integrating with other services.
    • API gateways such as Apigee for securing web service endpoints.
    • Data pipelines, concurrency, and parallelism.
  • Experience creating and configuring continuous integration/continuous deployment (CI/CD) pipelines to build and deploy applications across environments, following DevOps best practices for promoting code to production.
  • Ability to investigate and repair application defects in any component (front end, business logic, middleware, or database) to improve code quality and consistency, reduce delays, and identify bottlenecks or gaps in the implementation.
  • Ability to write unit tests in Python using a unit-testing library such as pytest.
  • Experience using and implementing data observability platforms such as Monte Carlo Data, Metaplane, Soda, Bigeye, or similar products.
  • Expertise in debugging issues in cloud environments by monitoring logs on the VM or using AWS services such as CloudWatch.
  • Experience with DevOps tools such as Jenkins and Terraform.
  • Experience with observability concepts in software, and with tools such as Splunk, Zenoss, Datadog, or similar.
  • Experience developing and implementing a data quality framework, either home-grown or using open-source frameworks such as Great Expectations, Soda, or Deequ.
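
The last two requirements intersect: a home-grown data quality check is a natural target for a pytest-style unit test. Below is a minimal sketch under that assumption; `check_no_nulls` and the column names are hypothetical examples, not part of the posting, and a production framework such as Great Expectations would replace this hand-rolled check.

```python
def check_no_nulls(rows, required_columns):
    """Return (row_index, column) pairs where a required value is missing.

    rows: list of dicts representing records in a pipeline stage.
    An empty result means the quality check passed.
    """
    failures = []
    for i, row in enumerate(rows):
        for col in required_columns:
            if row.get(col) is None:
                failures.append((i, col))
    return failures

# pytest discovers functions named test_*; bare asserts act as the checks.
def test_check_no_nulls_flags_missing_values():
    rows = [
        {"order_id": 1, "amount": 9.99},
        {"order_id": None, "amount": 5.00},
    ]
    assert check_no_nulls(rows, ["order_id", "amount"]) == [(1, "order_id")]
```

In a pipeline, such a check would run as a gating step after each load, failing the run (or quarantining rows) when the returned list is non-empty.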

 

Company Information