Required Skills

Gen AI, Data Engineering, ETL Jobs, Snowflake, Azure Cloud

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 9th Jan 2026

JOB DETAIL

  • Design, develop, and maintain scalable data pipelines for ingesting, processing, and transforming large volumes of structured and unstructured data.

  • Implement efficient data processing workflows to support the training and evaluation of solutions using large language models, ensuring reliability, scalability, and performance.

  • Address issues related to data quality, pipeline failures, or resource contention, ensuring minimal disruption to systems.

  • Integrate large language models into data pipelines for natural language processing tasks.

  • Work within the Snowflake ecosystem.

  • Deploy, scale, and monitor AI solutions on cloud platforms such as Snowflake, Azure, AWS, and GCP.

  • Communicate with technical and non-technical stakeholders and collaborate with cross-functional teams.

  • Apply cloud cost-management best practices to optimize cloud resource usage and minimize costs.

Data Engineer – Preferred Qualifications:

  • Experience working within the Azure ecosystem, including Azure AI Search, Azure Blob Storage, and Azure Database for PostgreSQL, and an understanding of how to leverage them for data processing, storage, and analytics tasks.
  • Experience with techniques such as data normalization, feature engineering, and data augmentation.
  • Ability to preprocess and clean large datasets efficiently using Azure tools, Python, and other data manipulation tools.
  • Expertise in working with healthcare data standards (e.g., HIPAA and FHIR), sensitive data, and data-masking techniques to mask personally identifiable information (PII) and protected health information (PHI) is essential.
  • In-depth knowledge of search algorithms, indexing techniques, and retrieval models for effective information retrieval tasks. Familiarity with search platforms like Elasticsearch or Azure AI Search is a must.
  • Familiarity with chunking techniques and working with vectors and vector databases like Pinecone.
  • Experience working within the Snowflake ecosystem.
  • Ability to design, develop, and maintain scalable data pipelines for ingesting, processing, and transforming large volumes of structured and unstructured data.
  • Experience with implementing best practices for data storage, retrieval, and access control to ensure data integrity, security, and compliance with regulatory requirements.
  • Ability to implement efficient data processing workflows to support the training and evaluation of solutions using large language models, ensuring reliability, scalability, and performance.
  • Ability to proactively identify and address issues related to data quality, pipeline failures, or resource contention, ensuring minimal disruption to systems.
  • Experience with large language model frameworks, such as LangChain, and knowledge of how to integrate them into data pipelines for natural language processing tasks.


Company Information