Tagcor

Required Skills

Data Engineer

Work Authorization

US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit

Preferred Employment

Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire

Employment Type

Consulting/Contract

Education Qualification

UG: Not Required
PG: Not Required

Other Information

No. of positions: 1
Posted: 27th Nov 2025

JOB DETAIL

Day to day:

- This role focuses on workflow management for datasets in the genomics space and requires the ability to build workflow pipelines through data management and engineering
- Utilize Python, AWS, and Kubernetes to design, develop, optimize, and maintain scalable bioinformatics workflows for processing and analyzing large-scale genomics datasets in the cloud and in-house
- Build a flexible, modular architecture into the workflows so that analysis components and algorithms can be exchanged
- Implement the bioinformatics data processing pipelines using workflow management tools and programming languages such as Python
- Work with team members to perform quality control and validation of pipelines to ensure accuracy and reproducibility of results
- Document the development processes, including code, workflows, data flow diagrams, and standard operating procedures, following software development and DataOps best practices

Qualifications: (Recruiter)

Must Haves:
- Bachelor's degree or higher in Engineering (preference for candidates from outside Biology/sciences)
- Open on years of experience, as long as the candidate has the following:
  - Ability to create robust and scalable data workflows/pipelines
  - Python
  - AWS
  - Kubernetes

Plus Haves:
- Life Sciences/Bioinformatics/Genomics background
Perfect fit:
Disqualifiers:

Interview Process: (Account Manager)

45-minute skills assessment: the candidate will access a GitHub file/doc. Focused on Python skills.
Teams call with Maurcio; panel with his boss and 2 engineers on the team.

Ending Questions: (Account Manager)

When can we put some time on the calendar to walk through the candidates we are coming across? Thursday.
Are there any other recruiting companies or internal HR working on this role? Yes, lots.
Are there any other people in process? No. Interviews will be set up next week.

Project Scope and Brief Description:

The position is for work in the bioinformatics space, principally writing new and/or maintaining existing bioinformatics workflows and pipelines, such as a Eukaryote Genome Annotation Pipeline. As such, the role requires knowledge of cloud technologies (AWS, Kubernetes, container orchestration) as well as experience with industry-level scientific workflow management.

Responsibilities:

- Design, develop, optimize, and maintain scalable bioinformatics workflows for processing and analyzing large-scale genomics datasets in the cloud and in-house
- Build a flexible, modular architecture into the workflows so that analysis components and algorithms can be exchanged
- Implement the bioinformatics data processing pipelines using workflow management tools and programming languages such as Python
- Work with team members to perform quality control and validation of pipelines to ensure accuracy and reproducibility of results
- Document the development processes, including code, workflows, data flow diagrams, and standard operating procedures, following software development and DataOps best practices

Skills / Experience:

Required Qualifications

- Previous experience developing industrial-scale scientific data workflows.
- Strong programming skills in Python, including data science libraries such as NumPy, Pandas, NetworkX, and Matplotlib.
- Working knowledge of container technologies (such as Docker, containerd, or Podman) and container orchestration.
- Experience with data pipeline tools (such as Argo, Ray, Airflow, Redun, or Nextflow).
- Familiarity with the AWS platform (IAM, EC2, S3, CloudWatch, Spot instances) and with Kubernetes, EKS, ECS, AWS Batch, or other cloud compute architectures.
- Ability to work both independently and collaboratively, with good communication skills and an interest in learning new technologies.

Preferred Qualifications

- Specific experience analyzing large genomics datasets
- Familiarity with common bioinformatics tools and data types for the analysis of NextGen sequencing data
- Familiarity with statistical analysis methods and tools commonly used in bioinformatics analyses such as gene expression or ChIP-seq
- Knowledge of additional programming languages such as C, Rust, Perl, R, Unix shell, or others