Required Skills

Apache Spark, AWS Glue, Google Dataflow, Talend, Hadoop, Kafka, Apache Flink, Hive, Presto, MySQL, PostgreSQL, MongoDB, Cassandra, TensorFlow, PyTorch, Scikit-learn, XGBoost, AWS, Azure, GCP, Databricks, Celery, RQ, RabbitMQ, Apache Pulsar, Google Cloud Pub/Sub, Redshift, BigQuery, Synapse Analytics, NGINX, HAProxy, MLflow, DVC

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • Number of positions: 1

  • Posted: 28th Jan 2025

JOB DETAIL

Data Ingestion: Build scalable ETL pipelines using Apache Spark, Talend, AWS Glue, Google Dataflow, and Apache NiFi. Ingest data from APIs, file systems, and databases.
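
As an illustration, a minimal PySpark ingestion sketch that pulls from a database and a file system and lands both in a curated zone; the connection details, table name, and bucket paths are hypothetical:

from pyspark.sql import SparkSession

# Local session for the sketch; in production this would run on a cluster.
spark = SparkSession.builder.appName("ingest-orders").getOrCreate()

# Ingest from a JDBC database source (all connection values are placeholders).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")
    .option("dbtable", "public.orders")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Ingest semi-structured events from a file-system source alongside it.
events = spark.read.json("s3a://raw-bucket/events/")

# Land both datasets in a curated zone as Parquet.
orders.write.mode("overwrite").parquet("s3a://curated-bucket/orders/")
events.write.mode("overwrite").parquet("s3a://curated-bucket/events/")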

Data Transformation/Validation: Use Pandas, Apache Beam, and Dask for data cleaning, transformation, and validation. Automate data quality checks with Pytest and Unittest.
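
For example, a short Pandas cleaning function paired with a Pytest-style data quality check; the column names are hypothetical:

import pandas as pd

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    # Drop duplicates, normalize column names, and coerce types.
    df = df.drop_duplicates()
    df.columns = [c.strip().lower() for c in df.columns]
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    # Reject rows whose date or amount failed coercion.
    return df.dropna(subset=["order_date", "amount"])

def test_clean_orders_drops_bad_rows():
    raw = pd.DataFrame({"Order_Date": ["2025-01-01", "bad"], "Amount": ["10.5", "x"]})
    cleaned = clean_orders(raw)
    assert len(cleaned) == 1
    assert cleaned["amount"].dtype == "float64"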

Big Data Systems: Process large datasets with Hadoop, Kafka, Apache Flink, and Apache Hive. Stream real-time data using Kafka and Google Cloud Pub/Sub.
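
As a sketch of the streaming side, a minimal consumer built with the kafka-python package; the topic name and broker address are hypothetical:

import json
from kafka import KafkaConsumer

# Subscribe to a raw-events topic (broker address is a placeholder).
consumer = KafkaConsumer(
    "raw-events",
    bootstrap_servers=["broker-1:9092"],
    group_id="etl-workers",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Each message is handed to downstream processing; printing stands in for that here.
for message in consumer:
    print(message.value.get("event_type"), message.partition, message.offset)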

Task Queues: Manage asynchronous processing with Celery, RQ, RabbitMQ, or Kafka. Implement retry mechanisms and track task status.
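
A minimal Celery sketch of a retried task follows; the broker URL and the load helper are hypothetical:

from celery import Celery

# RabbitMQ broker URL is a placeholder; the result backend enables status tracking.
app = Celery("pipeline", broker="amqp://guest@rabbitmq-host//", backend="rpc://")

def run_load(batch_id: str) -> None:
    # Stub standing in for the actual load logic.
    pass

@app.task(bind=True, max_retries=3)
def load_batch(self, batch_id: str) -> str:
    try:
        run_load(batch_id)
        return "loaded"
    except ConnectionError as exc:
        # Re-queue with exponential backoff: 30s, 60s, then 120s.
        raise self.retry(exc=exc, countdown=30 * 2 ** self.request.retries)

With the result backend configured, a caller can track task state via load_batch.delay("batch-42").status.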

Scalability: Optimize for performance with distributed processing (Spark, Flink), parallelization (joblib), and data partitioning.
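
For the parallelization piece, a small joblib sketch over a partitioned dataset; the partition paths and the per-partition function are hypothetical:

from joblib import Parallel, delayed

def process_partition(path: str) -> int:
    # Stub standing in for real per-partition work (e.g., parse and aggregate one file).
    return len(path)

partitions = [f"data/part-{i:04d}.parquet" for i in range(8)]

# Fan the partitions out across all available CPU cores.
results = Parallel(n_jobs=-1)(delayed(process_partition)(p) for p in partitions)
print(sum(results))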

Cloud Storage: Work with AWS, Azure, GCP, and Databricks. Store and manage data with S3, BigQuery, Redshift, Synapse Analytics, and HDFS.
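
On the AWS side, for instance, a minimal boto3 round trip; the bucket, key, and local file names are hypothetical:

import boto3

s3 = boto3.client("s3")

# Upload a local Parquet file to a curated bucket (names are placeholders).
s3.upload_file("orders.parquet", "curated-bucket", "orders/2025/01/orders.parquet")

# List objects under the prefix to verify the write.
response = s3.list_objects_v2(Bucket="curated-bucket", Prefix="orders/2025/01/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])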
 

Company Information