US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit
Corp-to-Corp
Consulting/Contract
UG: Not Required
PG: Not Required
No. of positions: 1
Posted: 16th Aug 2021
• Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
• Designs and implements Spark Structured Streaming & API workflows.
• Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
• Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
• Develops reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms that combine a variety of sources using ETL/ELT tools or scripting languages.
• Develops physical data models and implements data storage architectures as per design guidelines.
• Analyzes complex data elements and systems, data flows, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models.
• Participates in testing and troubleshooting of data pipelines.
• Develops and operates large-scale data storage and processing solutions using various distributed and cloud-based platforms for storing data (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
• Uses agile development methodologies such as Scrum and Kanban, along with DevOps practices and a continuous improvement cycle, for data-driven applications; attends daily stand-ups.
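The ETL/ELT pipeline responsibility above can be sketched in minimal form as an extract-transform-load flow with a simple data-quality check. This is a pure-Python stand-in, not the actual stack (which would use Spark, Kafka, or an ETL tool); the feed, field names, and sink are illustrative assumptions:

```python
import csv
import io

# Hypothetical raw feed; in practice this would come from a relational,
# event-based, or unstructured source.
RAW = """id,amount,currency
1,10.50,usd
2,3.25,usd
3,7.00,eur
"""

def extract(raw: str):
    """Extract: parse the raw CSV feed into dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: normalize types and uppercase the currency codes."""
    return [
        {"id": int(r["id"]),
         "amount": float(r["amount"]),
         "currency": r["currency"].upper()}
        for r in rows
    ]

def load(rows, sink):
    """Load: append rows to the target sink (a plain list here),
    with a basic data-quality/integrity check per row."""
    for r in rows:
        assert r["amount"] >= 0, f"data-quality check failed: {r}"
        sink.append(r)
    return sink

warehouse = []
load(transform(extract(RAW)), warehouse)
```

In a production pipeline each stage would additionally emit metrics for the monitoring and alerting mechanisms the bullet describes.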
Skills:
- Hands-on experience with Spark Structured Streaming & API workflows
- Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, and Microsoft Azure Databricks
- SQL query language
- Experience with clustered, cloud-based compute implementations
- Familiarity developing applications requiring large file movement in a cloud-based environment
- Exposure to Agile software development
- Exposure to building analytical solutions
- Exposure to IoT technology
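The SQL query language skill listed above can be illustrated with a minimal, self-contained aggregation query. SQLite stands in for the real engine (Hive, Databricks SQL, etc.), and the `events` table and its columns are hypothetical:

```python
import sqlite3

# In-memory SQLite database as a stand-in for the actual warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (source TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("kafka", 10.0), ("kafka", 5.0), ("hive", 2.5)],
)

# Aggregate per source -- the kind of query used to monitor pipeline output.
rows = conn.execute(
    "SELECT source, SUM(amount) AS total "
    "FROM events GROUP BY source ORDER BY source"
).fetchall()
```

The same `GROUP BY` pattern translates directly to HiveQL or Spark SQL against much larger, distributed tables.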