Lead data engineers and analysts to deliver data sets and analysis results per business requirements
Assemble large, complex data sets that meet functional and non-functional business requirements
Automate manual processes, optimize data delivery, and recommend platform improvements for greater scalability
Collaborate with initiative leads to optimize and enhance new capabilities
Mentor team in migrating Hadoop on-prem to cloud/AWS
Create and maintain optimal data pipeline architecture.
Present analysis results and recommendations using PowerPoint
Mandatory Requirements (10+ years of experience)
This is a 100% onsite position.
10+ years of IT experience with strong expertise in SDLC/Agile
3+ years on healthcare IT projects
5+ years in at least one programming language (Python, Java, Scala, Spark)
Hands-on experience writing advanced SQL queries and familiarity with a variety of databases
Experience building and optimizing big data pipelines, architectures, and data sets
Tools: Big Data, Spark, Python, Scala
Cloud platforms: AWS (S3), Snowflake
Experience with Hadoop file formats such as ORC, Avro, Parquet, and CSV
Hands-on experience migrating on-prem Hadoop to cloud/AWS
Experience analyzing data using big data platforms, AWS, and Snowflake
Strong analytical skills in relating multiple data sets and identifying patterns
Experience with NoSQL databases such as MongoDB and Cassandra
Experience visualizing data sets using Power BI or Tableau
Desired skills:
Kafka streaming and shell scripting
Scheduling tools such as Control-M and Oozie
NoSQL databases such as MongoDB
Implement Python Flask APIs to share data insights with digital systems