US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit
Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire
Consulting/Contract
UG: Not Required
PG: Not Required
No. of positions: 1
Posted: 22nd Nov 2023
Requirements:
· Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake
· Expert-level hands-on coding experience in Spark with Scala, Python, or PySpark
· Expert proficiency in Python, C++, Java, R, and SQL
· Mid-level knowledge of code versioning tools such as Git, Bitbucket, or SVN
· In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib
· Experience with IoT, event-driven, and microservices architectures in the cloud, including private and public cloud architectures, their pros/cons, and migration considerations
· Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services
· Expertise in using Spark SQL with various data sources such as JSON, Parquet, and key-value pair formats
· Extensive hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
· Experience using Azure DevOps and CI/CD as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence
· Experience in creating tables, partitioning, bucketing, loading and aggregating data using Spark SQL/Scala
· Ability to build ingestion into ADLS and enable a BI layer for analytics
· Experience with Machine Learning Studio, Stream Analytics, Event Hubs/IoT Hub, and Cosmos DB
· Strong understanding of data modeling and of defining conceptual, logical, and physical data models
· Proficiency in the architecture design, build, and optimization of big data collection, ingestion, storage, processing, and visualization
· Working knowledge of RESTful APIs, OAuth2 authorization framework and security best practices for API Gateways
· Familiarity with unstructured data sets (e.g., voice, image, log files, social media posts, email)
· Experience handling escalations of customers' operational issues
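The Spark SQL requirements above (querying JSON/Parquet sources, creating tables, partitioning, and bucketing) can be illustrated with a short sketch; the table name, columns, and paths are hypothetical placeholders, not part of this role's actual environment:

```sql
-- Hypothetical example: a partitioned, bucketed Parquet table in Spark SQL.
CREATE TABLE IF NOT EXISTS events (
  user_id    BIGINT,
  action     STRING,
  amount     DOUBLE,
  event_date DATE
)
USING PARQUET
PARTITIONED BY (event_date)               -- lets Spark prune scans by date
CLUSTERED BY (user_id) INTO 16 BUCKETS;   -- co-locates rows for joins/aggregations

-- Loading from a JSON source queried in place, then aggregating:
INSERT INTO events
SELECT user_id, action, amount, CAST(ts AS DATE)
FROM json.`/landing/events/`;             -- path is a placeholder

SELECT event_date, COUNT(*) AS event_count
FROM events
GROUP BY event_date;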
Responsibilities:
· Work closely with team members to lead and drive enterprise solutions, advising on key decision points on trade-offs, best practices, and risk mitigation
· Guide customers in transforming big data projects, including development and deployment of big data and AI applications
· Educate clients on cloud technologies and influence the direction of the solution
· Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale, are highly available, fault-tolerant, self-monitoring, and serviceable
· Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure
· Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling
· Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery
· Build infrastructure required for optimal extraction, loading and transformation of data from a wide variety of data sources
· Work with the developers to maintain and monitor scalable data pipelines
· Perform root cause analysis to answer specific business questions and identify opportunities for process improvement
· Build out new API integrations to support continuing increases in data volume and complexity
· Implement processes and systems to monitor data quality and security, ensuring production data is accurate and available for key stakeholders and the business processes that depend on it
· Employ change management best practices to ensure that data remains readily accessible to the business
· Maintain tools, processes and associated documentation to manage API gateways and underlying infrastructure
· Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs
· Apply master data management (MDM) experience using data governance solutions
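As one illustration of the data-quality monitoring responsibility above, the sketch below shows a minimal, hypothetical batch-level check in plain Python; the field names (`id`, `amount`) and the failure-rate threshold are assumptions for illustration, and in a Spark pipeline the same logic would typically run as DataFrame-level expectations rather than a row loop:

```python
# Minimal, hypothetical data-quality gate: counts null and out-of-range
# values per batch and flags the batch if the failure rate exceeds a threshold.
from dataclasses import dataclass

@dataclass
class QualityReport:
    total: int
    failed: int

    @property
    def failure_rate(self) -> float:
        # Rate of failed rows in the batch; 0.0 for an empty batch.
        return self.failed / self.total if self.total else 0.0

def check_batch(rows, max_failure_rate=0.01):
    """Validate a batch of event records (assumed schema: id, amount).

    Returns (passed, report): `passed` is False when the share of bad
    rows exceeds `max_failure_rate`.
    """
    failed = 0
    for row in rows:
        if row.get("id") is None or row.get("amount") is None:
            failed += 1   # null check on required fields
        elif row["amount"] < 0:
            failed += 1   # simple range check
    report = QualityReport(total=len(rows), failed=failed)
    return report.failure_rate <= max_failure_rate, report
```

A gate like this would typically run after each ingestion step, with failed batches quarantined and alerted on rather than loaded into the warehouse.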
Qualifications:
· Overall experience of 12+ years in the IT field.
· 2+ years of hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for near real-time data warehouse, and machine learning solutions.
· 3+ years of design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architecture and related solutions
· 5+ years’ experience in a software development, data engineering, or data analytics field using Python, Scala, Spark, Java, or equivalent technologies
· Bachelor’s or Master’s degree in Big Data, Computer Science, Engineering, Mathematics, or similar area of study or equivalent work experience
· Nice to have: advanced technical certifications, such as:
· Azure Solutions Architect Expert
· AWS Certified Data Analytics; DASCA Big Data Engineering and Analytics
· AWS Certified Cloud Practitioner; AWS Certified Solutions Architect
· Google Cloud Professional certification