Required Skills

HBase

Work Authorization

  • Citizen

Preferred Employment

  • Full Time

Employment Type

  • Direct Hire

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 27th Jul 2022

Job Detail

  • You would be required to code in Scala and PySpark daily on cloud as well as on-prem infrastructure
  • Build data models to store the data in the most optimized manner
  • Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Implement the ETL process and optimal data pipeline architecture (a minimal illustrative sketch follows this list)
  • Monitor performance and advise on any necessary infrastructure changes
  • Create data tools for the analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader
  • Work with data and analytics experts to strive for greater functionality in our data systems
  • Proactively identify potential production issues and recommend and implement solutions
  • Must be able to write quality code and build secure, highly available systems
  • Create design documents that describe the functionality, capacity, architecture, and process
  • Review peers' code and pipelines for optimization issues and code standards before deploying to production
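
As a rough illustration of the ETL and pipeline work described in the list above, the sketch below shows a minimal PySpark extract-transform-load step. It is only an example of the kind of daily coding the role involves; the storage paths, column names, and aggregation logic are hypothetical placeholders, not part of any actual codebase.

    # Minimal PySpark ETL sketch; all paths, columns, and names are
    # illustrative placeholders only.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-etl").getOrCreate()

    # Extract: read raw events from cloud or on-prem storage (path is illustrative).
    raw = spark.read.parquet("s3a://example-bucket/raw/events/")

    # Transform: deduplicate, derive a date partition column, and aggregate.
    daily = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date", "user_id")
           .agg(F.count("*").alias("event_count"))
    )

    # Load: write an optimized, partitioned table for downstream analytics.
    daily.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3a://example-bucket/curated/daily_user_events/"
    )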

What we are looking for

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and big data technologies
  • Proficient understanding of distributed computing principles
  • Experience working with batch-processing and real-time systems using various open-source technologies such as NoSQL, Spark, Pig, Hive, and Apache Airflow
  • Experience implementing complex projects dealing with considerable data sizes (petabyte scale)
  • Knowledge of optimization techniques (performance, scalability, monitoring, etc.)
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Creation of DAGs for data engineering (see the illustrative sketch after this list)
  • Expert at Python/Scala programming, especially for data engineering/ETL purposes
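
As a hedged illustration of "creation of DAGs for data engineering" mentioned above, the sketch below outlines a minimal Apache Airflow DAG. The DAG id, schedule, and task callables are hypothetical placeholders that only show the general shape of such a pipeline.

    # Illustrative Airflow DAG sketch; the DAG id, schedule, and task bodies
    # are hypothetical placeholders.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Pull data from a source system (placeholder).
        pass

    def transform():
        # Clean and reshape the extracted data (placeholder).
        pass

    def load():
        # Write the result to the target store (placeholder).
        pass

    with DAG(
        dag_id="example_etl_dag",
        start_date=datetime(2022, 7, 27),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        # Run the tasks in extract -> transform -> load order.
        t_extract >> t_transform >> t_load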

Company Information