Required Skills

Hive, Unit Testing, Hadoop, Big Data, Spark, Data Analytics, Python, SQL

Work Authorization

  • Citizen

Preferred Employment

  • Full Time

Employment Type

  • Direct Hire

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 31st May 2022

Job Detail

  • This position is responsible for hands-on design and development in Spark and Python (PySpark), along with other Hadoop ecosystem components such as HDFS, Hive, Hue, Impala, and Zeppelin
  • The purpose of the position includes analysis, design, and implementation of business requirements using PySpark/Python
  • Cloudera Hadoop development around Big Data
  • Solid SQL experience
  • Development experience with PySpark/SparkSQL, with good analytical and debugging skills
  • Development work building new solutions around Hadoop and automating operational tasks
  • Assisting the team and troubleshooting issues
  • Duties and Responsibilities:
  • Design and development around PySpark, Python, and the Hadoop framework
  • Experience with RDDs and DataFrames in Spark
  • Experience with data analytics and working knowledge of big data infrastructure, including Hadoop ecosystem components such as HDFS, Hive, and Spark
  • Work with gigabytes/terabytes of data; must understand the challenges of transforming and enriching such large datasets
  • Provide effective solutions to address business problems, both strategic and tactical
  • Collaborate with team members, project managers, business analysts, and business users in conceptualizing, estimating, and developing new solutions and enhancements
  • Work closely with stakeholders to define and refine the big data platform to achieve company product and business objectives
  • Collaborate with other technology teams and architects to define and develop cross-functional technology stack interactions
  • Read, extract, transform, stage, and load data to multiple targets, including Hadoop and Oracle
  • Develop automation scripts around the Hadoop framework to automate processes and existing flows
  • Modify existing programs/code for new requirements
  • Unit testing and debugging
  • Perform root cause analysis (RCA) for any failed processes
  • Document existing processes and analyze them for potential automation and performance improvements
  • Convert business requirements into technical design specifications and execute on them
  • Execute new development as per design specifications and business rules/requirements
  • Job Description: epsilon.com
  • Participate in code reviews and keep applications/code base in sync with version control
  • Effective communication; self-motivated, with the ability to work independently while staying aligned within a team environment
  • Required Skills: Bachelor's in Computer Science (or equivalent), or Master's, with 3+ years of experience in big data ingestion, transformation, and staging using the following technologies/principles/methodologies:
  • Design and solution capabilities
  • Rich experience with Hadoop distributed frameworks, handling large amounts of big data using Apache Spark and the Hadoop ecosystem
  • Python, Spark (SparkSQL, PySpark), HDFS, Hive, Impala, Hue, Cloudera Hadoop, Zeppelin
  • Proficient knowledge of SQL with any RDBMS
  • Knowledge of Oracle databases and PL/SQL
  • Working knowledge of, and good experience in, a Unix environment; capable of writing Unix shell scripts (ksh, bash)
  • Basic Hadoop administration knowledge
  • DevOps knowledge is an added advantage
  • Ability to work within deadlines and effectively prioritize and execute tasks
  • Strong communication skills (verbal and written), with the ability to communicate across teams, internal and external, at all levels
  • Certifications (any one of these):
  • CCA Spark and Hadoop Developer
  • MapR Certified Spark Developer (MCSD)
  • MapR Certified Hadoop Developer (MCHD)
  • HDP Certified Apache Spark Developer
  • HDP Certified Developer
  • Preferred Skills (Technical):
  • Working knowledge of Oracle databases and PL/SQL
  • Hadoop admin / DevOps
  • Experience with JIRA for user-story/bug tracking
  • Experience with Git/Bitbucket
  • Preferred Skills (Non-Technical):
  • Familiarity with SDLC and development/migration processes
  • Good analytical thinking and problem-solving skills
  • Ability to diagnose and troubleshoot problems quickly
  • Motivated to learn new technologies, applications, and domains
  • Possess an appetite for learning through exploration and reverse engineering
  • Strong time management skills
  • Ability to take full ownership of tasks and projects
  • Behavioral Attributes:
  • Team player with excellent interpersonal skills
  • Good verbal and written communication
  • Possess a can-do attitude to overcome any kind of challenge
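The extract/transform/stage/load duties listed above can be illustrated with a minimal sketch. The example below uses Python's built-in sqlite3 as a stand-in for the Hive/Oracle targets named in the posting; the `orders` and `daily_totals` tables and all column names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the Hive/Oracle targets named in the posting.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Extract: raw source rows (in practice these would come from HDFS/Hive).
cur.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2022-05-01", 120.0), (2, "2022-05-01", 80.0), (3, "2022-05-02", 200.0)],
)

# Transform + stage: aggregate per day into a staging table.
cur.execute(
    "CREATE TABLE daily_totals AS "
    "SELECT order_date, SUM(amount) AS total, COUNT(*) AS n "
    "FROM orders GROUP BY order_date"
)

# Load: read the staged result back out (here, just print it).
rows = cur.execute(
    "SELECT order_date, total, n FROM daily_totals ORDER BY order_date"
).fetchall()
print(rows)  # [('2022-05-01', 200.0, 2), ('2022-05-02', 200.0, 1)]
```

In a real pipeline of the kind this posting describes, the same shape would typically be expressed in PySpark/SparkSQL (read from HDFS, group and aggregate, write to the target), with unit tests and root cause analysis around each stage.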

Required Skills

  • SQL, Development, Debugging Skills, Spark, Data Analytics, Big Data Infrastructure, Programming, Computer Science, Big Data, Hadoop Distributed Frameworks, Unix, Unix Shell, Communication Skills, Ability to communicate, JIRA, Bug Tracking, GIT, Bitbucket, Time Management Skills, Team Player, Interpersonal Skills, Hands-On Design, Python, Hadoop, HDFS, Hive, Impala, Analysis, Design and Implementation, Business Requirements, Hadoop Development, Development Work, Automation, Troubleshooting, Design and Development, Hadoop Framework, Datasets, Strategic, Collaboration, Estimating, Developing, Cross-Functional, Oracle, Develop Automation, Unit Testing, Debugging, Root Cause Analysis, Potential Automation, Technical Design Specifications, New Development, Design Specifications, Business Rules, Code Reviews, Code Base, Version Control, Effective Communication, Motivated, Ability to work independently, Design, Apache Spark, RDBMS, Oracle Databases, PL/SQL, Ksh, Bash, Hadoop Administration, DevOps, Prioritize, Non-Technical, SDLC, Migration, Analytical Thinking, Problem-Solving Skills, Learning, Reverse Engineering, Take Full Ownership, Written Communication, Can-Do Attitude

Company Information