Required Skills

Data Warehouse Designer or Developer

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 8th Oct 2021

JOB DETAIL

  • Participate in team activities, design discussions, stand-up meetings, and planning reviews with the team.

  • Perform data analysis, data profiling, data quality checks, and data ingestion across various layers using Big Data/Hadoop/Hive/Impala queries, PySpark programs, and UNIX shell scripts.

  • Follow the organization's coding standards document; create mappings, sessions, and workflows per the mapping specification document.

  • Perform gap and impact analysis of ETL and IOP jobs for new requirements and enhancements.

  • Create jobs in Hadoop using Sqoop, PySpark, and StreamSets to meet business user needs.

  • Create mock-up data, perform unit testing, and capture result sets for jobs developed in lower environments.

  • Update the production support runbook and the Control-M schedule document with each production release.

  • Create and update design documents, providing detailed descriptions of workflows after every production release.

  • Continuously monitor production data loads, fix issues, update the issue tracker, and identify performance problems.

  • Performance-tune long-running ETL/ELT jobs by creating partitions, enabling full loads, and applying other standard approaches.

  • Perform quality assurance checks and post-load reconciliation, and communicate with vendors to receive corrected data.

  • Participate in ETL/ELT code reviews and design reusable frameworks.

  • Create Remedy/ServiceNow tickets to fix production issues, and create support requests to deploy Database, Hadoop, Hive, Impala, UNIX, ETL/ELT, and SAS code to the UAT environment.

  • Create Remedy/ServiceNow tickets and/or incidents to trigger Control-M jobs for FTP and ETL/ELT jobs on an ad hoc, daily, weekly, monthly, or quarterly basis as needed.

  • Model and create STAGE/ODS/data warehouse Hive and Impala tables as needed.

  • Create change requests, work plans, test results, and BCAB checklist documents for code deployment to the production environment, and validate the code post-deployment.

  • Work with the Hadoop admin, ETL, and SAS admin teams on code deployments and health checks.

  • Create reusable UNIX shell scripts for file archival, file validation, and Hadoop workflow looping.

  • Create a reusable Audit Balance Control framework that captures reconciliation results and mapping parameters and variables, serving as a single point of reference for workflows.

  • Create PySpark programs to ingest historical and incremental data (see the illustrative sketch after this list).

  • Create Sqoop scripts to ingest historical data from the EDW Oracle database into Hadoop IOP, and create Hive table and Impala view creation scripts for dimension tables.

  • Participate in meetings to continuously upgrade functional and technical expertise.
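For illustration only (not part of the original posting): a minimal PySpark sketch of the kind of incremental ingestion job described above. The EDW host, SALES source table, LOAD_DT watermark column, and stage.sales_incremental target are all hypothetical placeholders.

    # A minimal incremental-ingest sketch; all names below are hypothetical
    # placeholders, not details taken from this posting.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("incremental-ingest-sketch")
        .enableHiveSupport()  # allows writing to Hive-managed tables
        .getOrCreate()
    )

    # Pull only the incremental slice from the (hypothetical) Oracle source.
    incoming = (
        spark.read.format("jdbc")
        .option("url", "jdbc:oracle:thin:@//edw-host:1521/EDW")
        .option("dbtable", "(SELECT * FROM SALES WHERE LOAD_DT > DATE '2021-10-01')")
        .option("user", "etl_user")
        .option("password", "***")
        .option("driver", "oracle.jdbc.OracleDriver")
        .load()
    )

    # Land the slice as Parquet in a partitioned Hive staging table.
    (
        incoming.write
        .mode("append")
        .format("parquet")
        .partitionBy("LOAD_DT")
        .saveAsTable("stage.sales_incremental")
    )

A historical (one-time) load would follow the same shape with mode("overwrite") and no watermark predicate; Sqoop or StreamSets could land the same data, with PySpark handling downstream transformation.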

Languages

Must have

English

Native or bilingual proficiency

Experience Required

REQUIRED Skill Sets:

  • 8+ years of experience with Big Data/Hadoop on data warehousing or data integration projects.

  • Analysis, design, development, support, and enhancement of ETL/ELT in a data warehouse environment with Cloudera Big Data technologies (minimum of 8 years' experience with Hadoop, MapReduce, Sqoop, PySpark, Spark, HDFS, Hive, Impala, StreamSets, Kudu, Oozie, Hue, Kafka, YARN, Python, Flume, ZooKeeper, Sentry, and Cloudera Navigator), along with Oracle SQL/PL-SQL, UNIX commands, and shell scripting.

  • Strong development experience (minimum of 8 years) creating Sqoop scripts, PySpark programs, HDFS commands, HDFS file formats (Parquet, Avro, ORC, etc.), StreamSets pipelines, job schedules, Hive/Impala queries, and UNIX shell scripts.

  • Minimum of 8 years' experience writing Hadoop/Hive/Impala scripts to gather table statistics after data loads (illustrated in the sketch after this list).

  • Strong SQL experience (Oracle and Hadoop Hive/Impala).

  • Experience writing complex SQL queries and tuning them based on Hadoop/Hive/Impala explain-plan results.

  • Proven ability to write high-quality code.

  • Experience building data sets, and familiarity with PHI and PII data.

  • Expertise implementing complex ETL/ELT logic.

  • Ability to develop and enforce a strong reconciliation process.

  • Accountability for ETL/ELT design documentation.

  • Good knowledge of Big Data, Hadoop, Hive, and Impala databases, data security, and dimensional model design.

  • Basic knowledge of UNIX/Linux shell scripting.

  • Ability to apply ETL/ELT standards and practices toward establishing and following a centralized metadata repository.

  • Good experience working with Visio, Excel, PowerPoint, Word, etc.

  • Effective communication, presentation, and organizational skills.

  • Familiarity with project management methodologies such as Waterfall and Agile.

  • Ability to establish priorities and follow through on projects with minimal supervision, paying close attention to detail.

  • Required education: BS/BA degree or a combination of education and experience.
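For illustration only (not part of the original posting): a minimal sketch of the post-load statistics gathering described above, issued through Spark SQL against Hive. The table and column names are hypothetical placeholders; on Impala the equivalent statement is COMPUTE STATS.

    # Gather table- and column-level statistics after a data load.
    # Names are hypothetical placeholders; ANALYZE TABLE is Spark/Hive SQL,
    # while Impala uses COMPUTE STATS for the same purpose.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("ANALYZE TABLE stage.sales_incremental COMPUTE STATISTICS")
    spark.sql(
        "ANALYZE TABLE stage.sales_incremental "
        "COMPUTE STATISTICS FOR COLUMNS sale_id, load_dt"
    )

    # Simple post-load reconciliation input: a row count to compare
    # against the source system's count for the same slice.
    print(spark.table("stage.sales_incremental").count())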

DESIRED Skill Sets:

  • Demonstrated effective leadership, analytical, and problem-solving skills.

  • Excellent written and oral communication skills with technical and business teams.

  • Ability to work independently as well as part of a team.

  • Ability to stay abreast of current technologies in the assigned IT area.

  • Ability to establish facts and draw valid conclusions.

  • Ability to recognize patterns and opportunities for improvement throughout the entire organization.

  • Ability to discern critical from minor problems and innovate new solutions.


Company Information