UG: Not Required
PG: Not Required
No. of positions: 1
Posted: 1st Jun 2022
We are seeking those with a passion for working on complex problems not offered elsewhere in the industry.
As the tech lead in the media data space, you will take complete ownership of a client’s system, manage and grow the global team that supports it, and act as a thought leader, helping the client plan and see through their migration to the cloud. You’ll be responsible for expanding and optimizing the client’s data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. You will support software developers, database architects, data analysts, and data scientists on data initiatives, and will ensure that the optimal data delivery architecture is consistent across ongoing projects. As a senior resource, you must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products. The right candidate will be excited by the prospect of optimizing or even redesigning the client’s data architecture to support their next generation of products and data initiatives.
Deep knowledge of the various components of the Hadoop ecosystem and Java, and experience in applying them to practical problems, is a must.
· Senior data engineer with 6-8 years of hands-on experience working with ETL and Hadoop
· Oozie, Pig, Hive, Spark, HBase, SQL
· Writing shell scripts on Unix platforms
· Experience analyzing and extracting value from large, unstructured, and disconnected datasets
· Understanding of open-source development processes, as well as distributed systems and systems programming
· Git, JIRA, ServiceNow
· Experience with AWS, Linux administration, Presto, Screwdriver, RDBMS, C++
· Understanding of Data Models (Conceptual, Logical, Physical, Dimensional, Object, Relational, Data Model Design), Data Profiling, and Root Cause Analysis
· Good knowledge of writing complex queries in Teradata, DB2, and Oracle PL/SQL
Nice to have:
· Knowledge of any one of the following is a plus: Redis, Storm, Kafka, Splunk, Druid, Vertica, Looker, Jenkins, Groovy, Python, Scala, REST APIs, Jersey and JAX-RS, Maven, Oracle Database, NoSQL databases, geospatial data, Cassandra, Postgres, Sadana, Vespa, Sherpa, Elasticsearch, Avro, Talon, MapReduce, Azkaban, Luigi, Airflow, Spark Streaming
· Familiarity with GDPR, CCPA, COPPA