Required Skills

Agile, Cloud Architecture, data pipeline architecture, HDI & Spark

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 21st Nov 2020

JOB DETAIL

  • Work with other data engineers, data ingestion specialists, and experts across the company to consolidate methods and tool standards where practical.
  • Work independently on complex data engineering problems to support data science strategy of products.
  • Use broad and deep technical knowledge in the data engineering space to tackle complex data problems for product teams, with a core focus on using technical expertise.
  • Improve the data availability by acting as a liaison between Lab teams and source systems.
  • Collect, blend, and transform data using ETL tools, database management system tools, and code development.
  • Implement data models and structure data in ready-for-business-consumption formats.
  • Aggregate data across various warehousing models (e.g. OLAP cubes, star schemas, etc.) for BI purposes.
  • Collaborate with business teams and understand how data needs to be structured for consumption.

Must-Have Skills

  • 5 or more years of experience as a Data Engineer.
  • 5 years in an Agile environment.
  • Cloud Architecture experience, preferably in an Azure environment.
  • Building and maintaining optimal data pipeline architecture.
  • Assembly of large, complex data sets that meet functional / non-functional business requirements.
  • Ability to identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, adding data quality checks, minimizing Cloud costs, etc.
  • Experience building the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Databricks, and NoSQL.
  • Experience building analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
  • Document and communicate standard methods and tools used.
  • Experience using the following software/tools:
  • Big data tools: Hadoop, HDI, & Spark
  • Relational SQL and NoSQL databases, including COSMOS
  • Data pipeline and workflow management tools: Databricks (Spark), ADF, Dataflow
  • Microsoft Azure
  • Stream-processing systems: Storm, Streaming-Analytics, IoT Hub, Event Hub
  • Object-oriented/object function scripting languages: Python, Scala, SQL.
  • Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.

 Duration: 12+ month contract

Start Date / Location

Nov 2020, Houston, TX (Galleria), Onsite/Remote

Thanks and Regards

Sunny

832-626-9425

Company Information