6 months to 1 year of real-world business experience in a Data Engineer role.
Bachelor's degree, preferably in a quantitative subject.
Familiarity with a variety of technical tools (e.g. NumPy, pandas) for data manipulation.
Strong experience with scripting in Python.
Experience with Apache Spark and Apache Flink (highly desirable).
Experience working with Apache Kafka and Kafka Connect.
Very strong analytical skills in SQL.
Knowledge of Machine Learning and Data Science would be an added advantage.
Good communication skills required.
Key Responsibilities
Create and maintain optimal data pipeline architecture for both stream processing and batch processing.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Business, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Work with data and analytics experts to strive for greater functionality in our data systems.