Responsibilities
- Analyze and organize raw data
- Build data systems and pipelines
- Evaluate business needs and objectives
- Interpret trends and patterns
- Conduct complex data analysis and report on results
- Prepare data for prescriptive and predictive modelling
- Build algorithms and prototypes
- Combine raw information from different sources
- Explore ways to enhance data quality and reliability
- Identify opportunities for data acquisition
- Develop analytical tools and programs
- Collaborate with data scientists and architects on several projects
To be successful in this role, you will have the following experience:
- 4+ years of experience working with the Big Data ecosystem and 5+ years of software development experience
- Technical expertise with data models, data mining, and segmentation techniques
- Hands-on experience with SQL database design
- Experience designing and implementing data pipelines in Spark/Python for data lakes and data warehouses
- Experience with Palantir or AWS, including any of EC2, S3, IAM, EKS, and RDS
- Experience working with distributed frameworks such as Hadoop and Spark
- Experience querying with Big Data tools such as Hive, Impala, and Presto
- Good understanding of data architecture principles, including data modelling
- Exposure to working in an Agile environment
- A broad knowledge of technical solutions, design patterns, and code for medium-to-complex applications deployed in a distributed computing environment
- Experience handling data pipelines within the HR domain
- Experience in Python and key data libraries (Pandas, PySpark)
- Working knowledge of API-based integration and Lambda architecture for modern systems
- Working knowledge of data streaming and messaging systems
- Great numerical and analytical skills
- Degree in Computer Science, IT, or similar field; a Master’s is a plus