Responsibilities and essential job functions include but are not limited to the following:
- Develop standardized processes for data mining, data modeling, and data production
- Install and update disaster recovery procedures
- Recommend ways to continually improve data reliability and quality
- Contribute to the implementation of complex programs
- Develop and maintain documentation related to all assigned systems and projects
- Write queries and functions that run over massive volumes of data in a distributed query engine
- Perform root cause analysis to identify permanent resolutions
- Learn new concepts and technologies as required to perform the work
- Research new uses for existing data
Job Qualifications
- Bachelor’s degree in computer science, management information systems, or related discipline, or equivalent work experience
- We are looking for strong hands-on knowledge in the following areas:
- Strong experience with SQL, Spark, PySpark, AWS, and Python is a must
- Hands-on experience with data pipeline development and ingest patterns
- Minimum of 2-3 years of experience with NoSQL database technologies
- Working knowledge of Data Warehouse concepts
- Understanding of distributed database systems
- Minimum 1 year of CI/CD experience
- Exposure to service-oriented architecture (SOA)
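The qualifications above center on SQL over large datasets and on data reliability. As a purely illustrative sketch of that kind of work (the `events` table, its columns, and its rows are hypothetical, and in-memory SQLite stands in for a distributed query engine), a simple data-quality check in SQL from Python might look like:

```python
import sqlite3

# Hypothetical dataset for illustration only: an events table with a
# missing user_id and a duplicated id, the kinds of defects a
# data-quality check would surface.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 10, 5.0), (2, 10, 7.5), (3, None, 2.0), (3, None, 2.0)],
)

# Count rows with a missing user_id.
null_users = conn.execute(
    "SELECT COUNT(*) FROM events WHERE user_id IS NULL"
).fetchone()[0]

# Count ids that appear more than once.
dup_ids = conn.execute(
    "SELECT COUNT(*) FROM (SELECT id FROM events GROUP BY id HAVING COUNT(*) > 1)"
).fetchone()[0]

print(null_users, dup_ids)  # 2 rows missing user_id, 1 duplicated id
```

In a production setting the same queries would typically run in Spark SQL or another distributed engine, with the resulting counts tracked over time as data-reliability metrics.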