Collaborate with and across Agile teams to design and develop data engineering solutions that rapidly deliver value to our customers.
Build distributed, low-latency, reliable data pipelines that ensure high availability and timely delivery of data.
Design and develop highly optimized data engineering solutions for Big Data workloads to efficiently handle continuous increases in data volume and complexity.
Build high-performing real-time data ingestion solutions for streaming workloads.
Adhere to best practices and agreed-upon design patterns across all Data Engineering solutions.
Ensure code is elegantly designed, efficiently written, and effectively tuned for performance.
Focus on data quality and consistency: implement processes and systems to monitor data quality, ensuring production data is always accurate and available to the key stakeholders and business processes that depend on it.
Create design documentation (Data Flow Diagrams, Technical Design Specs, Source-to-Target Mapping documents) and test documentation (unit and integration tests).
Perform the data analysis required to troubleshoot and help resolve data-related issues.
Focus on end-to-end automation of data engineering pipelines and data validations (audit and balance controls), eliminating the need for manual intervention.
Focus on data security and privacy by implementing proper access controls, key management, and encryption techniques.
Take a proactive approach to learning new technologies: stay on top of tech trends, experiment with new tools and technologies, and educate other team members.
Collaborate with analytics and business teams to improve the data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
Communicate clearly and effectively to technical and non-technical leadership.
Minimum Qualifications:
Education: Bachelor's degree in Computer Science, Computer Engineering, or a related field
Work Experience: 7+ years of experience architecting, designing, and building Data Engineering solutions and Data Platforms
Experience in building Data Warehouses/Data Platforms on Redshift/Snowflake
Extensive experience building real-time data processing solutions
Extensive experience building highly optimized data pipelines and data models for big data processing
Experience working with data acquisition and transformation tools such as Fivetran and dbt
Experience building highly optimized and efficient data engineering pipelines using Python, PySpark, and Snowpark
Experience working with distributed data processing frameworks such as Apache Hadoop, Apache Spark, or Apache Flink
Experience working with real-time stream processing using Apache Kafka, Amazon Kinesis, or Apache Flink
Experience working with various AWS services (S3, EC2, EMR, Lambda, RDS, DynamoDB, Redshift, Glue Data Catalog)
Expertise in advanced SQL programming and SQL performance tuning
Experience with version control tools such as GitHub or Bitbucket
Expert-level understanding of dimensional modeling techniques
Excellent communication, adaptability, and collaboration skills
Excellent analytical skills, strong attention to detail with emphasis on accuracy, consistency, and quality
Strong logical and problem-solving skills with critical thinking
Good to Have:
Experience designing and building applications using container and serverless technologies
Experience working with fully automated workflow scheduling and orchestration services such as Apache Airflow
Experience working with semi-structured data, unstructured data, and NoSQL databases
Experience with CI/CD using GitHub Actions or Jenkins
Experience designing and building APIs
Experience working with low-code, no-code platforms