5+ years of experience with object-oriented or functional scripting languages such as Python, Java, etc.
3+ years leading development of large-scale cloud-based services on platforms such as AWS, GCP, or Azure, including developing and operating cloud-based distributed systems.
Experience building and optimizing data pipelines, architectures, and data sets.
Experience building processes that support data transformation, data structures, metadata, dependency, and workload management.
Strong computer science fundamentals in data structures, algorithm design, problem solving, and complexity analysis.
Working knowledge of message queuing, stream processing, and highly scalable big data stores.
Software development experience with big data technologies such as Databricks, Hadoop, Hive, and Spark (PySpark).
Familiarity with distributed systems and computing at scale.
Advanced working experience with SQL and NoSQL databases is required.
Proficiency in data processing using technologies such as Spark Streaming and Spark SQL.
Expertise in developing big data pipelines using technologies such as Kafka and Storm.
Experience with large-scale data warehousing, mining, or analytics systems.
Ability to work with analysts to gather requirements and translate them into data engineering tasks.
Aptitude to independently learn new technologies.
Experience automating deployments with continuous integration and continuous delivery systems.
Experience with DevOps and automation using Terraform or similar products is preferred.