Experience with data extraction, cleaning, screening, exploration, and visualization of structured and unstructured datasets
Skilled in big data technologies such as Spark, Spark SQL, PySpark, HDFS (Hadoop), and MapReduce
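As a toy illustration of the MapReduce pattern named above, here is the classic word count expressed as map, shuffle, and reduce phases in plain Python. This is a sketch of the programming model only, not Hadoop's or Spark's actual API, and the sample input lines are invented:

```python
from collections import defaultdict
from functools import reduce

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line
    return [(word.lower(), 1) for line in lines for word in line.split()]

def shuffle_phase(pairs):
    # Shuffle: group emitted values by key (word)
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts collected for each word
    return {word: reduce(lambda a, b: a + b, counts)
            for word, counts in groups.items()}

# Hypothetical input data for illustration
lines = ["big data big pipelines", "data pipelines"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
# counts == {"big": 2, "data": 2, "pipelines": 2}
```

In a real Hadoop or Spark job the shuffle is performed by the framework across machines; the map and reduce functions are the only parts the developer writes.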
Strong exposure to data visualization tools such as Tableau, Power BI, and Looker
Experience with one or more major cloud platforms and services (Azure, AWS, GCP), including Databricks, Azure HDInsight, Azure Data Factory (ADF), Google Dataflow, Google Dataproc, or similar
Advanced working knowledge of SQL, with experience authoring queries against relational databases and working familiarity with a variety of database systems
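A minimal example of the kind of relational query authoring described above, using Python's built-in sqlite3 module so it runs anywhere; the `orders` table and its columns are invented for illustration:

```python
import sqlite3

# In-memory database with a small hypothetical orders table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 120.0), (2, "acme", 80.0), (3, "globex", 50.0)],
)

# Aggregate revenue per customer, largest total first
rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()
# rows == [("acme", 200.0), ("globex", 50.0)]
```

The same GROUP BY / ORDER BY pattern carries over directly to Spark SQL and to warehouse engines such as BigQuery or Synapse.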
Experience working with Apache Airflow
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets
Strong analytical skills related to working with unstructured datasets
Working knowledge of highly scalable ‘big data’ data stores
A successful history of manipulating, processing and extracting value from large, disconnected datasets
Experience building processes that support data transformation, data structures, metadata, dependency tracking, and workload management
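Dependency management in a data pipeline amounts to running steps in topological order. A minimal sketch using Python's standard-library graphlib follows; the step names (`extract`, `clean`, and so on) are illustrative placeholders, not a real workflow:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps mapped to their upstream dependencies:
# each step may run only after every step in its set has finished
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "transform": {"clean"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields the steps so that dependencies always come first
order = list(TopologicalSorter(dependencies).static_order())
# order == ["extract", "clean", "transform", "load", "report"]
```

Orchestrators such as Airflow implement the same idea at scale, adding scheduling, retries, and per-task workload isolation on top of the dependency graph.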