Assist with the creation of an optimal data pipeline architecture using AWS and Snowflake.
Construct new data integration workflows and work with application developers to re-engineer existing ones that are based in PHP, Perl, Python, and Oracle.
Assist in conceptualizing and developing the architecture that will enable real-time data streaming from a variety of sources, including ‘big data’ and semi-structured types (JSON, Parquet).
Work closely with internal data analysts and stakeholders to ensure data needs are being met in the data warehouse and any technical issues are resolved.
Ensure strict adherence to data governance and security standards, and advise on new measures.
Maintain and update documentation detailing progress and test results to keep project team informed.
Prepare data and configure access for analytic and data science tools such as Jaspersoft, Dataiku, Salesforce, and new products.
Emphasize automation in all areas of data collection, management, and analysis, with auditing for quality assurance.
Perform incremental and historical data extracts from Oracle.
Construct batch load workflows using an orchestration tool such as Apache Airflow, and leverage APIs.
Build data structures in Snowflake and create SQL scripts based on source-to-target logic defined in the data model.
Work with complex and large data sets, new integrations, and real-time streaming requirements.
Create data marts based on user/member requirements.
Knowledge of AWS (EC2, Lambda, S3, Data pipeline services) and Snowflake is highly preferred.
Oracle and Salesforce knowledge is preferred, along with any experience in the cybersecurity domain.
Advanced working knowledge of SQL for constructing DML/DDL based on source-to-target logic is essential.
Experience with Python for ELT construction, and with orchestration tools such as Apache Airflow or Luigi.
Experience with stream processing systems and big data tools (Spark, Kafka, Snowpipe, etc.).
Data model and Snowflake Data Warehouse.
DW integration with data analytic and data science tools.
Standardized ETL tools and procedures with workflow monitoring.
User and metadata management within AWS and Snowflake.
Fully optimized and automated ETL workflows.
Member/Customer Direct Data Access.
Embedded Machine Learning/AI.
Connectivity with SIEM, TIP, and all other data applications.