Implement and support end-to-end data lake, data warehouse, data mart, business intelligence, analytics, and data services solutions (ingest, storage, integration, processing, services, access) in AWS
Data lake intake/request/onboarding services and service documentation
Data lake ingestion services for batch/real-time data ingest and service documentation
Data processing services (ETL/ELT) for batch/real-time workloads (Glue/Kinesis/EMR) and service documentation
Data storage services for the data lake (S3), data warehouses (RDS/Redshift), and data marts, and service documentation
Data services layer including Athena, Redshift, RDS, microservices, and APIs
Pipeline orchestration services including Lambda, Step Functions, and MWAA (optional)
Data security services (IAM/KMS/Secrets Manager/encryption/anonymization/RBAC) and service documentation
Data access provisioning services (accounts, IAM roles, RBAC), processes, documentation, and education
Data provisioning services for data consumption patterns including microservices, APIs, and extracts
Metadata capture and catalog services for the data lake (S3/Athena), data warehouses (RDS/Redshift), and microservices/APIs
Metadata capture and catalog services for pipeline/log data for monitoring and support
Implement CI/CD pipelines
Prepare documentation for data projects utilizing the AWS-based enterprise data platform
Implement high-velocity streaming solutions using Amazon Kinesis, SQS, and SNS
Migrate data from traditional relational database systems to AWS relational databases such as Amazon RDS, Aurora, and Redshift
Migrate data from traditional file systems and NAS shares to AWS data lake (S3) and relational databases such as Amazon RDS, Aurora, and Redshift
Migrate data from APIs to AWS data lake (S3) and relational databases such as Amazon RDS, Aurora, and Redshift
Provide cost/spend monitoring and reporting for AWS-based data platform initiatives
Provide governance/audit reporting for access to the AWS-based data platform
Lead the implementation of a data lake strategy to enable LOBs and Corporate Functions with a robust, holistic view of data for data-driven decision making
Serve as delivery lead for EDP data initiatives alongside the product owner
Partner with the immediate engineering team, product owner, IT, and partners on the EDP agenda
Provide technology thought leadership, consulting, and coaching/mentoring
Establish development, QA, staging, and production migration/support processes
Establish best practices for development and support teams
Work with the scrum master to develop and own the backlog, stories, epics, and sprints
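The ingestion and ETL responsibilities above can be sketched with a minimal, hypothetical example (the record schema, field names, and bucket prefix are illustrative, not from an actual platform): a pure-Python transform that normalizes a raw record and derives the Hive-style S3 partition path a Glue or Lambda job might write to.

```python
from datetime import datetime, timezone


def build_partition_key(prefix: str, event_time: datetime) -> str:
    """Derive a Hive-style S3 partition path (year=/month=/day=) from an event timestamp."""
    return (
        f"{prefix}/year={event_time.year:04d}"
        f"/month={event_time.month:02d}"
        f"/day={event_time.day:02d}"
    )


def normalize_record(raw: dict) -> dict:
    """Normalize a raw ingest record: trim strings, parse the ISO-8601 event
    timestamp to UTC, and attach the target S3 partition prefix."""
    event_time = datetime.fromisoformat(raw["event_time"]).astimezone(timezone.utc)
    return {
        "id": raw["id"].strip(),
        "source": raw.get("source", "unknown").strip().lower(),
        "event_time": event_time.isoformat(),
        "s3_prefix": build_partition_key("curated/events", event_time),
    }


if __name__ == "__main__":
    raw = {"id": " 42 ", "source": " CRM ", "event_time": "2024-03-07T15:04:05+00:00"}
    print(normalize_record(raw)["s3_prefix"])  # curated/events/year=2024/month=03/day=07
```

Keeping the partition derivation in small pure functions like these makes the transform easy to unit-test outside the Glue/Lambda runtime.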
Qualifications
Bachelor's degree in Computer Science, Software Engineering, MIS, or an equivalent combination of education and experience
Experience implementing and supporting data lakes, data warehouses, and data applications on AWS for large enterprises
Programming experience with Java, Python/Scala, and shell scripting
Solid experience with AWS services such as CloudFormation, S3, Glue, EMR/Spark, RDS, Redshift, DynamoDB, Lambda, Step Functions, IAM, KMS, Secrets Manager, etc.
Solid experience implementing solutions on AWS-based data lakes
Experience implementing metadata solutions leveraging AWS non-relational data solutions such as ElastiCache and DynamoDB
AWS Solutions Architect or AWS Big Data Certification preferred
Experience in AWS data lake/data warehouse/business analytics
Experience and understanding of various core AWS services such as IAM, CloudFormation, EC2, S3, EMR/Spark, Glue, DataSync, CloudHealth, CloudWatch, Lambda, Athena, and Redshift
Experience in system analysis, design, development, and implementation of data ingestion pipelines in AWS
Experience with DevOps and Continuous Integration/Delivery (CI/CD) concepts and tools
Experience with business intelligence tools such as Tableau, Power BI or equivalent
Knowledge of ETL/ELT
Experience in production support from Level 1 to Level 3
Awareness of Data Management & Governance tools
Working experience with Hadoop, HDFS, Sqoop, Hive, Python, and Spark is desired
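The relational-to-data-lake migration work described in this posting can be sketched as a minimal batch extractor (the table name and schema are hypothetical; an in-memory SQLite database stands in for the source RDBMS, and CSV for the S3 landing format):

```python
import csv
import io
import sqlite3


def extract_in_batches(conn, table: str, batch_size: int = 2):
    """Stream rows from a source relational table in fixed-size batches,
    as a migration job would before landing the data in S3 or RDS."""
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY id")
    columns = [d[0] for d in cur.description]
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield columns, rows


def batch_to_csv(columns, rows) -> str:
    """Serialize one batch to CSV, a common landing format for S3 data lake ingest."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # In-memory SQLite stands in for the source database (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "Ann"), (2, "Bob"), (3, "Cara")])
    for columns, rows in extract_in_batches(conn, "customers", batch_size=2):
        print(batch_to_csv(columns, rows).strip())
```

Batching the extraction keeps memory flat for large tables and maps naturally onto multipart or per-file uploads to an S3 landing zone.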