Data Pipeline Development: Design, develop, and manage data pipelines that support the data requirements of AI and Generative AI workloads.
Workflow Creation: Build self-service onboarding workflows in data federation platforms, particularly AWS Athena, so teams can access and integrate data efficiently.
Schema Management: Own schema ingestion, metadata APIs (including table schema descriptions), and table registration services to strengthen data governance (a table registration sketch follows this list).
SQL Execution Layer Design: Design and implement a SQL execution layer on AWS Athena that optimizes query performance and preserves data integrity (see the query execution sketch below).
Access Controls: Implement table access controls, audit logging, and schema diffing to maintain security and compliance across our data assets (see the schema diff sketch below).
Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and ensure alignment with organizational goals.
Continuous Improvement: Identify opportunities for process enhancements and drive best practices in data engineering and management.
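
As a rough illustration of the table registration work described above, the sketch below registers an external table in the AWS Glue Data Catalog so it becomes queryable from Athena. The database name, table name, columns, region, and S3 location are hypothetical placeholders, not prescribed by this role; the actual catalog layout would follow the team's own conventions.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")  # region is an assumption

# Hypothetical database and table used purely for illustration.
glue.create_database(DatabaseInput={"Name": "analytics_onboarding"})

glue.create_table(
    DatabaseName="analytics_onboarding",
    TableInput={
        "Name": "customer_events",
        "Description": "Example external table registered for Athena access",
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {
            "Columns": [
                {"Name": "event_id", "Type": "string", "Comment": "Unique event identifier"},
                {"Name": "event_ts", "Type": "timestamp", "Comment": "Event time in UTC"},
                {"Name": "payload", "Type": "string", "Comment": "JSON-encoded event body"},
            ],
            "Location": "s3://example-bucket/customer_events/",  # placeholder bucket
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)
```

In a self-service onboarding workflow, a call like this would typically be wrapped behind a request form or API so that dataset owners can register tables without touching the catalog directly.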
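The SQL execution layer responsibility can be pictured with a minimal Athena client sketch: submit a query, poll until it completes, and read back the results. The workgroup, database, and output bucket shown are assumptions for illustration; a production execution layer would add retries, cost controls, audit logging, and result caching.

```python
import time

import boto3

athena = boto3.client("athena", region_name="us-east-1")  # region is an assumption

# Submit a query against the hypothetical table registered above.
response = athena.start_query_execution(
    QueryString="SELECT event_id, event_ts FROM customer_events LIMIT 10",
    QueryExecutionContext={"Database": "analytics_onboarding"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},  # placeholder
    WorkGroup="primary",
)
query_id = response["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

# Print the result rows if the query succeeded.
if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```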
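For the schema diffing mentioned under access controls and compliance, a minimal sketch might pull the registered column definitions from Glue and compare them against a proposed revision. The table name and the proposed change are invented for the example; a real implementation would also cover column ordering, partitions, and comments.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")  # region is an assumption


def table_columns(database: str, table: str) -> dict:
    """Return {column_name: type} for a table registered in the Glue Data Catalog."""
    response = glue.get_table(DatabaseName=database, Name=table)
    columns = response["Table"]["StorageDescriptor"]["Columns"]
    return {col["Name"]: col["Type"] for col in columns}


def diff_schemas(old: dict, new: dict) -> dict:
    """Report columns that were added, removed, or changed type between two schemas."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(name for name in set(old) & set(new) if old[name] != new[name]),
    }


# Example: compare the registered schema against a hypothetical proposed revision.
current = table_columns("analytics_onboarding", "customer_events")
proposed = {**current, "event_ts": "string", "session_id": "string"}
print(diff_schemas(current, proposed))
```

A diff like this can gate table changes in CI or feed an audit log, so schema drift is caught before it breaks downstream consumers.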