Lead and mentor other MLOps engineers, DevOps engineers, and data scientists
Document processes and best practices
Collaborate and plan work estimates closely with other teams
Lead, define and influence technical choices within interdisciplinary settings
Continuously improve self-service tools and processes to reduce cycle times for developers and automate repetitive and wasteful operations.
Program APIs and tooling to deploy and manage all aspects of the data science lifecycle.
Identify and remedy single points of failure and security risks.
Manage data systems, access rights, security boundaries and compute infrastructure.
Maintain components of the DevOps platform, including Kubernetes, GitLab CI/CD, and Terraform.
Interface with external logging, monitoring, and security vendors.
Maintain components of the MLOps platform, including JupyterLab, MLflow, and SageMaker.
Build out infrastructure code and provision resources in Terraform.
Assist in troubleshooting issues within production.
Maintain dependencies and remediate vulnerabilities using package tooling such as npm, pip3, and PyPI, including the creation of eggs and client libraries.
Work to scale throughput and extensibility of our services with model pipelines.
Minimum of five years of relevant general programming, data science and/or operations experience.
Minimum of one year of relevant Kubernetes administration experience in a production setting.
Minimum of one year of relevant Terraform experience in a production setting.
Knowledge or experience with Golang, Java, JavaScript, Python, and Bash.
Knowledge or experience in Linux-based infrastructures and Linux/Unix administration.
Knowledge or experience with databases such as MySQL, Elasticsearch, or Redis.
Knowledge or experience with project management and workflow tools, including Agile, Jira, and Scrum/Kanban.
Knowledge or experience with open-source technologies and cloud services.
Knowledge or experience in software development and infrastructure development.
Knowledge or experience with Amazon Web Services (AWS) cloud offerings and other cloud providers.
Knowledge or experience administering large Kubernetes clusters.
Knowledge or experience with infrastructure automation tools such as Terraform and CloudFormation.
Experience with security and compliance standards such as FAPI, or technologies like JWT and OAuth.
Knowledge or experience working with banking technologies.
Knowledge or experience working with containerization technologies, including Docker.
Knowledge or experience working with ML tools such as SageMaker, MLflow, PyTorch, and TensorFlow.