Design, architect, and implement cloud-based solutions on AWS, Azure, or GCP.
Apply infrastructure-as-code (IaC) principles, using Terraform to automate the provisioning, configuration, and management of cloud resources.
Develop and maintain cloud infrastructure diagrams and documentation.
Site Reliability Engineering (SRE):
Implement and maintain SRE best practices, including monitoring, alerting, logging, and automation.
Develop and implement automation scripts for routine tasks and incident response.
Participate in on-call rotations and provide timely support for production systems.
Analyze system performance and identify areas for improvement.
Security & Compliance:
Ensure the security and compliance of cloud environments by implementing appropriate security controls and best practices.
Stay up to date with the latest security threats and vulnerabilities.
Adhere to industry best practices and compliance standards (e.g., SOC 2, ISO 27001).
Collaboration & Communication:
Collaborate effectively with software engineers, DevOps engineers, and other stakeholders.
Clearly communicate technical concepts to both technical and non-technical audiences.
Participate in design reviews and code reviews.
Research & Innovation:
Research and evaluate new technologies and tools to improve the efficiency and reliability of cloud operations.
Stay current on industry trends in cloud computing, SRE, and DevOps.