- Experience: 6+ years as a Site Reliability Engineer, DevOps Engineer, or in a similar role, with hands-on experience in cloud-native infrastructure.
- Kubernetes Expertise: Strong expertise in managing and scaling Kubernetes clusters, including experience with Kubernetes networking, storage, and multi-cluster architectures.
- AWS Cloud Expertise: Proficiency with AWS services such as EC2, S3, EKS, RDS, VPC, Lambda, IAM, and CloudWatch, along with experience applying AWS best practices for scalability, security, and cost management.
- Infrastructure as Code (IaC): Hands-on experience with IaC tools such as Terraform, AWS CloudFormation, or Ansible for provisioning and managing cloud infrastructure.
- CI/CD Pipelines: Experience building and maintaining continuous integration and continuous deployment (CI/CD) pipelines using Jenkins, GitLab CI, or similar tools.
- Scripting & Automation: Proficiency in scripting languages such as Python, Bash, or Go to automate operational tasks and improve workflows.
- Monitoring & Logging: Experience with monitoring, logging, and alerting tools such as Prometheus, Grafana, CloudWatch, or the ELK stack.
- Troubleshooting & Incident Management: Ability to troubleshoot complex issues in distributed systems, conduct root cause analysis, and implement solutions to prevent recurrence.
- Collaboration Skills: Strong communication skills with the ability to work collaboratively with developers, operations, and product teams.