Design, implement, and manage robust CI/CD pipelines to automate code deployment and delivery processes.
Develop and maintain Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or Ansible.
Automate configuration management, application deployments, and orchestration using tools like Ansible, Chef, and Puppet.
Administer and optimize Linux-based environments (RHEL, CentOS, Ubuntu) for both on-premises and cloud infrastructures.
Manage containerization platforms (Docker) and orchestration tools (Kubernetes) to ensure scalability and reliability of applications.
Implement monitoring, logging, and alerting solutions using tools like Prometheus, Grafana, ELK Stack, or Splunk to maintain high availability and performance.
Troubleshoot and resolve infrastructure-related issues in development, test, and production environments.
Collaborate closely with development, QA, and IT teams to ensure seamless integration and deployment of applications.
Ensure systems comply with security standards and best practices; automate security patching and vulnerability remediation.
Mentor and guide junior DevOps engineers, fostering a culture of continuous learning and improvement.
Good-to-have skills
Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
5+ years of experience in DevOps, Site Reliability Engineering (SRE), or related roles.
Strong proficiency in Linux system administration, including troubleshooting, networking, and shell scripting.
Extensive experience with automation and configuration management tools such as Ansible, Chef, Puppet, or SaltStack.
Hands-on experience with cloud platforms (AWS, Azure, GCP) and cloud-native services.
Solid experience with containerization and orchestration tools (Docker, Kubernetes).
Expertise in CI/CD tools such as Jenkins, GitLab CI/CD, or CircleCI.
Proficiency in scripting and programming languages such as Bash, Python, or Go.
Experience with infrastructure monitoring, logging, and alerting tools (Prometheus, Grafana, ELK Stack, Splunk).
Strong understanding of networking concepts, security best practices, and Infrastructure as Code (IaC).
Excellent problem-solving skills, with a focus on automation, reliability, and scalability.
Strong communication skills and the ability to work collaboratively in a team environment.