Tagcor

Site Reliability Engineer

Formal training or certification in site reliability engineering concepts and 3+ years of applied experience

Proficient in site reliability culture and principles, and familiar with how to implement site reliability within an application or platform
Proficient in Java/Spring Boot
Proficient in software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence, Android)
Some experience with observability, such as white-box and black-box monitoring, service level objective (SLO) alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
Familiarity with containers and container orchestration technologies such as Docker, ECS, and Kubernetes
Familiarity with troubleshooting common application and networking issues