Bachelor's degree in Computer Science or a related field, with at least 5 years of relevant experience developing and maintaining large-scale distributed SRE platforms and tooling with automation.
Solid programming skills, with mastery of at least one language such as Go/Java/Python/Shell and the ability to deliver high-quality code; familiar with at least one web framework, such as Gin/Django/Spring, with a decent understanding of its design principles.
Hands-on experience in debugging, troubleshooting, and optimizing sophisticated distributed systems and platforms.
Deep understanding of operating systems (Linux, Windows) and networking (TCP/IP, HTTP, etc.), with good exposure to network, storage, and computer architecture.
Familiar with the architecture and working principles of Redis/MySQL/PostgreSQL databases, and with their daily operation and maintenance, including but not limited to building high-availability clusters, monitoring, backup, fault handling, and performance optimization.
Familiar with cloud-native frameworks, with experience in Kubernetes. Good experience with SRE tools such as Ansible, ELK, Prometheus, and Grafana.