5-10 years of experience in production support/SRE teams, with a continued focus on improving platform health.
Experience working in a microservice architecture.
Hands-on Java coding experience, with the ability to analyze and troubleshoot production issues by reading stack traces and exceptions.
Familiarity with Agile or other rapid application development practices.
Hands-on expertise in building monitoring dashboards and setting up alerts using Splunk.
Hands-on experience in writing Oracle SQL queries and MongoDB queries.
Experience with distributed (multi-tiered) systems, algorithms, and relational databases.
Must have working knowledge of APM tools such as Splunk, ELK, Grafana, and Prometheus.
Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka is a plus.
Working knowledge of CI/CD is a plus: source control such as Git/Bitbucket, and continuous integration and release tools such as Jenkins and UCD.
Ability to work with engineering teams across the ecosystem, such as Security, Networking, and Infrastructure, on challenges that can impact platform health and resiliency.
Shell scripting and DevOps tools such as Ansible, with good knowledge of YAML for writing playbooks.
Experience with distributed storage technologies such as NFS, as well as dynamic resource management frameworks such as PCF and Kubernetes/OpenShift.
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.