Responsible for the operations, monitoring, and management of the Splunk infrastructure and services
Investigate, diagnose, and remediate NOC incidents
Manage the NOC incident lifecycle in ServiceNow
Lead incident triage efforts in collaboration with development teams
Develop, enhance, and maintain the NOC playbooks
Responsible for the continuous improvement of application monitoring and process automation
Collect evidence for compliance audits
Assist in SOC investigations as needed
Proactive and self-motivated with a keen sense of ownership and accountability.
Overseeing and resolving infrastructure, application, and database issues in a large-scale AWS environment.
Technical excellence. Apply standard methodologies for continuous delivery, testing, and security.
Operational excellence. Make decisions based on data rather than assumptions. When an issue arises, you strive to be alerted before our customers notice.
Keeping calm and carrying on. Capable of troubleshooting product outages, skilled in identifying performance bottlenecks, spotting anomalous system behavior, and determining the root cause of incidents.
Commit to automation. Passionately embrace and master modern technologies to help automate routine tasks and free up time for innovation. You will be working with a variety of languages and tools used in systems programming and infrastructure automation, such as Go, Python, and Terraform.