Tagcor

Required Skills

Python, Java, Bash, Go, JavaScript

Work Authorization

H1B Work Permit

Preferred Employment

Contract to Hire

Employment Type

Consulting/Contract

Education Qualification

UG: Not Required
PG: Not Required

Other Information

No. of positions: 1
Posted: 22nd Dec 2023

JOB DETAIL

What you’ll do

As a Site Reliability Engineer - Lead, you will manage the uptime of complex systems across cloud-native (AWS, GCP) and hybrid architectures.
Create and maintain cloud infrastructure capacity plans using estimation models that meet the expected service level objectives of the system(s). Operate systems at an optimal cost while maintaining availability targets.
Design and build infrastructure as code (IaC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with cloud CLI, and programming with cloud SDK).
Build CI/CD pipelines for the build, test, and deployment of application and cloud architecture patterns, using platform (Jenkins) and cloud-native toolchains.
Lead DevSecOps operational practices and design solutions that improve the resilience of products/services.
Craft the communication narrative that influences product, engineering, security, Cloud CoE, and customers on reliability and uptime issues and improvements.
Collaborate with Business Units and other team members to set SLAs based on past requirements.
Collaborate and contribute to virtual teams in an open-source/inner-source engineering model.
Lead blameless availability postmortems and own the calls to action that prevent recurrences.
Build relationships and contribute technically alongside internal/vendor engineering teams.
Influence the product roadmaps for external services that systems/services are dependent upon.
Communicate effectively with technical peers and team members, both in writing and verbally.

What experience you need

BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent job experience required.
4+ years of experience developing and/or administering software in a public cloud (GCP or AWS).
7+ years of experience in languages such as Python, Java, Bash, Go, JavaScript and/or Node.js.
7+ years of system administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible, and/or containers (Docker, Kubernetes, etc.).
7+ years of demonstrable cross-functional knowledge with systems, storage, networking, security, and databases.
7+ years of experience monitoring infrastructure and application uptime and availability to ensure functional and performance objectives are met.
7+ years of proficiency with continuous integration and continuous delivery tooling and practices.
7+ years of expertise designing, analyzing, and troubleshooting large-scale distributed systems.

What could set you apart

You take a systems-oriented problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Experience managing infrastructure as code via tools such as Terraform or CloudFormation.
Passion for automation with a desire to eliminate toil whenever possible.
You’ve built software or maintained systems in a highly secure, regulated, or compliant industry.
Experience and passion for working within a DevOps culture and as part of a team.
Proficiency with continuous integration and continuous delivery tooling and practices.

Site Reliability Engineer