Required Skills

Site Reliability Engineer

Work Authorization

  • US Citizen

  • Green Card

Preferred Employment

  • Corp-Corp

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 25th Feb 2023

JOB DETAIL

1) The SRE is responsible for identifying the root causes of availability and performance issues in applications running in the AWS cloud environment, determining solutions, and implementing them.
2) AWS Native Services: 100% AWS cloud environment. MUST HAVE: EC2, CloudWatch, CloudTrail, Elastic Load Balancer, Auto Scaling, S3, Command Line Interface (CLI), Lambda
3) Performance Engineering & Performance Testing: Validating and testing existing pipelines and system performance using LoadRunner / JMeter. Focus on load balancing, fault tolerance, high availability, resiliency, and capacity.
4) Monitoring Tools: Splunk (create / manipulate dashboards), Datadog, Dynatrace, or similar tools
5) Java Development: Code changes to Java applications

Secondary Skills - Nice to Haves

Job Description

The SRE will serve as a champion of service availability, efficiency, load balancing / capacity management, automation, and monitoring. Ensure that both internal and external systems meet the performance needs of users. Enable increased feature velocity and continuous improvement. Identify gaps in the code from a non-functional viewpoint, assist other developers in fixing the code, and promote relevant reliability pattern implementations. Must be skilled in cloud technologies and cloud computing in an AWS environment.

Additional Skills & Qualifications 

Technologies & How They Are Used:
1) JMeter / LoadRunner - Performance Testing & Engineering, script enhancements. Execute load and stress tests, identify performance bottlenecks, and engage the AppDev team to fix issues.
2) Splunk / Dynatrace / Datadog, etc. - Design, build, and implement dashboards on application and infrastructure health. Monitor application performance.
3) CI/CD & Jenkins: Configure pipelines in Jenkins and trigger automated jobs in the pipeline using Shell scripts.
4) Terraform / CloudFormation: Use existing Terraform, Ansible, or CloudFormation templates / playbooks for infrastructure provisioning and configuration in AWS. DO NOT NEED TO BUILD FROM SCRATCH
5) Linux: Identify issues in applications and fix issues on Linux servers.
6) Docker / Microservices: Configure Docker images for microservices in AWS ECS and develop Groovy or Shell scripts. Upload Docker images to JFrog Artifactory. Configure applications in Kubernetes.
7) Java / J2EE Apps: Web and app server configuration, JVM parameter tuning, message brokers, JVM garbage collection analysis, and thread analysis.
8) Defining, measuring, and improving Reliability Metrics (SLO/SLI) - Meet with Product Owners to showcase deviations and their impact across applications.


SCREENING QUESTIONS: 


1) How many years of HANDS-ON public cloud experience do you have? Hands-on means keyboard-type work.
2) How would you set up CloudWatch and monitor logs / alerts? Walk me through this step by step.
3) How would you trigger Lambda functions?
4) What are AUTHENTICATION and AUTHORIZATION in AWS? How do you set them up hands-on? Walk me through this step by step.

 

Company Information