Required Skills

Site Reliability Engineer

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 10th Jul 2024

JOB DETAIL

1.       Automation and Scripting:

·       Develop and maintain scripts to automate tasks and processes related to performance, scalability, and resilience.

·       Implement automation solutions to streamline operational workflows and reduce manual intervention.

2.       Issue Triage and Resolution:

·       Triage and resolve issues affecting the platform's performance and stability.

·       Take ownership and accountability for the overall performance and reliability of the platform.

3.       Tracking and Management:

·       Create and manage Jira tickets to track and resolve issues efficiently.

·       Ensure timely updates and closure of tickets to maintain workflow transparency.

4.       System Monitoring and Health:

·       Monitor system health using SRE tools and proactively identify potential problems.

·       Utilize tools like Grafana, New Relic, and Kibana to monitor and analyze system performance metrics.

5.       Collaboration and Data Analysis:

·       Collaborate with various cross-functional teams to gather necessary data and insights for troubleshooting and optimization.

·       Build data reports using Python to provide actionable insights to stakeholders.

6.       Trend Monitoring and Operations:

·       Monitor trends in order processing and submission to ensure smooth operations.

·       Proactively address anomalies and issues to maintain high availability and reliability.

7.       End-to-End Support:

·       Provide end-to-end support to the business, ensuring high availability and reliability of the platform.

·       Communicate effectively with cross-functional teams to ensure seamless support and operations.

Skills Required:

·       Proficiency in using SRE tools such as Grafana, New Relic, and Kibana.

·       Strong scripting skills with experience in automation (e.g., Python, Shell scripting).

·       Experience in triaging and resolving performance and scalability issues for J2EE applications.

·       Ability to build and interpret data reports using Python.

·       Excellent problem-solving skills with a proactive approach to monitoring and maintenance.

·       Strong communication and collaboration skills to work effectively with cross-functional teams.

·       Preference for candidates with a background in DevOps practices and methodologies.

Qualifications:

·       Bachelor’s degree in Computer Science, Information Technology, or a related field.

·       10+ years of overall IT experience, with at least 5 years specifically in a Site Reliability Engineer or similar role.

·       Proven experience in maintaining and supporting high-availability production environments.

·       Familiarity with J2EE application architecture and performance tuning.

·       Strong analytical skills and attention to detail.

Work Environment:

·       This is a hybrid position that requires some work from the office.

·       Work may be required to support critical production incidents or project milestones.

·       Participation in an on-call rotation schedule to address critical production issues.

Company Information