Required Skills

SRE

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Post: 1st May 2025

JOB DETAIL

  • Lead the AMS card operations team toward SRE and automation efforts for the Fraud Authorization and Authentication area.
  • Create observability solutions by building AppDynamics single-pane-of-glass dashboards for both the Fraud Authorization and Authentication (FAA) area and the Collections Value Stream (CVS).
  • Build single-pane-of-glass dashboards in Datadog for the Collections Value Stream.
  • Migrate Collections Value Stream alerts from AppDynamics to Datadog.
  • Migrate dashboards from AppDynamics to Datadog and create monitors for the Recovery application in Datadog.
  • Create ServiceNow reports on incidents, RRTs, problem tickets, and deployments for tracking and leadership visibility across the FAA and CVS areas.
  • Create an aggregator framework for incident monitoring to reduce false positives and noise-generated incidents. This effort eliminated 1500 false-positive tickets.
  • Create automation for the IRIS health check and design a framework for automatically reporting database issues to the Recovery DBA team.
  • Create runbooks and postmortem reports for all alerts and RRTs.
  • Issue triage previously took 30-40 minutes; after the observability framework was implemented, the AIL MTTD improved by 85%, bringing issue detection down to 5-10 minutes.
  • Create an enterprise-level SRE roadmap, introduce SLI/SLO concepts at the component level, and introduce Gremlin for chaos engineering.
  • Serve as Scrum Master for SRE sprint planning, risk identification and mitigation, and capacity and velocity planning.
  • Create chaos engineering scenarios and assist the team in getting Gremlin agents installed on pre-prod servers.
  • Guide the team in implementing automation for critical processes such as SAS rule validation and deployment.

Company Information