US Citizen
Green Card
Corp-Corp
Consulting/Contract
UG: Not Required
PG: Not Required
No. of positions: 1
Post: 26th Aug 2022
· Work with teams across the organization to ensure the reliability of core services and to monitor capacity and performance.
· Run blameless postmortems and proactively identify potential outages, feeding both into iterative improvement.
· Work closely with development and operations teams to build highly available, cost effective systems with extremely high uptime metrics.
· Hands-on experience configuring and administering SCM (Git, SVN), build tools (CMake, Makefiles, Maven), Nexus, CI (Jenkins), and CD automation tools.
· Responsible for establishing end-to-end monitoring and alerting on all critical aspects to ensure SLAs are met and to get proactive notifications of possible issues for all systems.
· Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot issues.
· Participate in 24x7x365 on-call support for multiple core platforms globally. Using a “Follow the Sun” model, we expect working patterns to include on-call duty as well as weekend and holiday-season cover.
· Participate in release cycles of our offerings, deploying code to integration, staging and production environments, integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management
· Build automation: work with Agile development teams to ensure smooth promotion of code, configuration, and Docker images to production
· Oversee and adapt monitoring and alerting systems. Interact with automated monitoring and healing infrastructure to ensure healthy environments
· Develop automation to auto-correct or completely prevent issues in our solutions
· Perform software updates, peer code reviews, testing, and Common Vulnerabilities and Exposures (CVE) analysis; respond to security threats
· Identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions
· Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in our environment
· Identify potential process improvements across the entire engineering organization
· Define and drive architectural enhancements into the system to mitigate potential failure points
· Provide impact assessment and mitigation plan for changes going into the production environment
· Investigate root cause of severe and systemic outages, identify corrective actions
· As we transition to the public cloud (AWS or Google), develop new build and deployment patterns.
What experience you need:
· A minimum of 10 years of experience as a Developer/Lead/Architect.
· Bachelor's degree in Computer Science, Information Management, or another “STEM” major
· Experience with configuring, customizing, and extending monitoring tools (AppDynamics, Apica, Sensu, Grafana, Prometheus, Graphite, Splunk, Zabbix, Nagios, etc.)
· 10+ years’ experience with all stages of an agile software development lifecycle (CI/CD) supporting Java/JavaScript UI applications (e.g., AngularJS) and SaaS applications.
· 5 years of experience building Java EE applications using tools like Maven/Ant, Subversion, JIRA, Jenkins, Bitbucket, and Chef
· 8+ years’ experience with continuous integration tools (Jenkins, SonarQube, JIRA, Nexus, Confluence, Git/Bitbucket, Maven, Gradle; Rundeck is a plus)
· 3+ years’ experience with configuration management and automation (Ansible, Puppet, Chef, Salt)
· 3+ years’ experience deploying and managing infrastructure on public clouds (AWS, GCP, Azure, or Pivotal)
· 3+ years’ experience working with Kubernetes and related applications.
· Experience working with Nginx, Tomcat, HAProxy, Redis, Elasticsearch, MongoDB, RabbitMQ, Kafka, and ZooKeeper.
· 3+ years’ experience in Linux environments (CentOS).
· Knowledge of TCP/IP networking, load balancers, high-availability architecture, and zero-downtime production deployments. Comfortable with network troubleshooting (tcpdump, routing, proxies, firewalls, load balancers, etc.)
· Demonstrated ability to script around repeatable tasks (Go, Ruby, Python, Bash)
· Experience with large-scale cluster management systems (Mesos, Kubernetes)
· Experience with Docker-based containers is a plus
· Able to dive into any level of a modern internet service (schedulers, containers, Linux kernel, caching, object storage, distributed file systems, RDBMS, NoSQL, etc.)