Senior Site Reliability Engineer (SRE) - Monitoring
Required Skills
SRE
Work Authorization
US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit
Preferred Employment
Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire
Employment Type
Consulting/Contract
Education Qualification
UG: Not Required
PG: Not Required
Other Information
No. of positions: 1
Posted: 29th Nov 2024
JOB DETAIL
Monitoring & Observability: Leverage tools such as DataDog, Prometheus, and Grafana to track system performance and ensure the health of applications and infrastructure. Create and manage dashboards, and configure alerts.
CI/CD Pipelines Integration: Integrate DataDog with CI/CD pipelines to automate monitoring across environments and development cycles.
Cloud Integration: Manage AWS integrations with DataDog, ensuring seamless and unified monitoring across cloud platforms.
Container & Microservices Monitoring: Monitor containerized environments running on Kubernetes, integrating them with DataDog to collect performance and health metrics.
Scripting & Automation: Automate monitoring tasks and configure DataDog via Python scripts to streamline monitoring workflows.
Required Skills:
Hands-on experience with DataDog's monitoring, logging, and APM tools.
Cloud Platform Expertise: Experience working with AWS and integrating cloud services with DataDog.
Container Monitoring: Strong experience with Kubernetes and Docker in a microservices architecture, integrated with DataDog.
Scripting & Automation: Proficient in Python for task automation and platform configuration.
Experience in monitoring & observability across multiple cloud services and infrastructure.
Ability to configure DataDog agents, manage integrations, and ensure secure access with API keys or tokens.
Knowledge of user management and role-based access in DataDog.