Required Skills

Python

Work Authorization

  • US Citizen

  • Green Card

  • EAD (OPT/CPT/GC/H4)

  • H1B Work Permit

Preferred Employment

  • Corp-to-Corp

  • W2-Permanent

  • W2-Contract

  • Contract to Hire

Employment Type

  • Consulting/Contract

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 26th Feb 2025

Job Details

  1. Architecture & Design
    1. Design end-to-end Grafana solutions for metrics, logs, traces, and dashboards, ensuring scalability, security, and compliance.
    2. Architect integrations with Prometheus, Loki, Mimir, Tempo, and third-party tools (e.g., AWS CloudWatch, Datadog).
    3. Define best practices for Grafana deployment (self-managed vs. Grafana Cloud) and optimize data storage/retention strategies.
  2. SRE Leadership
    1. Implement SRE principles: SLAs/SLOs/SLIs, error budgets, and blameless post-mortems.
    2. Build automated monitoring/alerting systems to preemptively identify system bottlenecks and failures.
    3. Lead incident response, root cause analysis, and remediation for observability-related outages.
  3. Collaboration & Integration
    1. Partner with DevOps teams to embed Grafana into CI/CD pipelines and automate provisioning via IaC (Terraform, Ansible).
    2. Work with developers to instrument applications for observability (OpenTelemetry, custom exporters).
    3. Advise stakeholders on cost-effective monitoring strategies and resource optimization.
  4. Performance Optimization
    1. Tune Grafana dashboards, queries, and data sources for high-performance environments.
    2. Optimize PromQL and LogQL (Loki) queries and manage large-scale time-series databases (Mimir).
    3. Conduct capacity planning and disaster recovery testing for Grafana ecosystems.
  5. Governance & Security
    1. Ensure compliance with security policies (RBAC, SSO, encryption) and audit requirements.
    2. Monitor Grafana stack health, perform upgrades, and enforce version control.
  6. Mentorship & Innovation
    1. Mentor SRE/engineering teams on Grafana best practices and SRE culture.
    2. Stay ahead of Grafana and observability trends and pilot new tools (e.g., AI-driven anomaly detection).

Company Information