Site Reliability Engineer

Jobtome

Job Title: Site Reliability Engineer (SRE) / Infrastructure Operations (Mid-Level)

Role Overview

The role is responsible for managing day-to-day infrastructure operations, including monitoring and alerting, and for driving stability improvements across the environment.

Key Responsibilities

  • Monitor overall infrastructure health and system performance
  • Track key performance metrics such as CPU, memory, and disk utilization
  • Tune alerts to improve signal-to-noise ratio and reduce alert fatigue
  • Support disaster recovery (DR) rehearsals and readiness activities
  • Maintain and update runbooks, documentation, and operational reports

Required Experience

  • 4–6 years of experience in Site Reliability Engineering (SRE) or infrastructure operations
  • Hands-on experience with VMware environments
  • Experience with monitoring tools such as PRTG, Datadog, or similar platforms
  • Strong incident management experience, including response and resolution processes

Core Skills & Competencies

  • Solid understanding of infrastructure performance metrics (CPU, memory, disk, etc.)
  • Experience with alert tuning and optimization
  • Ability to proactively detect and troubleshoot performance issues
  • Strong incident management and operational response capabilities

Screening Signals

Look for candidates who:

  • Understand CPU Ready thresholds and their impact on performance
  • Have hands-on experience tuning alerts to reduce noise
  • Can proactively identify and resolve performance bottlenecks
  • Demonstrate strong incident management experience in production environments
Job posted 2 months ago
