Using Prometheus to alert on
Authored by Kevin Lyda
Dump the output of mdadm-monitor.sh into /var/lib/node_exporter/textfile_collector/mdadm-monitor.prom,
then add the following to your alerting rules:
# Mdadm monitor not running. Last TestMessage event was more than 1800 seconds ago.
ALERT MdadmMonitorNotRunning
  IF mdadm_monitor{event="TestMessage"} < (time() - 1800)
  FOR 80m
  ANNOTATIONS {
    summary = "Mdadm Monitor is not running on {{$labels.instance}}.",
    description = "Mdadm Monitor on {{$labels.instance}} has not reported a TestMessage event in the last 1800 seconds.",
  }
# Mdadm monitor sees a problem. An event has fired in the last 1800 seconds.
ALERT MdadmMonitorErrorDetected
  IF mdadm_monitor{event!="TestMessage"} > (time() - 1800)
  FOR 30m
  ANNOTATIONS {
    summary = "Mdadm Monitor has seen an event on {{$labels.instance}}.",
    description = "Mdadm Monitor has seen an event ({{$labels.event}}) on {{$labels.instance}}.",
  }
mdadm-monitor.sh 274 B
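The contents of the attached script are not reproduced here. As a rough illustration only, the rules above imply a metric named mdadm_monitor with an event label and a Unix timestamp as its value. The sketch below is a hypothetical stand-in, not the author's script: the function name mdadm_metric and the device label are assumptions, and it relies on mdadm calling the program given to --program with the event name and the array device as its first two arguments.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached mdadm-monitor.sh.
# Prints one sample in the Prometheus text format for an mdadm event.
#   $1: event name (e.g. TestMessage, Fail)
#   $2: md array device (the "device" label is an assumption)
#   $3: Unix timestamp; defaults to the current time
mdadm_metric() {
    event=$1
    device=$2
    stamp=${3:-$(date +%s)}
    printf 'mdadm_monitor{event="%s",device="%s"} %s\n' "$event" "$device" "$stamp"
}

# When run as a script with at least two arguments (as mdadm --program
# would do), emit a sample for that event; otherwise print nothing.
if [ "$#" -ge 2 ]; then
    mdadm_metric "$1" "$2"
fi
```

If something like this were used, its output would need to be collected into the .prom file named above, ideally by writing to a temporary file in the same directory and renaming it, so that node_exporter never reads a half-written file.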