Hi all,
reading
https://community.librenms.org/t/alert-rule-for-amount-of-devices-down/22701
and writing my own “Advanced Alerting Rules”, I would like to start a library thread for smart and useful advanced alerting rules that are worth sharing.
Please contribute and share your craziest SQL queries that actually solve tricky alerting requirements.
To get things started, here are the first two alert rules:
- Alert (for a device) if the device is down and 10 (adjustable) or more devices are down at the same time. The SQL statement can be refined to match only specific hostnames, e.g. those of a hypervisor cluster, a location, …
SELECT *
FROM devices
WHERE (devices.device_id = ?)
AND devices.status=0
AND ((SELECT count(hostname) from devices where status=0) >= 10);
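As a rough sketch of such a refinement, the variant below counts only down devices whose hostname matches a pattern. The prefix `hv-cluster1-` is a made-up example and the alias `peers` is only illustrative; swap in whatever naming scheme your hosts follow.

```sql
/* Sketch: alert only if 10 or more devices of one group are down. */
/* 'hv-cluster1-%' is a hypothetical hostname pattern, adjust it to your setup. */
SELECT *
FROM devices
WHERE devices.device_id = ?
  AND devices.status = 0
  AND devices.hostname LIKE 'hv-cluster1-%'
  AND (
    SELECT COUNT(*)
    FROM devices AS peers
    WHERE peers.status = 0
      AND peers.hostname LIKE 'hv-cluster1-%'
  ) >= 10;
```

The outer `LIKE` keeps the rule from firing for devices outside the group, while the inner `LIKE` makes sure only group members count towards the threshold.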
- Alert (for a device) named $FOO-CPE-A if the corresponding partner (i.e. $FOO-CPE-B, but not $BAR-CPE-B) is also down.
Useful for checking that at least 1 of 2 devices delivering a service is online.
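SELECT *
FROM devices AS d1
WHERE d1.device_id = ?
AND d1.status = 0
AND (
SELECT COUNT(*)
FROM devices AS d2
/* '-CPE-A' and '-CPE-B' are 6 characters long, so strip 6 to compare the exact $FOO prefix */
JOIN devices AS d3 ON LEFT(d2.hostname, LENGTH(d2.hostname) - 6) = LEFT(d3.hostname, LENGTH(d3.hostname) - 6)
/* tie the pair to the device being evaluated, otherwise any down pair would trigger the alert */
WHERE d2.device_id = d1.device_id
AND d2.hostname LIKE '%-CPE-A' AND d2.status = 0
AND d3.hostname LIKE '%-CPE-B' AND d3.status = 0
) > 0;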
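If the rule should fire for whichever member of the pair is being evaluated (A or B), a symmetric variant might look like the sketch below. It assumes hostnames are unique and that every pair shares the exact same prefix in front of the six-character suffix `-CPE-A` / `-CPE-B`; I have not tested it against every naming scheme.

```sql
/* Sketch: alert for a down CPE-A or CPE-B device whose partner is down as well. */
SELECT *
FROM devices AS d1
WHERE d1.device_id = ?
  AND d1.status = 0
  AND (d1.hostname LIKE '%-CPE-A' OR d1.hostname LIKE '%-CPE-B')
  AND EXISTS (
    SELECT 1
    FROM devices AS d2
    WHERE d2.status = 0
      /* the partner is a different device ... */
      AND d2.device_id <> d1.device_id
      AND (d2.hostname LIKE '%-CPE-A' OR d2.hostname LIKE '%-CPE-B')
      /* ... with the same prefix once the 6-character suffix is stripped */
      AND LEFT(d2.hostname, CHAR_LENGTH(d2.hostname) - 6)
        = LEFT(d1.hostname, CHAR_LENGTH(d1.hostname) - 6)
  );
```

`EXISTS` stops at the first matching partner, so there is no need to count rows, and `CHAR_LENGTH` counts characters rather than bytes in case a hostname contains non-ASCII characters.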
Happy alerting!