We are running LibreNMS to monitor a network with a lot of sites, each with switches and routers. The WAN links to the routers can be flaky, so we get quite a few false positives: a device goes down, an alert is sent, then the device recovers and a recovery is issued soon after.
We mostly use email as the transport and we receive a lot of device down and recovery emails. This can lead to mistakes, as the amount of noise means an operator doesn’t notice when a device never issued a recovery email.
We would like to implement something where, if a device has neither recovered nor been acknowledged, a higher priority alert is issued so the problem can be escalated. My thinking was to send the alerts to something like Alerta so that they can be correlated and the priority raised if the state has continued.
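To make the idea concrete, here is a rough sketch of the escalation rule I have in mind. It is plain Python, not LibreNMS or Alerta code: the `Alert` fields, the 15 minute threshold, the severity names and the `escalate_if_stale` helper are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold: how long an alert may stay open and unacknowledged.
ESCALATE_AFTER = timedelta(minutes=15)


@dataclass
class Alert:
    """Hypothetical alert record; not the LibreNMS or Alerta schema."""
    device: str
    raised_at: datetime
    recovered_at: Optional[datetime] = None
    acknowledged: bool = False
    severity: str = "warning"


def escalate_if_stale(alert: Alert, now: datetime) -> Alert:
    """Raise severity when the alert is still open, unacknowledged and old."""
    still_down = alert.recovered_at is None
    stale = now - alert.raised_at >= ESCALATE_AFTER
    if still_down and not alert.acknowledged and stale:
        alert.severity = "critical"
    return alert


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    # A flapping WAN link that recovered after two minutes.
    flap = Alert("site-a-router", start, recovered_at=start + timedelta(minutes=2))
    # A device that never came back and that nobody acknowledged.
    stuck = Alert("site-b-switch", start)
    for a in (flap, stuck):
        escalate_if_stale(a, now=start + timedelta(minutes=20))
        print(a.device, a.severity)
```

With this rule the short flap stays at its original severity, while the device that is still down and unacknowledged after the threshold is the only one that gets raised.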
I found the following when researching whether this has already been implemented.
Specifically the Routes, and the monitoring of the alerts themselves to see if things have been acknowledged. The GitHub issue shows that some of this functionality was being tested.
I have also seen that laf merged the code into alerts.inc.php
The function that looks interesting is RunFollowUp
- Run Follow-Up alerts
This appears to keep track of alerts and work out whether they have got better or worse, which we could then use for escalation purposes.
However, I can’t see this function used anywhere else, so I have a question:
This appears to be in the code, but I’m not sure how we go about invoking it.
Can this be done from the Alert Rule builder, a template, or some sort of macro? Or do we need to take this further with PHP?