Hi, I have some alerts set up for high CPU. My colleague asked if I could make it so that we only get an alert if the high CPU continues for a certain length of time. Servers spike here and there, so we don’t want to receive frivolous alerts. We only want to know if the high CPU is sustained.
Is there a way to do that?
[OK] Composer Version: 1.7.2
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
In the rule, the delay field is what you are looking for. Mouse over it to see the hint.
A few more questions then…
If I set my delay to 30 min and my interval to 10 min, does that mean that the server would need to have 30 minutes of high cpu before sending out the first alert and then librenms will recheck if it needs to send out another alert every 10 minutes after?
Also, what if the server is only doing four or five 10-second spikes in CPU? At the end of the 30 minutes, if the server was doing another 10-second CPU spike, librenms would send out an alert, but that would not necessarily mean that the server had 30 minutes straight of high CPU. Is that correct?
Last, does this mean librenms is only actually checking every 30 minutes for the first alert?
Polling happens every 5 minutes. Alerts are checked at that time. If it clears before the 30 minutes, it is silently cleared and the timer is reset.
In other words, you would only get notified if the alert is active for about 6 device polls.
For interval, after the initial 30 minutes, you will get a notification every 10 minutes as long as the alert is active for every consecutive poll.
Also, most devices represent an average cpu usage, so spikes won’t trigger it.
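To make the timing above concrete, here is a minimal sketch in Python. It is not LibreNMS code, only a simulation of the behaviour described in this reply, assuming a 5 minute poll, a 30 minute delay and a 10 minute interval. The function name `notification_minutes` and its parameters are made up for illustration.

```python
def notification_minutes(matches, poll=5, delay=30, interval=10):
    """Return the minutes at which a notification would be sent.

    matches: one boolean per poll, True if the rule matched at that poll.
    This models the behaviour described in the thread, not LibreNMS internals.
    """
    sent = []
    active_since = None  # minute of the first poll in the current active run
    last_sent = None     # minute of the last notification in this run

    for i, matched in enumerate(matches):
        now = i * poll
        if not matched:
            # Alert cleared: drop it silently and reset the timer.
            active_since = None
            last_sent = None
            continue
        if active_since is None:
            active_since = now
        if last_sent is None:
            # First notification only after the alert has been active for `delay`.
            if now - active_since >= delay:
                sent.append(now)
                last_sent = now
        elif now - last_sent >= interval:
            # Repeat notifications every `interval` while still active.
            sent.append(now)
            last_sent = now
    return sent


# Rule matches at every poll for 50 minutes.
print(notification_minutes([True] * 11))
# Rule clears once at minute 25, so the timer starts over.
print(notification_minutes([True] * 5 + [False] + [True] * 3))
```

Under this model the first call gives notifications at minutes 30, 40 and 50, and the second gives none, because the single cleared poll resets the timer before the delay is reached.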
So does it somehow keep track of how long the server had high cpu for?
What if the server only had high cpu for 5min but the poller caught it during the last 5 min of the 30 minute delay? Would librenms still alert or would it realize that the high cpu was only for the last poll in the 30 minute window?
I changed my settings and received an alert. When I looked at the graph, there was not high cpu for 30 min, just within the last 5 minutes only.
Basically it seems like the delay is working.
Has anyone made this delay work for cpu that is not just a delay but actually only alerts if the server has had high cpu for 30 min straight?
That is exactly what it does.
I think that Martinetwork would like to avoid seeing the alert in the Alert page of LibreNMS at all. And this is not possible as far as I know. The alert will appear in LibreNMS but would not trigger an email as long as the delay of 30 minutes is not over.
On the other hand, if the admin watches the console in real time, there is no way to actually know if it is a real alert (meaning active for > 30 mins) or not, unless you open the alert and read the timestamp and duration params.
It does not seem to be doing so… Please see images showing my alert configuration is for 30 min. This alert came in at 12:45PM but if you look at the CPU graph it does not show 30 min of high CPU.
FYI, the delay only applies to notifications. The WebUI always shows all active alerts.