When I acknowledge an alert, it stops alerting/notifying for a short period of time, but then starts alerting again with no recovery. Is there something that causes acknowledgements to expire?
Thanks! Any help/hints much appreciated.
Validate output (fping6 fails because I have v6 disabled):
/opt/librenms/validate.php
====================================
Component | Version
--------- | -------
LibreNMS | 1.33-124-gde35e6e
DB Schema | 215
PHP | 7.0.22-0ubuntu0.16.04.1
MySQL | 10.0.31-MariaDB-0ubuntu0.16.04.2
RRDTool | 1.5.5
SNMP | NET-SNMP 5.7.3
====================================
[OK] Database connection successful
[OK] Database schema correct
sh: 1: 1: not found
[FAIL] fping6 could not be executed. fping6 must have CAP_NET_RAW capability (getcap) or suid. Selinux exlusions may be required.
(::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1 is unreachable)
[FAIL] fping6 should have CAP_NET_RAW!
[FIX] setcap cap_net_raw+ep /usr/bin/fping6
Host logs:
2017-11-17 18:01:04 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:56:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:51:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:46:04 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:41:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:23:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:21:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:16:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:06:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
If the alert un-mutes itself, it most likely means that the alert cleared and then triggered again. Please review the eventlog, as it will tell you there.
I've posted my event log and I don't see any signs of the alert being "cleared". Or maybe I'm not understanding.
Also, because the alert is not clearing, when the poller comes back around it sees that the alert rule is still matching
Sorry, I don't understand this. You're saying that it will alert despite the acknowledgement because the alert rule still matches?
How might I prevent this from happening? The behavior I'm expecting is that an acknowledged alert will stop notifying once it has been acknowledged unless the state changes. Is this expectation wrong?
When you acknowledge the alert, all it does is mute the check until the "interval" expires. You won't see a clear or "recovery" because technically the alert is still matching the device.
Check your rule's "Interval".
It would also help if you could post a screenshot of the alert rule.
OK, so it only mutes it for the time specified in "interval"? It sounds like you're saying it should alert again after 5 minutes. It seems like I get a random amount of time after acknowledging before it un-mutes: could be hours, could be 20 minutes.
Also, are you okay with the -1? That causes an unlimited number of alerts to trigger. You could change it to 1, so it will only send out one alert, and then if the rule recovers you get the recovery.
I actually want it to keep alerting until someone acknowledges it. I guess I can put a limitation on it if acknowledgements aren't going to stick, but I'd rather not.
Here's some more of the event log so you can see the random gaps between acknowledgement and the next alert:
(Thanks again Kevin!)
2017-11-17 18:01:04 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:56:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:51:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:46:04 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:41:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:23:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:21:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:16:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:06:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:01:09 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:56:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:51:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:46:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:41:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:36:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:31:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:26:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:21:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:16:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:14:01 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:06:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:01:06 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:56:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:51:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:46:03 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:41:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:36:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:31:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:26:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 04:21:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 20:47:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 18:24:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 18:21:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 18:16:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 18:11:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:57:01 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:56:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:51:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:46:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:41:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 17:36:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-16 08:04:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
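The gaps between each acknowledgment and the next alert can be measured directly from a pasted log like the one above. The following is only a rough sketch: it assumes each line has the `timestamp System message` layout shown in the paste, the helper names `parse` and `ack_gaps` are made up for illustration, and it does not use any LibreNMS API.

```python
from datetime import datetime

# A few lines copied from the event log above (newest first, as pasted).
LOG = """\
2017-11-17 17:41:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:23:02 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:21:01 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:14:01 System Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:11:02 System Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
"""


def parse(log):
    """Return (timestamp, kind) tuples sorted oldest first."""
    events = []
    for line in log.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first 19 characters are the timestamp, the rest is the message.
        stamp, message = line[:19], line[20:]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if "Issued acknowledgment" in message:
            kind = "ack"
        elif " alert for rule" in message:
            kind = "alert"
        else:
            kind = "other"
        events.append((when, kind))
    return sorted(events)


def ack_gaps(events):
    """Pair each acknowledgment with the first alert that follows it."""
    gaps = []
    last_ack = None
    for when, kind in events:
        if kind == "ack":
            # With back-to-back acknowledgments, the most recent one wins.
            last_ack = when
        elif kind == "alert" and last_ack is not None:
            gaps.append((last_ack, when - last_ack))
            last_ack = None
    return gaps


for acked_at, gap in ack_gaps(parse(LOG)):
    print(acked_at, "->", gap)
```

For the sample lines this reports one gap of just under 11 hours and one of 18 minutes, which matches the "could be hours, could be 20 minutes" observation; neither is close to the 5-minute spacing of the alerts themselves.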
The alert triggered against a router that was down. The admin went and acknowledged the alert, adding a note. However, even after the alert is acknowledged, the ACK button goes back from blue to red and an alert is then issued to the mail transport. It has done this a number of times.
2018-05-08 14:36:01
System
Issued acknowledgment for rule 'Devices up/down' to transport 'mail'
2018-05-08 14:35:49
System
ldadmin acknowledged alert Devices up/down
2018-05-08 14:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 13:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 12:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 11:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 10:56:02
System
Issued acknowledgment for rule 'Devices up/down' to transport 'mail'
2018-05-08 10:55:04
System
ldadmin acknowledged alert Devices up/down
2018-05-08 10:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 09:49:01
System
Issued acknowledgment for rule 'Devices up/down' to transport 'mail'
2018-05-08 09:48:46
System
ldadmin acknowledged alert Devices up/down
2018-05-08 09:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
2018-05-08 08:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
You mean it goes back to unacked after a bit? If so, this is most likely because that alert is clearing. Check the eventlog after the initial ack to see if it says the device is back up and then down again.
As you can see, I have added the "Eventlog entries" for the device from the logs tab. Recovery alerts are turned on, but no recovery alert email is sent for the device. It simply goes back to un-acknowledged in the GUI and then starts to issue alerts through the transport after the 1h delay has expired.
Unless you mean I should check a different log apart from Eventlog entries on the device page?
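The check suggested above (looking for anything other than alerts and acknowledgments in the eventlog) can be done mechanically on a paste like the one in this thread. This is a minimal sketch under stated assumptions: entries are taken to be three lines each (timestamp, source, message), as in the paste, and the function names `records` and `other_events` are hypothetical. The wording of a real device up/down event is not shown in this thread, so the sketch simply reports every entry that is neither an issued notification nor a user acknowledgment.

```python
# Three entries copied from the eventlog paste above.
SAMPLE = """\
2018-05-08 10:56:02
System
Issued acknowledgment for rule 'Devices up/down' to transport 'mail'
2018-05-08 10:55:04
System
ldadmin acknowledged alert Devices up/down
2018-05-08 10:31:01
System
Issued critical alert for rule 'Devices up/down' to transport 'mail'
"""


def records(paste):
    """Group non-empty lines into (timestamp, source, message) tuples."""
    lines = [line.strip() for line in paste.splitlines() if line.strip()]
    # Any trailing incomplete record (fewer than three lines) is ignored.
    return [tuple(lines[i:i + 3]) for i in range(0, len(lines) - 2, 3)]


def other_events(recs):
    """Return entries that are not alert/ack notifications or user acks."""
    found = []
    for stamp, _source, message in recs:
        if message.startswith("Issued "):
            continue  # alert or acknowledgment sent to a transport
        if "acknowledged alert" in message:
            continue  # a user clicked ACK
        found.append((stamp, message))
    return found


leftover = other_events(records(SAMPLE))
if leftover:
    for stamp, message in leftover:
        print(stamp, message)
else:
    print("no events other than alerts and acknowledgments")
```

An empty result on the full paste would support the observation that no recovery or state change was logged between the acknowledgment and the next alert.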