Acknowledgements disappear after a short time

When I acknowledge an alert, it stops alerting/notifying for a short period of time, but then starts alerting again with no recovery. Is there something that causes acknowledgements to expire?

Thanks! Any help/hints much appreciated.

Validate output (fping6 fails because I have IPv6 disabled):

/opt/librenms/validate.php
====================================
Component | Version
--------- | -------
LibreNMS  | 1.33-124-gde35e6e
DB Schema | 215
PHP       | 7.0.22-0ubuntu0.16.04.1
MySQL     | 10.0.31-MariaDB-0ubuntu0.16.04.2
RRDTool   | 1.5.5
SNMP      | NET-SNMP 5.7.3
====================================

[OK]    Database connection successful
[OK]    Database schema correct
sh: 1: 1: not found
[FAIL]  fping6 could not be executed. fping6 must have CAP_NET_RAW capability (getcap) or suid. Selinux exlusions may be required.
 (::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1: error while sending ping: Cannot assign requested address ::1 is unreachable)
[FAIL]  fping6 should have CAP_NET_RAW!
        [FIX] setcap cap_net_raw+ep /usr/bin/fping6

Host logs:

  2017-11-17 18:01:04	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:56:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:51:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:46:04	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:41:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:23:02	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:21:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:16:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:11:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:06:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'

See here: https://docs.librenms.org/#Support/FAQ/#why-would-alert-un-mute-itself

Also, because the alert is not clearing, when the poller comes back around it sees that the alert rule is still matching.

Thanks Kevin. From the FAQ you cited:

If alert un-mutes itself then it most likely means that the alert cleared and is then triggered again. Please review eventlog as it will tell you in there.

I’ve posted my event log and I don’t see any signs of the alert being “cleared”. Or maybe I’m not understanding.

Also, because the alert is not clearing, when the poller comes back around it sees that the alert rule is still matching.

Sorry, I don’t understand this. Are you saying that it will alert despite the acknowledgement because the alert rule still matches?

How might I prevent this from happening? The behavior I’m expecting is that an acknowledged alert will stop notifying once it has been acknowledged unless the state changes - is this expectation wrong?

When you acknowledge the alert, all it does is mute the check until the “interval” expires. You won’t see a clear or “recovery” because technically the alert is still matching the device.

Check your rule’s “Interval”.

It would also help if you could post a screenshot of the alert rule.

OK, so it only mutes it for the time specified in “interval”? It sounds like you’re saying it should alert again after 5 minutes. It seems like I get a random amount of time after acknowledging before it un-mutes - could be hours, could be 20 minutes.

OK, I see what you are saying: you mute the rule, but it un-mutes at random times?

Also, are you okay with the -1? That causes an unlimited number of alerts to trigger. You could change it to 1, so it will only send out one alert, and then if the rule recovers you get the recovery.

I actually want it to keep alerting until someone acknowledges it. I guess I can put a limitation on it if acknowledgements aren’t going to stick, but I’d rather not.

Here’s some more of the event log so you can see the random gaps between acknowledgement and the next alert:

(Thanks again Kevin!)

  2017-11-17 18:01:04	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:56:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:51:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:46:04	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:41:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:23:02	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:21:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:16:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:11:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:06:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 17:01:09	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:56:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:51:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:46:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:41:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:36:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:31:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:26:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:21:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:16:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 16:11:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 05:14:01	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 05:11:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 05:06:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 05:01:06	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:56:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:51:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:46:03	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:41:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:36:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:31:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:26:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-17 04:21:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 20:47:02	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 18:24:02	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 18:21:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 18:16:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 18:11:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:57:01	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:56:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:51:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:46:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:41:01	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 17:36:02	System	Issued warning alert for rule 'Memory over 85%' to transport 'pushover'
  2017-11-16 08:04:02	System	Issued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
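
To make the gaps easier to compare, here is a minimal Python sketch that measures the time between each acknowledgment and the next alert in a pasted event log. It is not part of LibreNMS. The function name `ack_to_alert_gaps` and the `SAMPLE_LOG` variable are made up for illustration, and the parsing assumes the line layout shown above (timestamp first, then the message).

```python
from datetime import datetime


def ack_to_alert_gaps(log_text):
    """Return (ack_time, next_alert_time, gap) for each acknowledgment
    that is later followed by an alert."""
    events = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: every line starts with "YYYY-MM-DD HH:MM:SS".
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        message = line[19:]
        if "Issued acknowledgment" in message:
            events.append((stamp, "ack"))
        elif "Issued" in message and "alert for rule" in message:
            events.append((stamp, "alert"))

    events.sort()  # the eventlog is pasted newest first; walk it oldest first
    gaps = []
    last_ack = None
    for stamp, kind in events:
        if kind == "ack":
            last_ack = stamp
        elif last_ack is not None:
            gaps.append((last_ack, stamp, stamp - last_ack))
            last_ack = None  # only the first alert after an ack counts
    return gaps


# A few lines copied from the event log above.
SAMPLE_LOG = """
2017-11-17 17:41:02\tSystem\tIssued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:23:02\tSystem\tIssued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 17:21:01\tSystem\tIssued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 16:11:02\tSystem\tIssued warning alert for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:14:01\tSystem\tIssued acknowledgment for rule 'Memory over 85%' to transport 'pushover'
2017-11-17 05:11:02\tSystem\tIssued warning alert for rule 'Memory over 85%' to transport 'pushover'
"""

for ack, alert, gap in ack_to_alert_gaps(SAMPLE_LOG):
    print(f"ack {ack} -> next alert {alert} (gap {gap})")
```

On these sample lines the two gaps differ widely (about 11 hours versus 18 minutes), which is the irregularity described above.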

I currently have 4 alerts acknowledged in my home install and they’ve been like that for a while now. I’ve never seen them un-ack and alert again.

We’d need a reproducible scenario to really look into this further.


Hi All,

We are facing this issue. We have the rule set up as follows:

The alert triggered against a router that was down. The admin acknowledged the alert and added a note. However, even after the alert is acknowledged, the ACK button goes back from blue to red and an alert is then sent via the mail transport. This has happened a number of times.

2018-05-08 14:36:01 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 14:35:49 System ldadmin acknowledged alert Devices up/down
2018-05-08 14:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 13:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 12:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 11:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 10:56:02 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 10:55:04 System ldadmin acknowledged alert Devices up/down
2018-05-08 10:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 09:49:01 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 09:48:46 System ldadmin acknowledged alert Devices up/down
2018-05-08 09:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 08:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’

The alert is currently acknowledged (kyn-r1).

The other two alerts have stayed acknowledged.

Any help?

You mean it goes back to unacked after a bit? If so, this is most likely because that alert is clearing. Check the eventlog after the initial ack to see if it says the device is back up and then down again.
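
One way to do that check on a pasted eventlog is sketched below in Python. It lists whatever was logged between each "Issued acknowledgment" entry and the next alert, so an up/down or recovery entry would stand out. This is only an illustration, not a LibreNMS tool: the function name `entries_between_ack_and_alert` and the `SAMPLE_LOG` variable are hypothetical, and the parsing assumes each line starts with a timestamp as in the log posted above.

```python
from datetime import datetime


def entries_between_ack_and_alert(log_text):
    """Map each 'Issued acknowledgment' time to the entries logged after it
    and before the next alert (excluding alerts and acks themselves)."""
    rows = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: every line starts with "YYYY-MM-DD HH:MM:SS".
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        rows.append((stamp, line[19:].strip()))

    rows.sort(key=lambda row: row[0])  # oldest first
    found = {}
    current_ack = None
    for stamp, message in rows:
        if "Issued acknowledgment" in message:
            current_ack = stamp
            found[stamp] = []
        elif "Issued" in message and " alert for rule" in message:
            current_ack = None  # alerting resumed; stop collecting
        elif current_ack is not None:
            found[current_ack].append((stamp, message))
    return found


# Lines copied from the eventlog posted above.
SAMPLE_LOG = """
2018-05-08 14:36:01 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 14:35:49 System ldadmin acknowledged alert Devices up/down
2018-05-08 14:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 11:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 10:56:02 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 10:55:04 System ldadmin acknowledged alert Devices up/down
2018-05-08 10:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 09:49:01 System Issued acknowledgment for rule ‘Devices up/down’ to transport ‘mail’
2018-05-08 09:48:46 System ldadmin acknowledged alert Devices up/down
2018-05-08 09:31:01 System Issued critical alert for rule ‘Devices up/down’ to transport ‘mail’
"""

for ack_time, entries in entries_between_ack_and_alert(SAMPLE_LOG).items():
    print(ack_time, "->", entries if entries else "nothing logged before the next alert")
```

An empty list for an acknowledgment means the log shows no clear or state change between that ack and the alert that followed it.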

Hi Laf,

As you can see, I have added the ‘Eventlog entries’ for the device from the logs tab. Recovery alerts are turned on, but no recovery alert email was sent for the device. It simply goes back to un-acknowledged in the GUI and then starts to issue alerts through the transport after the 1h delay has expired.

Unless you mean I should check a different log apart from Eventlog entries on the device page?

Regards,

Duncan

No, that’s the correct eventlog.

I have stuff marked as ACK’d and they’ve never un ACK’d themselves. Not really sure what to suggest.