I am fed up with the false alerts

I am new to LibreNMS. I am fed up with the false alerts. I have gone through all the previous threads related to false alerts but did not find a solution. Can someone help me with this?

It would help us if you explained what kind of alerts you are getting and gave more details about your alert setup.

I am getting a “device is down” alert, but when I check, the device shows UP and is working fine.

Okay, I personally don't use that rule. I use the rule where the device is down and the reason is ping.
Why? No big reason, it just feels like ping is the most reliable.

A couple of things to consider, assuming everything is at its default state: what is your polling interval, and are you over-polling your devices (are they able to finish)? The reasoning is that if you have a 5m polling interval (the default) and your alert rule is set to a 1m delay, your devices might raise false device-down alerts because they aren’t through the polling cycle yet, and there won’t be another update to tell you the alert has cleared for as many minutes as your polling interval.
One idea to cut down on false alerts is to set your delay to at least 2 polling periods. An easy fix would be 1m polling with a 2m delay on device downs. Another way, as @Pizzahjul said, is to give the rule the reason icmp. If you want to rely on ICMP and this alarm is critical, I highly recommend integrating Smokeping and going to 1m polling if you have not already.
Another option is to give your poller-wrapper more threads if you need to bring down the time it takes to complete a polling cycle because your devices are not able to finish their polling. You can also limit what you are polling your devices for, or use a mix of the two.
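
To make the thread-count reasoning concrete, here is a rough Python sketch that estimates how many poller threads a full cycle would need to fit inside the polling interval. The function name, the headroom factor, and the sample durations are illustrative assumptions, not anything LibreNMS provides. Real per-device poll times should come from your own poller statistics.

```python
import math


def threads_needed(poll_seconds, interval_seconds=300, headroom=0.8):
    """Estimate worker threads needed to finish polling within the interval.

    poll_seconds: per-device poll durations in seconds (hypothetical data).
    headroom: fraction of the interval polling may use (assumed value).
    """
    budget = interval_seconds * headroom
    if max(poll_seconds) > budget:
        # No thread count helps if a single device is too slow on its own.
        raise ValueError("slowest device exceeds the budget; poll fewer modules")
    # Idealized model: the total work divides evenly across threads.
    return max(1, math.ceil(sum(poll_seconds) / budget))


# 100 devices at 30 seconds each, 5 minute interval
print(threads_needed([30] * 100))
```

This is an idealized estimate. Uneven device durations and database load will usually push the real number higher.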

If I configure a polling interval of 1 and a delay of 2m, as in the snapshot below, it should help me reduce the false positives.

That is not the polling interval but the interval between repeated notifications, and because you only send 1 (Max alerts), it does nothing. If you have a 5min polling interval, give @walleyeguy’s recommendation a try and set Delay to 10m.
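
The "at least 2 polling periods" rule of thumb from this thread is simple arithmetic. A minimal Python sketch follows. The function name is made up for illustration and is not a LibreNMS setting.

```python
def recommended_delay_minutes(polling_interval_minutes, periods=2):
    """Return an alert delay that spans `periods` full polling cycles."""
    if polling_interval_minutes <= 0 or periods < 1:
        raise ValueError("interval must be positive and periods at least 1")
    return polling_interval_minutes * periods


for interval in (1, 5):
    delay = recommended_delay_minutes(interval)
    print(f"{interval}m polling -> delay of {delay}m")
# prints:
# 1m polling -> delay of 2m
# 5m polling -> delay of 10m
```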

Let me check and get back to you.

@slalomsk8er is right on the mark: the interval field you are referencing is the alert interval, meaning how often an alert is sent out while the rule is in the alarm state. If you set “Max Alerts” to a higher number, or to -1 for infinite, it will keep sending alerts until it reaches that maximum. Since it is currently set to 1, you would only see the initial and recovery alerts if configured in that manner.
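
As a simplified model of how Interval and Max Alerts interact (not LibreNMS's actual code), the Python sketch below counts the down notifications you would expect during one alarm. The function name and the use of whole minutes are assumptions for illustration. The recovery notification is not counted.

```python
def notifications_sent(alarm_minutes, interval_minutes, max_alerts):
    """Count down notifications sent while a rule stays in the alarm state.

    max_alerts == -1 is treated as "no limit", as described in the thread.
    """
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    # One initial notification, then one more per elapsed interval.
    possible = 1 + alarm_minutes // interval_minutes
    if max_alerts == -1:
        return possible
    return min(possible, max_alerts)


print(notifications_sent(30, 5, 1))   # Max Alerts = 1: only the initial alert
print(notifications_sent(30, 5, -1))  # unlimited: initial alert plus repeats
```

With Max Alerts set to 1, the Interval value never comes into play, which matches the explanation above.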

If you want to go with 1 minute polling, check out this page: 1 Minute Polling

Another great resource if you are new to all of this is the performance optimizations page: Performance Optimizations

Between these two things, you can tweak your polling and resources to make it behave in a way that fits your needs. I banged my head against the wall in the beginning working with alerts, so I can relate to the frustration. Stick with it though, LibreNMS is an awesome tool.

Also note that you can have 1 minute ICMP checks without doing 1 minute polling (which is a major resource hog): Fast Ping Checking - LibreNMS Docs

Some VMs (particularly with bonded ports) will cause duplicate ICMP packets to be sent out or received. fping does not like this and will throw an error. You should confirm that is not the case.
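
One way to check for duplicate replies is to run a plain ping and look for the duplicate marker. The Python sketch below assumes the Linux iputils `ping`, which flags such replies with `DUP!`. The helper names and the sample output are illustrative only.

```python
import subprocess


def count_duplicate_replies(ping_output):
    """Count reply lines that ping flagged as duplicates."""
    return sum(1 for line in ping_output.splitlines() if "DUP!" in line)


def ping_duplicates(host, count=5):
    """Ping `host` and return how many duplicate replies were reported."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return count_duplicate_replies(result.stdout)


# Hypothetical output, shaped like what iputils ping prints.
SAMPLE_OUTPUT = """\
64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.41 ms
64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.45 ms (DUP!)
64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=0.39 ms
"""
print(count_duplicate_replies(SAMPLE_OUTPUT))
```

A non-zero count for a device that keeps flapping would point at duplicate ICMP replies rather than a real outage.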

I’m using an Interval of 5m and a Delay of 10m in my alert rule, which stopped the false alarms from being emailed.

But while the down email alerts are no longer sent, which is good, I unfortunately still get an alert log full of up and down messages.

This makes it difficult to show someone that the device we manage has been up all the time using the Alert History, as it’s littered with up/down messages.

I have never been able to figure out how to stop these messages from being generated in the alert history, since the devices were never down. They are false alerts.

If you are still getting entries in that list, it is likely that your polling is not able to finish, and the alarm reason should be SNMP. The system still has to record “yep, device X did not fulfil this request during the given time”. The delay on the alert rule only governs the transport of the alert: it tells the system how long to wait before sending it to you. If all you need to do is show connectivity, then, as @Pizzahjul said, maybe a ping-only check is what you want?
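
To illustrate why the delay silences emails but not the log, here is a small Python simulation. It is a simplified model under stated assumptions, not LibreNMS's actual logic: every state change is logged, while a notification goes out only once a device has stayed down for at least the delay. All names are hypothetical.

```python
def simulate(statuses, poll_minutes, delay_minutes):
    """Return (log, notifications) for a list of 'up'/'down' poll results."""
    log, notified = [], []
    down_since = None
    alerted = False
    for i, status in enumerate(statuses):
        now = i * poll_minutes
        if status == "down":
            if down_since is None:
                down_since = now
                log.append((now, "down"))  # every transition is logged
            if not alerted and now - down_since >= delay_minutes:
                notified.append((now, "down"))  # sent only after the delay
                alerted = True
        else:
            if down_since is not None:
                log.append((now, "up"))
                if alerted:
                    notified.append((now, "recovered"))
            down_since = None
            alerted = False
    return log, notified


# Two one-poll blips with 5m polling and a 10m delay.
log, notified = simulate(["up", "down", "up", "up", "down", "up"], 5, 10)
print(len(log), len(notified))
```

In this model the two short blips produce four log entries and no notifications, which is the pattern described above. Removing the log entries means fixing the cause of the missed polls, not raising the delay.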
