False positive - icmp

Hey folks!

I’ve been using LibreNMS for a few weeks now, and I can’t figure out how to resolve my problem.

Brief explanation of our infrastructure:

  • Each server runs Ubuntu 18.04 LTS and Apache.
  • We have 1 LibreNMS central server. No performance tuning, fresh install. This server is the database, RRD, and memcached server.
  • We have 1 poller, with LibreNMS installed, at every client we want to monitor. No performance tuning, fresh install. In config.php, we tell the poller to send its data to the central server.

We use distributed polling mode, and each poller uses the central database server. The same goes for RRD and memcached.
Everything is working well, except that we get a lot of false positives on our remotely monitored servers.

The odd part is that every minute the ICMP check marks a device as down, right after the ICMP check reported it as up, and this repeats with a frequency of 4 minutes. So we get these cycles:

I really do not know how to make this work… I am sure that my servers are up, so the problem must come from my configuration…

The only item that fails in validate is the PHP memcached extension. I installed it, and when I try to reinstall it, it says that it is already installed. So maybe the problem comes from there?
It reports “Missing php extension: memcached”. The problem is either on the central server or on the poller node.

I could not find my answers there, so I created this thread.

Regards,

Thr0yr

Edit: Okay, it seems that someone in our department changed the cron job… right now, everything is working fine! In fact, the poller was polling every 5 minutes, but the central server was running the ICMP check every minute…

Hi @guillaume-minesales

Not sure if this is fixed already, referencing this post:

That may be the cause. If every poller tries to ping every device, it will cause these false positives. I have come across this a few times with different installs, so I worked with others to update the docs here:

NOTE : If you are using distributed pollers you can restrict a poller to a group by appending -g to the cron entry. Alternatively, you should only run ping.php on a single node.

This should fix the issue if you still see it.

Hi @garysteers

Thanks for the answer, but in fact my issue was due to two cron jobs (one on the poller and one on the central server). One cron job was set to run every minute and the other every 5 minutes. This is why I had these false positives.
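
For anyone who lands here with the same symptom: a quick way to spot this kind of mismatch is to compare the minute field of the relevant cron entries on each node. Here is a minimal Python sketch of that check. The node names and cron lines in the usage note are illustrative assumptions, not copied from my servers, and the helper names are made up for this post. It only understands `*` and `*/N` in the minute field.

```python
import re


def minute_interval(cron_line):
    """Return the interval in minutes implied by a cron line's minute field.

    Only two forms are handled: '*' (every minute) and '*/N' (every N
    minutes). Any other minute field returns None.
    """
    fields = cron_line.split()
    if not fields:
        return None
    minute = fields[0]
    if minute == "*":
        return 1
    match = re.fullmatch(r"\*/(\d+)", minute)
    return int(match.group(1)) if match else None


def interval_report(entries):
    """Map each node name to the interval of its cron line.

    `entries` is a dict of node name -> cron line (hypothetical input
    format chosen for this sketch).
    """
    return {node: minute_interval(line) for node, line in entries.items()}


def is_consistent(entries):
    """True when every node's cron line implies the same interval."""
    return len(set(interval_report(entries).values())) <= 1
```

Usage, with made-up entries that mirror my situation (central server every minute, poller every 5 minutes):

```python
entries = {
    "central": "* * * * * librenms /path/to/check",    # illustrative line
    "poller": "*/5 * * * * librenms /path/to/check",   # illustrative line
}
print(interval_report(entries))
print(is_consistent(entries))
```

If the second call prints False, the nodes are not running the check on the same schedule, which is what happened in my case.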