Distributed Poller Issue - Same device polled by multiple pollers

andro · 21 May 2020 21:38

Hello -

During one of the recent updates, it seems my distributed pollers started polling the same device multiple times. Said differently, we monitor about 1,300 devices, but each poller is trying to hit each of the 1,300.

A few months ago, the system seemed to automatically divvy up the workload. As we added devices, we deployed new pollers with a completely standard config, all devices in distributed_poller_group zero, and it worked flawlessly. Trying to balance our load manually is going to be a headache and a half.

I believe I have checked all the obvious things: rrdcached and memcached are accessible from each of the pollers, the codebase is current and matching on all the pollers and the main server, and the app_key matches on all the pollers and the main server. The app_key was new to me and I had hoped that was the fix, but alas, it was not.
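For anyone who wants to repeat the reachability check from each poller, here is a minimal sketch that only tests whether a TCP connection can be opened. It does not prove that LibreNMS is actually using the service. The function name `can_connect` is illustrative, and any host and port values passed to it are assumptions to be replaced with whatever your own config.php points at.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run on each poller, something like `can_connect("librenms.example.com", 11211)` would check memcached on its default port. The host name here is hypothetical, and the rrdcached port depends on how the daemon was started.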

Any thoughts on why the behavior changed and how to revert to the old behavior?

Thank you for any guidance!

All systems have the same validate.php output:

Component	Version
LibreNMS	1.63-106-g370c7f5
DB Schema	2020_04_13_150500_add_last_error_fields_to_bgp_peers (164)
PHP	7.3.18
Python	3.6.8
MySQL	5.5.60-MariaDB
RRDTool	1.7.0
SNMP	NET-SNMP 5.7.2
====================================

[OK] Composer Version: 1.10.6
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[WARN] Your install is over 24 hours out of date, last update: Wed, 20 May 2020 13:25:27 +0000

Screenshot of all the pollers trying to poll each device:

[Image: Screen Shot 2020-05-21 at 1.45.06 PM]

Craig · 21 May 2020 22:04

Hi Andro,

I had the same issue and found that the pollers are no longer using memcached. I gave up trying to figure out the problem and moved to the Dispatcher Service instead: https://docs.librenms.org/Extensions/Dispatcher-Service/
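To illustrate why losing memcached could produce exactly this symptom, here is a rough sketch. It assumes the pollers coordinate by doing an atomic "add" of a per-device key in a shared cache. That is my assumption about the mechanism, not something confirmed in this thread. `FakeCache`, `claim_devices`, and the key format are hypothetical stand-ins, not LibreNMS code.

```python
class FakeCache:
    """In-memory stand-in for a shared memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Mirrors memcached's add semantics: store only if the key is absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True


def claim_devices(cache, poller_name, device_ids):
    """Return the device IDs this poller would go on to poll."""
    claimed = []
    for device_id in device_ids:
        key = "poller.device.%s" % device_id  # illustrative key format
        # With no shared cache there is nothing to lock against,
        # so the poller takes every device it sees.
        if cache is None or cache.add(key, poller_name):
            claimed.append(device_id)
    return claimed
```

With one shared `FakeCache`, a second poller calling `claim_devices` on the same IDs gets nothing back for devices the first poller already claimed. With `cache=None`, every poller claims every device, which matches the behavior described above.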

Craig