I have an interesting problem. I have a three-node LibreNMS distributed poller setup using the dispatcher service, with the nodes pointing to a Galera database backend. The latter could be relevant… We have a functioning Redis, as required by the distributed poller setup, validate.php reports no problems, and everything appears to be working. We have different poller groups, and devices are only being polled by their designated pollers. We know it's resilient and we can rely on it. We are not seeing any log entries indicating lock issues or anything else that could be Redis related, nor anything from the database (to date).
The problem appears to be with services. We have only recently enabled the services functionality in the nav bar and specified the path to the Nagios plugins. On the face of it, when adding a service for a device, e.g. a simple curl, the service check eventually wakes up and goes green. If we stop the service being checked, it goes red. So we know the service is being actively polled, and we cross-verified this with a packet dump. It even matched alert rules. The issue is that under “details” for the service I see empty graphs.
I have a very similar replica environment that has just one poller at the moment. I haven't quite got to the stage of adding a second poller to do more testing. That environment also has a Galera backend and has distributed polling enabled; it's just not exercised because there is only the one poller. Services added to this stack behave just like in our main environment, but they DO have populating graphs.
I have spent quite a bit of time troubleshooting this today, but I can’t put my finger on what the issue might be. I’ve checked that it's not a trivial rrdcached issue: other “regular” stats collected via SNMP are being graphed with no issue. It's just the services.
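For anyone wanting to narrow this down on the RRD side, here is a minimal sketch of one way to see which service RRD files are not being written on a given poller. It is not something LibreNMS ships. The path `/opt/librenms/rrd` and the `services-*.rrd` filename pattern are assumptions, so adjust both to whatever your install actually uses. Also bear in mind that rrdcached batches writes, so file modification times can lag behind the last poll until the cache is flushed.

```python
import time
from pathlib import Path


def stale_rrds(rrd_root, pattern="services-*.rrd", max_age_seconds=600, now=None):
    """Return (path, age_seconds) for matching RRD files whose last
    modification is older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    # rglob walks every per-device subdirectory under rrd_root
    for path in sorted(Path(rrd_root).rglob(pattern)):
        age = now - path.stat().st_mtime
        if age > max_age_seconds:
            stale.append((str(path), int(age)))
    return stale


if __name__ == "__main__":
    # Hypothetical path and default pattern; substitute your own rrd directory.
    for path, age in stale_rrds("/opt/librenms/rrd"):
        print(f"{path}: last written {age}s ago")
```

Running this on each poller (or on the rrdcached host) before and after a manual service poll should show whether the service files are only touched by the manual run.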
If we run service-wrapper.py manually, it performs a poll and also updates the graphs. (Bearing in mind we are using the dispatcher service, I would have thought we don't need a cron entry for this, as it mentions services.)
Reaching out to the community in case anyone can offer any pointers to help troubleshoot this.