Unpolled Devices


I’ve recently run into an issue with unpolled devices. I have approximately 650 devices, and at any given time 250 to 300 of them show up in the unpolled log. The BIGGER issue is that we aren’t getting any alerts from devices that aren’t being polled. We just had a couple of very large locations lose power and we didn’t know anything about it!

The system time and the PHP time zone are correct. For that matter, the overall system health looks good, so I’m really confused as to why this is even happening or why it started.

Please run:

./validate.php

Just upgraded the DB last night to version 10.5.9 along with the schema corrections.

./validate.php

====================================

Component Version
LibreNMS 21.2.0-52-g61316ce
DB Schema 2021_02_21_203415_location_add_fixed_coordinates_flag (200)
PHP 7.4.16
Python 3.6.8
MySQL 10.5.9-MariaDB
RRDTool 1.4.8
SNMP NET-SNMP 5.7.2

====================================

[OK] Composer Version: 2.0.11
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct

I have over 400 devices that are unpolled at the moment. I thought for sure the DB upgrade would resolve the issue, since it was the one outstanding thing I needed to correct based on the output from the validate script.

Fri Mar 12 22:51 720/39853 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 2 (Cron Daemon) Fri Mar 12 22:53 719/39745 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 3 (Cron Daemon) Fri Mar 12 22:55 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 4 (Cron Daemon) Fri Mar 12 23:00 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 5 (Cron Daemon) Fri Mar 12 23:05 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 6 (Cron Daemon) Fri Mar 12 23:08 720/39889 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 7 (Cron Daemon) Fri Mar 12 23:08 721/39913 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 8 (Cron Daemon) Fri Mar 12 23:10 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 9 (Cron Daemon) Fri Mar 12 23:15 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 10 (Cron Daemon) Fri Mar 12 23:16 720/39875 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 11 (Cron Daemon) Fri Mar 12 23:20 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 12 (Cron Daemon) Fri Mar 12 23:21 723/40034 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 13 (Cron Daemon) Fri Mar 12 23:25 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 14 (Cron Daemon) Fri Mar 12 23:27 723/40032 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 15 (Cron Daemon) Fri Mar 12 23:30 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 16 (Cron Daemon) Fri Mar 12 23:30 720/39901 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 17 (Cron Daemon) Fri Mar 12 23:31 720/39865 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 18 (Cron Daemon) Fri Mar 12 23:35 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 19 (Cron Daemon) Fri Mar 12 23:40 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 20 (Cron Daemon) Fri Mar 12 23:41 719/39845 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 21 (Cron Daemon) Fri Mar 12 23:41 721/39930 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 22 (Cron Daemon) Fri Mar 12 23:44 719/39790 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 23 (Cron Daemon) Fri Mar 12 23:45 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 24 (Cron Daemon) Fri Mar 12 23:50 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 25 (Cron Daemon) Fri Mar 12 23:50 720/39815 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 26 (Cron Daemon) Fri Mar 12 23:53 720/39873 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 27 (Cron Daemon) Fri Mar 12 23:55 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 28 (Cron Daemon) Sat Mar 13 00:00 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 29 (Cron Daemon) Sat Mar 13 00:05 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 30 (Cron Daemon) Sat Mar 13 00:07 719/39811 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 31 (Cron Daemon) Sat Mar 13 00:10 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 32 (Cron Daemon) Sat Mar 13 00:15 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 33 (Cron Daemon) Sat Mar 13 00:15 721/39963 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 34 (Cron Daemon) Sat Mar 13 00:18 721/39954 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 35 (Cron Daemon) Sat Mar 13 00:20 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 36 (Cron Daemon) Sat Mar 13 00:21 723/40046 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 37 (Cron Daemon) Sat Mar 13 00:25 31/1240 “Cron [email protected] /opt/librenms/services-wrapper.py 1”
U 38 (Cron Daemon) Sat Mar 13 00:26 722/39944 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”
U 39 (Cron Daemon) Sat Mar 13 00:26 721/39925 “Cron [email protected] /opt/librenms/cronic /opt/librenms/poller-wrapper.py 8”

Lots of breaks in my graphs as well… This has been going on for at least 3 weeks.

Look at logs/librenms.log for poller errors and poller time per device
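If it helps, here is a rough sketch for pulling the slowest devices out of that log (or out of the cron mail bodies). It assumes the poller wrapper writes lines containing "finished device <id> in <n> seconds"; that line shape is an assumption, so adjust the pattern to whatever your output actually looks like. The function name `slowest_devices` is made up for this example.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed line shape (not verified against your install); edit as needed.
LINE_RE = re.compile(r"finished device (\d+) in (\d+(?:\.\d+)?) seconds")


def slowest_devices(lines: Iterable[str], top: int = 10) -> List[Tuple[str, float]]:
    """Return up to `top` (device_id, seconds) pairs, slowest first.

    For each device id, only the longest poll time seen is kept.
    """
    worst: Dict[str, float] = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not per-device timing lines
        device_id, seconds = match.group(1), float(match.group(2))
        worst[device_id] = max(seconds, worst.get(device_id, 0.0))
    # Sort by poll time, longest first, and keep the top entries.
    return sorted(worst.items(), key=lambda item: item[1], reverse=True)[:top]
```

To use it, open the file and pass the handle in, for example `slowest_devices(open("logs/librenms.log", errors="replace"))`. Devices whose poll time gets close to the polling interval are the ones to look at first.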

This is a good question. Is there a way to trigger an alert if a device was not polled within let’s say the last 15 min?
How would one configure an alert rule for that?
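I don't have a confirmed alert rule for this, but one way to catch it independently of the alert engine is a small external check. The sketch below is only the comparison logic: it takes (hostname, last_polled) pairs and returns the hosts that have not been polled within a window. The function name `stale_devices` is hypothetical, and the query in the comment assumes the `devices` table has `hostname`, `last_polled` and `disabled` columns, so verify that against your schema before relying on it.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple

# Hypothetical query to feed this function (column names are assumptions):
#   SELECT hostname, last_polled FROM devices WHERE disabled = 0;
Row = Tuple[str, Optional[datetime]]


def stale_devices(
    rows: Iterable[Row],
    now: datetime,
    max_age: timedelta = timedelta(minutes=15),
) -> List[str]:
    """Return hostnames whose last poll is older than `max_age`.

    A device with no recorded poll time (None) is treated as stale.
    """
    cutoff = now - max_age
    return [host for host, last in rows if last is None or last < cutoff]
```

Run from cron every few minutes, a non-empty result could be mailed or pushed to whatever notification channel you already use.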