==========================================================
Component | Version
--------- | -------
LibreNMS  | 2f5a1742c3a9b8ed515a69f7794751f88cdb5e63
DB Schema | 188
PHP       | 7.0.18
MySQL     | 5.5.52-MariaDB
RRDTool   | 1.4.8
SNMP      | NET-SNMP 5.7.2
==========================================================
[OK] Database connection successful
[OK] Database schema correct
[FAIL] The poller has not run in the last 5 minutes, check the cron job
I keep getting a FAIL on "The poller has not run in the last 5 minutes, check the cron job".
I know the poller is still polling every 5 minutes; I can see it working in top and in the WebUI poller history. I double-checked the cron job and it all looks good and is working, so I don't know why validate keeps failing the cron job check.
Not sure what the issue is.
Are you sure all your devices are finishing in 5 minutes?
Check /poll-log/ in your webui.
Do you have the same timezone in MySQL as on your local install?
Not sure, where would I check that?
Yes, the timezone is correct; the database takes it from the host time.
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'time_zone';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| time_zone     | SYSTEM |
+---------------+--------+
1 row in set (0.01 sec)
Run this query: select * from pollers;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'run select * from pollers' at line 1
Sorry, I got it now:
+----+---------------+---------------------+---------+------------+
| id | poller_name   | last_polled         | devices | time_taken |
+----+---------------+---------------------+---------+------------+
|  1 | nms.nbisd.edu | 2017-05-02 11:33:19 |     279 |       1698 |
+----+---------------+---------------------+---------+------------+
1 row in set (0.00 sec)
Run that query straight after validate.php, and also run date in the CLI. Post the output of all three.
MariaDB [librenms]> select * from pollers;
+----+---------------+---------------------+---------+------------+
| id | poller_name   | last_polled         | devices | time_taken |
+----+---------------+---------------------+---------+------------+
|  1 | nms.nbisd.edu | 2017-05-02 18:41:42 |     279 |        101 |
+----+---------------+---------------------+---------+------------+
1 row in set (0.00 sec)
MariaDB [librenms]> select * from pollers date;
+----+---------------+---------------------+---------+------------+
| id | poller_name   | last_polled         | devices | time_taken |
+----+---------------+---------------------+---------+------------+
|  1 | nms.nbisd.edu | 2017-05-02 18:41:42 |     279 |        101 |
+----+---------------+---------------------+---------+------------+
1 row in set (0.00 sec)
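For anyone reading along, the two things being compared here (how old last_polled is, and how long the run took) can be checked by hand. This is a rough sketch only, not how validate.php actually implements its check. It assumes the window is the 5 minutes named in the validate message and that time_taken is in seconds. The function name check_poller is made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the window is the 5 minutes mentioned in the validate message.
POLL_INTERVAL = timedelta(minutes=5)


def check_poller(last_polled, time_taken, now):
    """Return a list of warnings for one row of the pollers table.

    last_polled: string as printed by MySQL, e.g. "2017-05-02 11:33:19"
    time_taken:  duration of the poll run, assumed to be in seconds
    now:         naive datetime in the same time zone as last_polled
    """
    warnings = []
    polled_at = datetime.strptime(last_polled, "%Y-%m-%d %H:%M:%S")
    age = now - polled_at
    if age > POLL_INTERVAL:
        warnings.append(f"last_polled is {int(age.total_seconds())}s old")
    if age < timedelta(0):
        # A negative age usually points at a clock or time zone mismatch.
        warnings.append("last_polled is in the future; check time zones")
    if time_taken > POLL_INTERVAL.total_seconds():
        warnings.append(f"poll run took {time_taken:.0f}s, longer than the interval")
    return warnings
```

Fed the first row posted above (time_taken of 1698) with a clock reading shortly after last_polled, this flags only the run duration, which fits the later finding that polling was overrunning the 5-minute interval.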
Well, it got worse this morning. The poller randomly stopped polling… not sure why.
You must have devices that are taking too long to poll. It could be random, but it looks like that’s what’s happening.
Run this via ssh:
tail -2000 logs/librenms.log| grep -P 'secs$' |grep poller
Do you see any times over 100/200 seconds?
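If eyeballing the grep output is tedious, the same check can be scripted. This is a minimal sketch that assumes the log lines look like "/opt/librenms/poller.php <id> <date> <time> - N devices polled in X secs" and that the number after poller.php is the device ID. The function name slow_polls and the 100-second default threshold are picked for illustration.

```python
import re

# Assumed line shape:
# /opt/librenms/poller.php 194 2017-05-03 10:25:38 - 1 devices polled in 5.434 secs
LINE_RE = re.compile(
    r"poller\.php (?P<device>\S+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r" - \d+ devices polled in (?P<secs>\d+(?:\.\d+)?) secs$"
)


def slow_polls(lines, threshold=100.0):
    """Return (device, seconds) pairs for polls slower than threshold, slowest first."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # not a poller summary line
        secs = float(match.group("secs"))
        if secs > threshold:
            hits.append((match.group("device"), secs))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Passing it an open file handle for logs/librenms.log works, since it only iterates over lines. An empty result means no single device exceeded the threshold in that part of the log.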
/opt/librenms/poller.php 194 2017-05-03 10:25:38 - 1 devices polled in 5.434 secs
/opt/librenms/poller.php 254 2017-05-03 10:25:38 - 1 devices polled in 6.776 secs
/opt/librenms/poller.php 70 2017-05-03 10:25:38 - 1 devices polled in 5.616 secs
/opt/librenms/poller.php 160 2017-05-03 10:25:38 - 1 devices polled in 5.507 secs
/opt/librenms/poller.php 48 2017-05-03 10:25:38 - 1 devices polled in 3.660 secs
/opt/librenms/poller.php 133 2017-05-03 10:25:38 - 1 devices polled in 3.643 secs
/opt/librenms/poller.php 38 2017-05-03 10:25:38 - 1 devices polled in 5.974 secs
/opt/librenms/poller.php 251 2017-05-03 10:25:38 - 1 devices polled in 2.284 secs
/opt/librenms/poller.php 123 2017-05-03 10:25:38 - 1 devices polled in 3.648 secs
/opt/librenms/poller.php 174 2017-05-03 10:25:39 - 1 devices polled in 4.097 secs
/opt/librenms/poller.php 166 2017-05-03 10:25:39 - 1 devices polled in 6.534 secs
/opt/librenms/poller.php 259 2017-05-03 10:25:39 - 1 devices polled in 5.813 secs
/opt/librenms/poller.php 213 2017-05-03 10:25:39 - 1 devices polled in 6.791 secs
/opt/librenms/poller.php 56 2017-05-03 10:25:39 - 1 devices polled in 4.915 secs
On the other hand, I just removed our WLC. It was massive, with lots of interfaces, and I suspect that was causing the long poller time.
Should everything return to normal after that?
It should, if that was the issue. Which poller module was taking the longest for that device?
That was the issue: our HP MSM 765. It is always overloaded with its CPU pegged (it’s going to be replaced this summer). It has 1,350 WLAN interfaces on it, and it looks like that was causing the poller to take so long.
Thank you for your help.
You can enable selective port polling and disable a load of interfaces you don’t care about to improve that - vastly in some cases.