Upgrade broke poller

I followed the procedure to update LibreNMS yesterday with the commands below, and it broke the poller.

git pull
./daily.sh

Before the upgrade, polling usually completed in under 120 seconds; now the polling time is between 6,000 and 12,000 seconds.


 ./validate.php

====================================

Component | Version
--------- | -------
LibreNMS  | 1.60
DB Schema | 2019_12_28_180000_add_overwrite_ip_to_devices (156)
PHP       | 7.2.17-1+ubuntu16.04.1+deb.sury.org+3
MySQL     | 10.0.38-MariaDB-0ubuntu0.16.04.1
RRDTool   | 1.5.5
SNMP      | NET-SNMP 5.7.3
====================================

[OK] Composer Version: 1.9.3
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[FAIL] The poller (M01UMESMG0066) has not completed within the last 5 minutes, check the cron job.
[WARN] Some devices have not been polled in the last 5 minutes. You may have performance issues.
[FIX]:
Check your poll log and see: Performance - LibreNMS Docs
Devices:
m01umesmg0066.m01.myatea.net
mal-d1-r01.m01.myatea.net
mal-d1-r02.m01.myatea.net
ns3.m01.myatea.net
ume-d2-ds02.m01.myatea.net
ume-d2-ds01.m01.myatea.net
ume-d2-fc02.m01.myatea.net
ume-d2-ro02.m01.myatea.net
ume-d2-oob-sw01.m01.myatea.net
ume-d1-ro01.m01.myatea.net
mal-d1-oob-sw01.m01.myatea.net
ume-d1-fc01.m01.myatea.net
ume-d1-fc02.m01.myatea.net
ume-d1-oob-ca07.m01.myatea.net
ume-d2-ro01.m01.myatea.net
and 54 more…
[FAIL] Some devices have not completed their polling run in 5 minutes, this will create gaps in data.
[FIX]:
Check your poll log and see: Performance - LibreNMS Docs
Devices:
m01umesmg0066.m01.myatea.net
mal-d1-r01.m01.myatea.net
mal-d1-r02.m01.myatea.net
ns3.m01.myatea.net
mal-d1-sw01.m01.myatea.net
ume-d2-ds02.m01.myatea.net
ume-d2-fc01.m01.myatea.net
ume-d2-ds01.m01.myatea.net
ume-d2-fc02.m01.myatea.net
ume-d2-ro02.m01.myatea.net
ume-d2-oob-sw01.m01.myatea.net
ume-d1-ro01.m01.myatea.net
mal-d1-oob-sw01.m01.myatea.net
ume-d1-fc01.m01.myatea.net
ume-d1-fc02.m01.myatea.net
and 71 more…
[WARN] Your local git contains modified files, this could prevent automatic updates.
[FIX]:
You can fix this with ./scripts/github-remove
Modified Files:
bootstrap/cache/.gitignore
logs/.gitignore
rrd/.gitignore
storage/app/.gitignore
storage/app/public/.gitignore
storage/debugbar/.gitignore
storage/framework/cache/.gitignore
storage/framework/cache/data/.gitignore
storage/framework/sessions/.gitignore
storage/framework/testing/.gitignore
storage/framework/views/.gitignore
storage/logs/.gitignore
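
For the modified-files warning, the fix hinted at by validate.php is run from the install directory as the librenms user. A minimal sketch, assuming the default /opt/librenms install path; the -d (discard) flag is an assumption here, so check the script's usage output before running it:

cd /opt/librenms
sudo -u librenms ./scripts/github-remove -d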

What do you see in the performance graphs? Is there a specific module that takes a lot of polling time?

Do you monitor Cisco IOS devices?

http://yourlibre/pollers/tab=performance
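
If the web UI doesn't make it obvious, a rough way to narrow it down from the shell is to poll one of the affected devices with debug output and watch how long each module takes. A sketch, assuming the default /opt/librenms path; substitute any of the slow hostnames:

cd /opt/librenms
sudo -u librenms ./poller.php -h mal-d1-r01.m01.myatea.net -d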

Hello,

It’s mostly Cisco devices, and the performance tab only shows the attached information, so I’m not sure what to look for.


What version did you upgrade from and to?

I can’t remember the old version; usually the daily script updates the system for us, but when I logged in to the forum this Monday it said we needed to run “git pull” and then “./daily.sh”.

Do you have the output of that at all from your shell?
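
If not, some of it may still be recoverable from the install directory. A sketch, assuming the default /opt/librenms path: daily.sh normally writes its output to logs/daily.log, and git can list the commits that came in with the pull.

cd /opt/librenms
less logs/daily.log
git log --oneline -20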

Are you running cron jobs or using the service workers? How many devices?

I’m afraid I don’t have any shell output, apart from the “history” command, but I doubt that would be helpful. 98 devices are being polled and we’re using cron jobs.

Cron:
33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/10 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
* * * * * librenms /opt/librenms/alerts.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/poll-billing.php >> /dev/null 2>&1
01 * * * * librenms /opt/librenms/billing-calculate.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/check-services.php >> /dev/null 2>&1
*/2 * * * * librenms /opt/librenms/ping.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/services-wrapper.py >> /dev/null 2>&1
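
Since the wrapper invocations occasionally change between releases, one sanity check is to diff the running cron against the example cron file shipped in the release. A sketch, assuming the cron entries live in /etc/cron.d/librenms and the default /opt/librenms path:

diff /etc/cron.d/librenms /opt/librenms/librenms.nonroot.cron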

Check whether you have any old poller.php processes that have been running for more than 10 minutes. If so, kill them all.
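
A minimal sketch of that check, listing poller.php processes with their elapsed time and then killing the lot:

ps -eo pid,etime,args | grep '[p]oller.php'
sudo pkill -f poller.php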

I had a ton of poller.php processes that had been running for over 10 minutes and killed them all. They started afresh but are building up again, so I’m still having the same problem :frowning:

Yesterday and this morning I also rebooted the server.

After running daily.sh this morning, it looks like the polling time has gone back down to 68 seconds :slight_smile:

I’ll monitor the poller during the day. Thanks everyone for the help.
