Poller performance regression in 1.59
=====================================

Component | Version
--------- | -------
LibreNMS | 1.59-23-g8d28e40
DB Schema | 2020_01_09_1300_migrate_devices_attribs_table (153)
PHP | 7.2.1
MySQL | 10.2.12-MariaDB
RRDTool | 1.7.2
SNMP |

We recently upgraded our instance from a rather old version (1.52, I believe) to 1.59 and experienced a rather dramatic slowdown in poller performance:

No changes were made to other components of the platform. We use the librenms service poller.
Is there anything obvious in the new release that would cause such a dramatic poller slowdown? It looks like all the extra time comes from the ‘ports’ component…


Unfortunately, I am also seeing degraded performance starting around the turn of the year, and I have no idea of the cause:

image

@Elias please send a zoomed-in view of the graph so that we can see on which dates of the month the change happened.

Also, please post the output of
./validate.php

@nktl Polling times have increased for all modules. If you are running out of disk I/O, a change in a single module might have impacted all of them.

However, we should investigate what is causing this huge difference.

Maybe check your max MySQL connections

I reverted to the ‘legacy’ poller-service.py service (from librenms-service.py) and it performs much better.

I have actually realized I raised something similar ages ago:

It looks like the problem is still there to some degree - and performance of librenms-service.py got somehow worse in recent releases.

It seems like this thing is getting confused by Nexus routers as well as a bunch of UCS 6300 FIs. Disabling them brings polling back to acceptable levels.

This project could really use a high-performance poller with in-flight window tuning on a per-device basis, like the AKIPS one (https://www.akips.com/showdoc/blog2). I would have contributed the code if I had any idea how to implement that :slight_smile:
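For anyone wondering what per-device window tuning could look like, here is a minimal sketch. It is not how AKIPS or LibreNMS does it; it simply borrows the additive-increase/multiplicative-decrease idea from TCP congestion control. `InFlightWindow`, `window_for` and the default sizes are all hypothetical names and values chosen for illustration.

```python
class InFlightWindow:
    """Per-device cap on outstanding SNMP requests (hypothetical helper)."""

    def __init__(self, start=4, minimum=1, maximum=64):
        self.minimum = minimum
        self.maximum = maximum
        # Clamp the starting size into the allowed range.
        self.size = max(minimum, min(start, maximum))

    def on_reply(self):
        # Additive increase: the device answered, so allow one more
        # request in flight, up to the configured maximum.
        self.size = min(self.maximum, self.size + 1)

    def on_timeout(self):
        # Multiplicative decrease: the device is struggling, so halve
        # the window, but never go below the minimum.
        self.size = max(self.minimum, self.size // 2)


# One window per device, created on first use.
windows = {}


def window_for(hostname):
    return windows.setdefault(hostname, InFlightWindow())
```

A slow device (a Nexus, say) would quickly shrink to one or two requests in flight, while fast devices would grow their window without holding back the rest of the poll.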

I tested switching from the v2 service to crontab, and performance is worse.
I don’t want to test poller-service.py (the v1 service) because a bug in it can DoS equipment because of MySQL/MariaDB.

Maybe playing with the number of workers could also help.
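As a rough way to pick a starting point for the worker count, here is a back-of-the-envelope sketch. It assumes a 300-second polling interval and that per-device poll times are known (for example, read off the poller graphs); `min_workers` and the 80% headroom factor are illustrative assumptions, not LibreNMS settings.

```python
import math


def min_workers(poll_seconds, interval=300, headroom=0.8):
    """Estimate the fewest workers that can poll every device once per interval.

    poll_seconds: per-device poll durations in seconds.
    interval:     polling interval in seconds (assumed to be 300 here).
    headroom:     fraction of the interval we are willing to fill.
    """
    total = sum(poll_seconds)
    return math.ceil(total / (interval * headroom))


# Hypothetical fleet: 400 devices at 12 s each plus 20 slow ones at 90 s each.
fleet = [12.0] * 400 + [20.0 * 0 + 90.0] * 20
print(min_workers(fleet))
```

This is only a lower bound: it ignores scheduling overhead, and a single device that takes longer than the interval will never fit no matter how many workers are added.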


I don’t know if this is related to the issue the OP is having with the poller, but for me poller performance degraded around April 2019.
This can be traced back to the 1.50 or 1.50.1 release.


I just upgraded at around 12:00 from 1.59 (with the optimisation from https://github.com/librenms/librenms/pull/10792) to 1.60.

Performance is now very bad. Is anyone facing the same issue?

Which module is taking the longest?

Difficult to say

Prod1:

Prod2:

I am debugging which commit causes the performance issue.

Performance is better after downgrading to 1.59, but it did not go back to the previous values.

I also tried reverting to a database backup from just before the upgrade. No improvement.

How do I downgrade composer?