Librenms-service v2 : do not perform polling and discovery at the same time

louis · 25 October 2019 15:58

Hello,

I noticed that polling and discovering a device at the same time causes CPU spikes on some devices (e.g. Juniper QFX5100 and QFX5110). In the log below, polling and discovery of device 91 start in the same second:

Oct 25 02:06:29 librenms-prod1 librenms-service.py[2962]: Poller_0-11(INFO):Polling device 91

Oct 25 02:06:29 librenms-prod1 librenms-service.py[2962]: Discovery_0-8(INFO):Discovering device 91

Oct 25 02:06:43 librenms-prod1 librenms-service.py[2962]: Poller_0-11(INFO):Completed poller run for 91 in 14.85s

Oct 25 02:07:09 librenms-prod1 librenms-service.py[2962]: Discovery_0-8(INFO):Completed discovery run for 91 in 40.74s

In Librenms/service.py, I suggest waiting for polling to finish before starting discovery. Currently both are dispatched back to back:

                for device in devices:
                    device_id = device[0]
                    group = device[1]

                    if device[2]:  # polling
                        self.dispatch_immediate_polling(device_id, group)

                    if device[3]:  # discovery
                        self.dispatch_immediate_discovery(device_id, group)
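
To make the suggestion concrete, here is a minimal sketch of per-device serialization. It is not the actual service code: `run_exclusive` and the lock registry are hypothetical names, and it assumes both jobs run as threads in one process, which may not match how the service dispatches work across workers or nodes.

```python
import threading
from collections import defaultdict

# Hypothetical helper, not part of LibreNMS: one lock per device_id.
_device_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()  # protects the lock registry itself


def run_exclusive(device_id, job):
    """Run job(device_id) while holding that device's lock.

    A polling job and a discovery job for the same device_id then run
    one after the other instead of overlapping.
    """
    with _registry_guard:
        lock = _device_locks[device_id]
    with lock:
        return job(device_id)
```

With this, whichever job grabs the lock first runs to completion and the other one waits, so the device never answers both sets of SNMP requests at once.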

murrant · 25 October 2019 23:18

Discovery and polling overlap has always existed within LibreNMS. If you have them wait for each other, you will either (A) never run discovery or (B) miss one or more poller intervals.

Also, some device discovery is significantly longer than the poller interval…
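
As a rough illustration of case B, here is a simplified model, not LibreNMS code. It assumes discovery starts exactly on a poll boundary, polling is blocked until discovery finishes, and the poll interval is 300 seconds; `missed_polls` is a hypothetical name.

```python
def missed_polls(discovery_seconds, poll_interval=300):
    """Count poll slots lost while polling waits for one discovery run.

    Simplified model: discovery starts on a poll boundary and the
    blocked polls collapse into a single late poll afterwards.
    """
    return int(discovery_seconds // poll_interval)
```

In this model a discovery run shorter than the interval only delays one poll, while a run longer than the interval drops at least one data point.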

Higher CPU usage is expected and shouldn’t have any effect on device throughput unless you already have an issue.

louis · 28 October 2019 10:06

@murrant
Thank you for your response.

We set this parameter as a precaution to avoid CPU spikes:

$config['os']['junos']['snmp_max_oid'] = 1;

murrant · 28 October 2019 21:34

Sounds like you may be running into sampling bias…