I’m a relatively new (and very impressed) user of LibreNMS. I’ve spent about a week setting it up and fine-tuning it on our network, and I’ve run into a strange issue.
Three of our 36 switches (mostly D-Link) have been “crashing” when polled by LibreNMS. By crashing I mean that they continue to switch layer 2 traffic normally, but they completely stop responding to pings, SNMP queries and the web interface. Basically anything other than layer 2 switching stops working. I haven’t verified it, but I think features like LLDP also stop working.
Once this happens a hard power cycle of the device is required to get it responding again, and so far I haven’t found a configuration for the switch that avoids this problem.
Two of the switches are the same model - D-Link DGS-1210-24, and one is a much older one - DGS-1224T. All three are running the latest available firmware, which unfortunately is very old now. (Strangely, we have other DGS-1224Ts which are not crashing. Presumably they are on different firmware versions, but I haven’t checked.)
After a while I realised it is not the regular 5 minute SNMP polls which are crashing them, but the 6 hourly Autodiscover. I worked this out because the crashes were 6 hours apart and the “last discovered” time was 5 minutes before the “last heard from” time…
So as a workaround I’ve individually excluded them from being re-discovered with lines such as:
$config['autodiscovery']['nets-exclude'] = '10.0.2.19/32';
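With more than one switch excluded, a plain `=` assignment repeated on several lines would just overwrite the previous value in PHP, so a rough sketch of the config.php fragment would append to a list instead. This assumes LibreNMS treats `nets-exclude` as a list of CIDR strings, which is my understanding of the Auto-Discovery docs rather than something I have confirmed. Only 10.0.2.19 is a real address here; the other two are made-up placeholders.

```php
<?php
// Sketch of the relevant part of config.php.
// Assumption: 'nets-exclude' is a list of CIDR strings, one entry per excluded range.
// The [] form appends an entry instead of replacing the previous one.
$config['autodiscovery']['nets-exclude'][] = '10.0.2.19/32'; // affected switch mentioned above
$config['autodiscovery']['nets-exclude'][] = '10.0.2.20/32'; // hypothetical placeholder address
$config['autodiscovery']['nets-exclude'][] = '10.0.2.21/32'; // hypothetical placeholder address
```

A /32 mask matches a single management IP, so the rest of the subnet is still discovered as normal.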
Has anyone else seen Autodiscover crashing the management interface on switches?
Is there any way to tell LibreNMS to automatically exclude already discovered devices from “re-discovery”? Is there any advantage to already known devices being rediscovered periodically?
For example, if I replace a switch with a newer model, we would normally configure its management IP to be the same as the one it replaced. Would LibreNMS pick up all the changes (number of ports, model number etc) during a 5 minute poll, or would a full re-discover be necessary to update that info?
(Of course I could manually tell it to rediscover a device after I have replaced it, I don’t need to rely on periodic auto-rediscover)