I recently set up a new server on Ubuntu 20.04 with an up-to-date LibreNMS install and migrated the data from a previous server to it. I managed to get everything working as it was before, except for a few servers that are having multiple issues:
These Linux-based servers are not detected properly. They have 4x40Gbps adapters that are each detected as 4.3Gbps for some reason, as can be seen in the screenshot below.
I have tried manually changing the speeds to the correct value, but at the next poll the values revert to 4.3Gbps and the wrong traffic is reported for the server.
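One thing I noticed: 4.3Gbps is very close to the largest value a 32-bit SNMP gauge can hold. The standard IF-MIB `ifSpeed` object is a 32-bit gauge in bits per second and is meant to report its maximum when the link is faster than that, with `ifHighSpeed` (in Mbps) carrying the real value. I have not confirmed that this is what is happening on these hosts, so the sketch below only shows the arithmetic; the function names are made up for illustration.

```python
# Assumption (not verified on these servers): the 4.3Gbps figure comes from the
# 32-bit IF-MIB ifSpeed gauge saturating, instead of ifHighSpeed being used.

MAX_GAUGE32 = 2**32 - 1  # largest value a 32-bit SNMP gauge can report


def reported_if_speed(actual_bps: int) -> int:
    """Hypothetical helper: what a saturating 32-bit gauge reports for a link speed."""
    return min(actual_bps, MAX_GAUGE32)


def format_gbps(bps: int) -> str:
    """Hypothetical helper: render bits per second as Gbps with one decimal."""
    return f"{bps / 1e9:.1f}Gbps"


actual = 40 * 10**9                  # a 40Gbps adapter
capped = reported_if_speed(actual)   # 4294967295

print(format_gbps(actual))  # 40.0Gbps
print(format_gbps(capped))  # 4.3Gbps
```

If that is the cause, comparing `ifSpeed` and `ifHighSpeed` from an snmpwalk of one of these servers should show the mismatch.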
Sometimes, out of nowhere, the servers suddenly have no ports associated with them, and the CPU and memory graphs stop working altogether!
I have tried removing these devices completely via the GUI and also from the DB and then re-adding them, but the behaviour stays the same.
Sometimes they are reported with 0 ports and no CPU or memory graphs/info; at other times the ports are detected, but with the wrong speed and wrong traffic data, while CPU and memory are working, for example.
So the behaviour of these servers is very strange. They are the only devices we have with “mellanox technologies mt27700 family connectx-4” adapters, so I thought maybe there is an issue between the newest LibreNMS build and these devices?
I did try adding these devices to another LibreNMS server (build shown below). They actually started out behaving properly, with all interfaces detected correctly at 40Gbps and the memory, CPU and total traffic graphs all working as expected, but after 30 minutes or so they showed the same behaviour as on my LibreNMS server.
We are monitoring these servers via Cacti as well, and that continues to work as expected…
I would appreciate any kind of help in resolving these issues. Thank you in advance!