We added about 500 devices yesterday and probably won't add any more, but the CPU is now hitting 100%. What can we do? I could raise the CPU count to 10 or 12, I guess, but can you run an additional poller?
We used to run Solarwinds with a second poller that simply reported back to the main Solarwinds server, which held the DB.
You are hitting the limit. You need to configure distributed pollers for this.
For example, we monitor 2.5k devices with 10 pollers plus MySQL.
Every poller has a minimum of 8 CPUs and 8 GB of RAM. https://docs.librenms.org/Extensions/Distributed-Poller/
You can clone the VM and reconfigure the clones as pollers.
Just note that a poller doesn't need a MySQL server installed locally, but it does need one to connect to.
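As a rough sizing aid, the figures above (2.5k devices on 10 pollers) work out to about 250 devices per poller. The sketch below just does that arithmetic; the 4000-device target is a made-up example, and real load depends heavily on device types and port counts, so treat the result as a starting guess rather than a rule.

```python
import math


def devices_per_poller(devices: int, pollers: int) -> float:
    """Average number of devices each poller handles."""
    return devices / pollers


def pollers_needed(devices: int, per_poller: float) -> int:
    """Round up so that no poller exceeds the target load."""
    return math.ceil(devices / per_poller)


# Figures from the post above: 2.5k devices across 10 pollers.
ratio = devices_per_poller(2500, 10)

# Hypothetical target: pollers needed for 4000 devices at the same ratio.
print(ratio, pollers_needed(4000, ratio))
```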
Did you run mysqltuner and apply any of the other tuning recommendations, like adjusting polling threads? When I had a single instance, polling was taking 500 seconds. After running the mysqltuner script and making the recommended changes, I cut my poller time in half.
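To put those numbers in context, here is a small sketch that compares a poll time against the polling interval. It assumes a 300-second (5-minute) interval, so adjust `POLL_INTERVAL_S` if yours differs; the 250-second figure is simply half of the 500 seconds mentioned above.

```python
POLL_INTERVAL_S = 300  # assumption: 5-minute polling interval


def interval_usage(poll_time_s: float, interval_s: float = POLL_INTERVAL_S) -> float:
    """Fraction of the polling interval consumed by one full poll run."""
    return poll_time_s / interval_s


def fits_interval(poll_time_s: float, interval_s: float = POLL_INTERVAL_S) -> bool:
    """True if a poll run finishes before the next one is due."""
    return poll_time_s < interval_s


for label, seconds in [("before tuning", 500), ("after tuning", 250)]:
    usage = interval_usage(seconds)
    print(f"{label}: {seconds}s, {usage:.0%} of interval, fits={fits_interval(seconds)}")
```

Anything that does not fit means the next poll run is due before the previous one has finished.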
You can also make two full clones and implement a three-node Galera cluster with Redis and rrdcached to get a little fault tolerance.
Since I transitioned to a 5-node poller setup, I have doubled the number of devices I am polling, and I can now poll around 2k devices with the longest poller taking 58 seconds to complete.
For mysqltuner, the database is supposed to have been running for at least 24 hours first. Copy the script text into an empty file and name it whatever you like. Then chmod +x the file and run it.
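If you would rather script the "chmod +x and run it" part, a minimal sketch could look like the following. The file name in the usage note is hypothetical (use whatever you saved the script as), and `make_executable` sets the execute bit for user, group and others, which is roughly what `chmod +x` does.

```python
import os
import stat
import subprocess


def make_executable(path: str) -> None:
    """Add the execute bits to the file, roughly like `chmod +x path`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def run_script(path: str, *args: str) -> str:
    """Run an executable file and return whatever it wrote to stdout."""
    result = subprocess.run(
        [os.path.abspath(path), *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return result.stdout
```

Usage would be something like `make_executable("mysqltuner.pl")` followed by `print(run_script("mysqltuner.pl"))`, run on a host that can reach the database.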
I had a similar issue. I had a bunch of Raspberry Pi 3B+ boards available after they were decommissioned from a failed project. I set one up as a poller and then cloned the SD card. On each clone I changed the host name and gave the .env file a unique ID for that Pi, then created poller groups and assigned devices to them. It's been working a treat for the last 6 months.
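The per-clone .env change can be scripted so that no two cards end up with the same ID. This is only a sketch: it assumes the ID lives on a line named `NODE_ID` (check your own .env for the actual key), it uses a random UUID as the value, and the sample text is made up for illustration.

```python
import uuid


def with_unique_id(env_text: str, key: str = "NODE_ID") -> str:
    """Return env_text with `key` set to a fresh random ID.

    Any existing `key=...` line is replaced; if there is none, one is appended.
    """
    new_line = f"{key}={uuid.uuid4().hex}"
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.startswith(f"{key}="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical .env contents copied over from the first Pi's card.
cloned_env = "DB_HOST=db.example.net\nNODE_ID=copied-from-first-pi\n"
print(with_unique_id(cloned_env))
```

Run it once per clone and write the result back to that clone's .env before starting the poller.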