Can I use LibreNMS for more than 20,000 devices? How do instances work?

Hello everyone,

My project is to build a monitoring tool for more than 20,000 devices. I would also like to create several users, each with their own monitoring view covering a different subset of those 20,000 devices. To do that, I plan to run a VM in the cloud with a Docker container on it. My question is: how do one or more instances work together? I suspect it would be difficult for a single Docker container to do everything.

Thanks in advance!

Check out the Scaling LibreNMS page in the LibreNMS Docs.

So, if I understand correctly, I must create one poller per group of devices?

You create as many pollers as needed to get through the devices you have. To be honest, I’m not sure Docker on top of a VM will be very efficient for scaling. We used to run our 4-poller setup in OpenStack and it struggled; physical tin worked really well.

You have two questions here:
1. Will LibreNMS scale to 20k devices? Yes, it will.
2. How can I scale it? With distributed polling and the right setup. However, that part is very difficult to answer without knowing what devices you will be polling. You may just have to spin up a single instance, add a load of devices, and see how far you get before adding more resources.


Thanks a lot for your response!
I have some more questions now. These 20,000 devices are spread across different sites (approximately 40) around the country. Can I put one poller at each site to simplify the architecture (and also because I will not have a lot of resources on this Docker/VM)?
Another question: for the best performance, would you suggest running the poller in a VM on a computer at each site? I will be polling different kinds of devices, such as printers, security cameras, Wi-Fi access points, and so on.
A last question, less about architecture: for each site, I would like to create a device group containing all of that site's devices, served by one poller (and so one poller group). After that, I would create different users, and each user must have access only to their own site. Is that feasible?

Thanks in advance for your time!

You can indeed put one poller in each of the locations to poll just the devices close to it (poller groups are your friend here). However, be careful of latency back to the RRDCached and MySQL servers, which have to be central. Latency can kill performance.
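To put a rough number on why that latency matters, here is a minimal sketch. It assumes a poller waits on its round trips to the central services one after another; the round-trip count and RTT values are hypothetical and not measured from LibreNMS.

```python
def added_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Time spent purely waiting on the network, assuming sequential round trips."""
    return round_trips * rtt_ms / 1000.0


# Hypothetical example: 200 round trips to the central servers per device poll.
for rtt_ms in (1, 20, 80):
    wait = added_wait_seconds(200, rtt_ms)
    print(f"RTT {rtt_ms:>2} ms -> about {wait:.1f} s of waiting per device")
```

The point is only that the wait grows linearly with RTT, so a poller that is close to its devices but far from MySQL and RRDCached can still end up slower overall.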

I’m worried about your comment on not having a lot of resources for the VM/Docker. What do you mean by that, and how many devices on average will you be polling in each location?

As for users, yes, you can create device groups per poller group and assign those device groups to a user. As an FYI, that’s currently in beta though.

How can I improve the infrastructure to have the least possible latency between my pollers and the central server with the RRDCached and MySQL servers?

At each location I will have between 100 and 500 devices to monitor. In your opinion, is the easiest configuration to run the poller directly on the VM or to use Docker? And how much RAM and disk would it need?

You should pick VM or Docker based on your ability to support either. Docker will give you an easier upgrade path longer term.

Latency is latency, you can’t change it when it’s based on distance.
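As a rough illustration of that floor: light travels through optical fiber at roughly 200,000 km/s (about two thirds of its speed in vacuum), so the path length alone sets a minimum round-trip time. A minimal sketch, with hypothetical path lengths and ignoring switching and queuing delay, which only add to it:

```python
FIBER_KM_PER_S = 200_000  # approximate speed of light in glass fiber


def min_rtt_ms(path_km: float) -> float:
    """Lower bound on round-trip time in milliseconds for a fiber path of path_km."""
    return 2 * path_km * 1000 / FIBER_KM_PER_S


# Hypothetical fiber path lengths between a remote poller and the central servers.
for km in (50, 500, 8000):
    print(f"{km:>5} km -> at least {min_rtt_ms(km):.1f} ms RTT")
```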

Couldn’t the latency be improved even if I use dark fiber?

Best practice is to have everything (rrdcached, pollers, mysql, dns, etc.) in one place. From there you can poll your devices.

  1. You need to use remote pollers if some devices are not reachable directly (NAT is one possible reason).
  2. There are some slow devices (normally very old switches) that need a lot of time to be polled. If you’re doing 5-minute polling (the standard), make sure that’s achievable. You can do some tweaks for some devices.
  3. DNS: make sure it is very close to your LibreNMS servers.
  4. I would skip Docker; you will need a lot of resources. How much also depends on whether you have 48-port switches with a lot of other sensors or just servers with a few metrics.
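For point 2, one way to sanity-check the 5-minute interval is to compare per-device poll durations against it. This is a minimal sketch with made-up hostnames and durations; where the real numbers come from (the web UI, the database, or logs) is left to you.

```python
import math

POLL_INTERVAL_S = 300  # 5-minute polling


def slow_devices(durations: dict[str, float], interval_s: int = POLL_INTERVAL_S) -> list[str]:
    """Hostnames whose last poll took longer than one polling interval."""
    return sorted(host for host, secs in durations.items() if secs > interval_s)


def min_workers(durations: dict[str, float], interval_s: int = POLL_INTERVAL_S) -> int:
    """Lower bound on concurrent workers: total poll seconds divided by the interval, rounded up."""
    return math.ceil(sum(durations.values()) / interval_s)


# Hypothetical sample data: hostname -> seconds taken by the last poll.
sample = {"sw-old-01": 340.0, "sw-core-01": 45.0, "printer-07": 3.5, "ap-12": 8.0}
print(slow_devices(sample))  # ['sw-old-01']
print(min_workers(sample))   # 2
```

The worker count is only a floor. It assumes perfect packing with no idle time, so in practice you would want headroom on top of it.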

I’m currently polling close to 3k devices (mostly switches) with 6-7 pollers with a load of 20-30% per poller. Our devices are all over the world, from China, Europe, South Africa, US, Brazil and Mexico.

We did a small test, putting a remote poller in Miami, and we discovered that it is slower than polling directly from Frankfurt (where our LibreNMS is located).

If you have other questions, I’m here to answer them.

I second this. I had 9 pollers, 4 in the US and 5 in Europe. I moved all polling to Europe, where I also have my main server, removing the 4 US pollers and adding 2 Europe pollers. That reduced my total polling time from 130k seconds to 80k seconds.
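For a sense of scale, here is the arithmetic on those two figures. It assumes "total polling time" means the per-device poll durations summed over one cycle, and that the cycle is the standard 5 minutes; neither is stated in the post.

```python
import math

INTERVAL_S = 300  # assumed 5-minute polling cycle


def reduction_pct(before_s: float, after_s: float) -> float:
    """Percentage drop in total polling time."""
    return (before_s - after_s) / before_s * 100


def avg_busy_workers(total_poll_s: float, interval_s: int = INTERVAL_S) -> int:
    """Workers that must be busy on average to finish total_poll_s within one cycle."""
    return math.ceil(total_poll_s / interval_s)


before_s, after_s = 130_000, 80_000  # figures quoted in the post above
print(f"reduction: {reduction_pct(before_s, after_s):.1f}%")  # reduction: 38.5%
print(avg_busy_workers(before_s), "->", avg_busy_workers(after_s))  # 434 -> 267
```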