Hi everyone, I’ve been searching but I can’t find much in the way of other people doing this. I need to build a LibreNMS environment with multiple poller nodes, but for redundancy, I’d like to run the web UI on at least 2 of those nodes.
From what I can see, there’s no reason why that isn’t possible, as long as the database, Redis, memcached, rrdcached etc. sit in a centralised location accessible to all nodes and the RRD storage is available over NFS in case devices are renamed.
Are there any other best practices or limitations I should be aware of? E.g.:
Are there potential issues if a user makes changes in one web UI and another makes changes on the other?
Will it be possible to use either of them interchangeably or is there a risk that something gets out of sync? With the DB etc. accessible to both I’d think not.
Are there failure scenarios to be aware of?
I’d like some advice from anyone who’s done it successfully before.
I am currently doing this on all related services. I am N+2.
Using MariaDB Galera Cluster managed by MaxScale.
Three separate containers each of RRD, Redis and Redis replica, which are in a Sentinel group, plus 3 web servers. They are all accessed through HAProxy. There are 3 HAProxy containers for each service. All the HAProxy instances use FRR (BGP and IS-IS) to sync on an anycast address. Then I have 9 bare-metal pollers that are beasts. All my configs point to the anycast address of the HAProxy for the service needed. I have to say it works very well and fast. Also, if you have not tried MaxScale from MariaDB, I highly recommend it. Let me know if you would like more info.
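For anyone building something similar, here is a minimal Python sketch of a check that each shared backend answers on its load-balanced address from a given node. It is not taken from the setup described above: the address is a documentation placeholder, the ports are common defaults (3306 for MariaDB, 6379 for Redis, 42217 as often used for rrdcached), and the function names are hypothetical.

```python
import socket

# Hypothetical endpoints: placeholder address and default ports, not values
# from the setup described in this thread.
BACKENDS = {
    "mariadb": ("192.0.2.10", 3306),
    "redis": ("192.0.2.10", 6379),
    "rrdcached": ("192.0.2.10", 42217),
}


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_backends(backends, timeout=2.0):
    """Map each backend name to whether its endpoint accepted a connection."""
    return {
        name: is_reachable(host, port, timeout)
        for name, (host, port) in backends.items()
    }
```

Calling `check_backends(BACKENDS)` on each web and poller node returns a dict of name to True/False. A TCP connect only shows that something is listening on the address, not that the service behind it is healthy.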
Wow, that’s quite the setup. How many devices are you monitoring?
Have you found any issues with accessing and presumably making changes on more than one web node? Have you experienced any failures? If so, was there any split brain to clear up?
I haven’t seen any split brain yet. I use Redis on the back end, and the web servers are all pointed at the same IP for the DB, which is MaxScale. I also have it set so that the first 50 connections to the web servers have to go to the prime web server, but if its resource load is above 80% then connections are sent to the others. There are over 1,500 devices right now.
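The routing rule described above can be read roughly as follows. This is only a sketch of one interpretation in Python, not the actual HAProxy configuration: the names are hypothetical, and the round-robin fallback across the other web servers is an assumption.

```python
from itertools import cycle

PRIME_CONN_LIMIT = 50    # "first 50 connections" from the post
PRIME_LOAD_LIMIT = 0.80  # "above 80%" from the post


def make_router(prime, others):
    """Return a function that picks a web server for each new connection."""
    fallback = cycle(others)  # round-robin over the others (an assumption)

    def route(prime_connections, prime_load):
        # Prefer the prime server while it has a free slot and is not overloaded.
        if prime_connections < PRIME_CONN_LIMIT and prime_load <= PRIME_LOAD_LIMIT:
            return prime
        return next(fallback)

    return route
```

For example, with `route = make_router("web1", ["web2", "web3"])`, a call such as `route(10, 0.35)` picks `web1`, while `route(10, 0.90)` falls through to one of the other servers.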