Hi All,
We have moved from a mix of Nagios and Observium at work to LibreNMS. After setting it up at 6 sites plus a master (all bar the master are behind NAT, with no VPNs between them for operational reasons) and going through the setup of distributed polling, which I have found to be not so reliable in our setup, I started to wonder if there might be a cleaner way.
Currently we have a number of devices that can be seen via the existing master, and these can continue to be polled by the master in the current fashion. This proposal is mainly aimed at the remote sites; however, I would envisage it also allowing polling of local devices to be offloaded to one or more other servers, as if they were a remote site.
These are my thoughts on a possible replacement and I welcome any suggestions or criticisms on them.
- Add a method of creating pollers with some sort of unique per poller token.
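A rough sketch of how a per-poller token could be created and checked, using only the Python standard library. The function names are hypothetical and nothing here is existing LibreNMS code; it assumes the master stores only a hash of the token and shows the token itself once, when the poller is created.

```python
import hashlib
import hmac
import secrets


def create_poller_token():
    """Return (token, token_hash); only the hash would be stored by the master."""
    token = secrets.token_hex(32)  # 64 hex characters, handed to the poller once
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, token_hash


def token_matches(presented, stored_hash):
    """Check a token presented by a poller against the stored hash."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(presented_hash, stored_hash)
```

Storing a hash rather than the token means a leaked database does not directly expose working poller credentials.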
Devices
Add an endpoint on the API something like /poller/devices
This will produce all of the JSON required for a poller to poll the devices assigned to it.
The poller would then pass the poll data back to an endpoint like /devices/:device_id/result,
where it would be parsed and logged in LibreNMS like a normal poll would be.
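To make the device flow concrete, here is a minimal sketch of the two requests a remote poller would build. The endpoints are the proposed ones and do not exist yet; the base URL is a placeholder, and sending the poller token in an `X-Auth-Token` header (as the existing API does for user tokens) is an assumption, as is the shape of the result JSON.

```python
import json
import urllib.request

API_BASE = "https://librenms.example.com/api/v0"  # placeholder master URL


def build_device_list_request(token, base=API_BASE):
    """GET /poller/devices: ask the master which devices this poller should poll."""
    return urllib.request.Request(
        f"{base}/poller/devices",
        headers={"X-Auth-Token": token},
        method="GET",
    )


def build_result_request(token, device_id, result, base=API_BASE):
    """POST /devices/:device_id/result: hand the collected poll data back."""
    body = json.dumps(result).encode("utf-8")
    return urllib.request.Request(
        f"{base}/devices/{device_id}/result",
        data=body,
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
```

A poller would pass each request to `urllib.request.urlopen()`. Because every connection is opened outbound from the remote site, this should work from behind NAT without a VPN.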
Services
Services would operate similarly, using an endpoint something like /poller/services
The poller would use that data to run the checks against the relevant device using the Nagios plugins installed on the distributed poller.
This data would then be returned via an endpoint similar to /services/:service_id/result
Relevant alerts would then be set up to flag any pollers, devices or services that have not reported in within a given period.
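The service side could look roughly like the sketch below: run a plugin, map its exit code to the standard Nagios states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and on the master side pick out anything that has gone quiet. The result field names and both function names are assumptions for illustration only.

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command, timeout=30):
    """Run one plugin command (a list of arguments) and return a result dict."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        code, output = proc.returncode, proc.stdout.strip()
    except (subprocess.TimeoutExpired, OSError) as exc:
        # A plugin that is missing or hangs is reported as UNKNOWN.
        code, output = 3, f"check did not run: {exc}"
    return {
        "status": code,
        "state": NAGIOS_STATES.get(code, "UNKNOWN"),
        "message": output,
        "checked_at": int(time.time()),
    }


def stale_ids(last_seen, now, max_age):
    """Return the ids whose last report is more than max_age seconds old."""
    return [i for i, ts in last_seen.items() if now - ts > max_age]
```

The dict returned by `run_check` is what would be sent to /services/:service_id/result, and `stale_ids` is the sort of test an alert rule for silent pollers, devices or services would need.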
This is a very rough process, but I feel it would allow expandability and allow users to potentially write their own pollers or come up with ingenious methods of using these endpoints.
I would also like to investigate a way to set services as “passive”, meaning they would be skipped by the existing polling methods. This would allow services that want to report their own status back to LibreNMS to do so via the above API, for instance backup jobs reporting that they have completed.
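A small sketch of the passive idea, assuming each service record carries a hypothetical boolean `passive` field and that a passive result reuses the Nagios exit codes. None of these names exist in LibreNMS today.

```python
def services_for_active_polling(services):
    """Drop services flagged passive so the normal poller skips them."""
    return [s for s in services if not s.get("passive", False)]


def build_passive_result(exit_code, message):
    """Build the body a job (e.g. a backup script) would send for its own service."""
    if exit_code not in (0, 1, 2, 3):
        raise ValueError("exit_code must be a Nagios state between 0 and 3")
    return {"status": exit_code, "message": message}
```

A backup script would then send something like `build_passive_result(0, "backup completed")` to /services/:service_id/result when it finishes, and a missing report would be caught by the same "not reported on in a period" alert as any other service.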
After some discussion I would be looking to develop these API endpoints as well as the pollers.
Many Thanks
Dub