One of the nice things about Nagios is being able to set host dependencies. Say you have a site-to-site VPN to your branch office with several devices behind it. If the VPN goes down, you get dozens of notifications. If LibreNMS knew that the main device was down, I’d get a notification only for the VPN router, not for the devices behind it.
We already have this on the list: https://github.com/librenms/librenms/issues/3293 but there are no plans yet to implement it.
This would be very helpful. I’ve found in certain situations, getting alerted for 10 hosts because of 1 device can be difficult.
It would be nice… but somebody has to code it up.
I would like to request this as well and see if we could garner support to raise its priority in the dev cycle.
The critical driver these days is VMs. Dozens or more VMs on a single host and hundreds on a single chassis, not to mention the virtualization of networking, could make living without dependency management highly impractical.
What can we do to raise the priority of this issue?
This is a big need for my org. When a circuit goes down we get hundreds of devices alerting when we really only want to see the alerts for the circuit itself. Who can we send some money to in order to get implementation going?
I just wanted to share how I’ve dealt with this issue. We have multiple locations, and we have actually deployed an NMS at each location. However, we only have 1 device added for a remote location. This way, when something does happen, you only get 1 alert, but the remote devices at least keep logging/reporting their local data.
Now that’s a rather expensive approach, but it’s something that works for me.
Seeing the responses on this thread I felt obliged to create a PR for this feature:
Anyway, it would be great if you guys could test it and let me know if you find any bugs or want something changed (logic, icons, GUI, you name it).
@aldermir_a Posting here upon request instead of PR comments.
Some issues we’ve seen trying to implement this in a production setup…
Multi-layer dependencies are not working. If we have a switch (switchA) that has a dependency on a router in LibreNMS, and we then try to add switchB and switchC as child dependencies of switchA, it does not work in our testing. SwitchB and SwitchC alert as down instead of showing “skipped due to parent down”.
Alert templates. It’d be a nice feature to include in a host down/up template the child devices that are affected by the downed parent.
Mass-associating children with parents in the WebUI is hard when you have a lot of devices. When you clear the parent in the modal, it should remove all the other entries in the children box. It is very time-consuming to click delete on each one of those.
Thanks for your work thus far on this though.
^ This, definitely this. And I’ll echo the thanks on the work so far implementing the feature.
To delete an association, you can select None from the parent and then add the children to be cleared. I’ve also sent a new PR to delete all the children of selected parents. Basically this adds another tab to “Manage host dependencies” modal.
For multi-layer: yes, what we do is check whether there’s an association and, if the parent is down, just skip the alerts. I feel you. I think this feature should be multi-layer proof, but we need to build it so that it doesn’t have a performance impact or cause an infinite loop or something. Open to suggestions or PRs.
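To make the multi-layer idea a bit more concrete, here is a minimal sketch in Python. It is not LibreNMS’s actual code (which is PHP), and every name in it (`has_down_ancestor`, `parent_of`, `is_up`, `max_depth`) is hypothetical. It assumes each device has at most one parent and that the parent mapping and up/down status are already loaded in memory. It walks up the parent chain, stops at the first down ancestor, and uses a visited set plus a depth cap so a misconfigured loop (A depends on B, B depends on A) cannot spin forever.

```python
from typing import Dict, Set


def has_down_ancestor(
    device: str,
    parent_of: Dict[str, str],   # hypothetical: child hostname -> parent hostname
    is_up: Dict[str, bool],      # hypothetical: hostname -> last known status
    max_depth: int = 32,         # assumed cap to bound the work per device
) -> bool:
    """Return True if any ancestor of `device` is down."""
    seen: Set[str] = {device}
    current = parent_of.get(device)
    depth = 0
    while current is not None and depth < max_depth:
        if current in seen:
            # Dependency loop detected: give up instead of looping forever.
            return False
        if not is_up.get(current, True):
            # An ancestor is down, so the alert for `device` could be skipped.
            return True
        seen.add(current)
        current = parent_of.get(current)
        depth += 1
    return False


# Example topology from this thread: router <- switchA <- switchB, switchC
parent_of = {"switchA": "router", "switchB": "switchA", "switchC": "switchA"}
is_up = {"router": False, "switchA": False, "switchB": False, "switchC": False}

to_alert = [d for d in is_up if not is_up[d] and not has_down_ancestor(d, parent_of, is_up)]
print(to_alert)
```

With this topology only the router would be left in `to_alert`. Each device costs at most `max_depth` dictionary lookups, so the performance impact should stay small, but how this would map onto the real alerting code is left open.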