Basically, while I get the idea of wanting as many things as possible to be done without user interaction, the idea that I have to use a different service to monitor a subset of my infrastructure would become a major downside to LibreNMS compared to other products.
So, you just want to set up Nagios for them without Nagios?
But, why should I have to run LibreNMS AND Nagios in order to monitor all of my infrastructure? Why can't I just do it all within LibreNMS? Otherwise, people would find a product that does it all, instead of having to run multiple products at the same time.
That's fine, just trying to clarify, and I was being a bit cheeky.
Fair enough
I get what you are saying, but I have hundreds of devices that will be auto-discovered and set up, and tens of services that can't be monitored by ICMP or SNMP. I'd love to just be able to do it all within LibreNMS and not have to sit through 20 "down" servers when trying to find devices that really are down. And if we create separate OSes rather than just a "ping" OS, in the future, we could even have an instance ID as an alternative to a hostname when adding a host, if AWS integration is enabled, which would be in keeping with the core principles, no?
Wouldn't it be an option for you to add the service checks to another device? E.g. the server that uses the database also has the Nagios checks for MySQL.
For me it's not about having multiple ICMP-only OSes, it's just using the normal OSes but allowing people to add them without having SNMP enabled. Why would you want to do that? Personally, I have no idea why you would want to add any non-SNMP devices to an SNMP-centric system, but people are asking for it. So if we do allow non-SNMP, don't restrict it to a single OS called ping; allow users to still pick an OS so the usual settings from the OS definition show.
To confirm, I'm not actually overly bothered about this option, it's of no use or benefit to me, but I can already envisage that if you do a ping-only OS, you will instantly get people asking to set the logo and other text options - at that point just selecting the correct OS fixes all of that.
Not really.
- It will get confusing if you have one server that has 4+ different MySQL checks
- If I get an alert email from the MySQL check monitoring DB1, but the alert email says "generic-placeholder-server has a mysql service down", I have to go track down what's actually down
- See my reply above. How will you attach, say, 10 instances' worth of CloudWatch data to a single device? Even if possible, that's just WAY too cluttered.
All-in-all, it's much better to just add a separate "device" for each instance. Not only do you get clean alerts and easily see the status of specific servers or services, but it is also future-proof for new LibreNMS features, like integrating CloudWatch (or something else). It will be better to do it the "right way" now, rather than half-ass it and have to redo it later.
@laf - as to why I would want to. Say that I have 500 devices that ARE monitored by this SNMP-centric system, and I have all my alerting set up, and everything just the way I want it, with AD integration, etc. Then I have 10 devices that are production, mission critical, but no SNMP or ICMP. Wouldn't it make sense to want to just add them to your existing monitoring service, instead of having to architect, design, test, stage, present, demo, and then move to production a SECOND monitoring system, that will also cost more money, just to monitor those 10 servers? Why not just add them to your existing monitoring system instead?
I can still see this happening with high frequency:
I added a device, and set the OS, where are all my graphs?
Well, ideally it'd be an "opt-in" sort of thing, not an opt-out. I.e., you'd have to go out of your way to both change the OS and to disable SNMP, and we can even have some sort of warning if you want. "Warning: This feature may have unintended side-effects, may not graph what you want, yadda yadda".
Well, but LibreNMS is an SNMP monitoring system and that's what we all use it for. If you want to integrate a proprietary protocol (is CloudWatch an Amazon product? I've never heard of it), then it would mean a pretty big rewrite. Nothing is stopping you from doing it - LibreNMS is open source, but still it deviates quite a bit from what it does right now.
Feels a bit like wanting to have your microwave be able to cut potatoes all of a sudden.
I beg to differ. Many people (like me) use LibreNMS as their main monitoring platform. And I don't enjoy adding another program to monitor my infrastructure. So while not ideal, we should introduce WMI support, ping-only devices, etc. SNMP is the main thing for LibreNMS, which is fine, but in the real world, not every device uses SNMP and we require a good program like LibreNMS to support these oddballs.
Maybe ping-only devices should have their own submenu or something, I don't know. The best way is to discuss here and try to come up with the best solution.
Just my two cents
Corner case here, but with the new "connected world" this is a growing area. I look after all of the technology at a mine and while ensuring that the phones and internet work at locations dotted over 200 km² is my primary concern, I also need to monitor all of the other infrastructure (weather stations, PLCs, treatment plants, security cameras, fuel cardlocks, etc.) for network connectivity. A majority of these do not have SNMP (or it is prohibitively hard to set up) but they do respond to pings. Having one monitoring platform would allow me to quickly and easily identify the locations of either communication or power failures, because most of these remote sites only alarm via email, which is only useful if they can communicate.
I have done some of this by using services, but having each one as its own device would make troubleshooting much easier.
You can say that about anything we provide
Although I don't actually think we will; people can force add devices with invalid SNMP creds and we rarely see questions about that.
Right, but moreover, it's a MONITORING platform. It seems to me we aim to compete in the same space as Nagios, Observium, Zabbix, etc. In a nice, idealistic world, I'd love to have an environment where everything is hardware in my own datacenter, and everything has SNMP, and it's all monitored at the push of a button. If I was the CTO/CIO of the company I work for, I might push for that. But as it stands, I'm the DevOps Engineer in charge of infrastructure that was handed to me, and trying to move a cloud product with 90% of the infrastructure in AWS OUT of it is pushing a boulder uphill. I can't just deploy whatever I want, nor how many. I can't just set up a Nagios/Zabbix/whatever else BESIDE LibreNMS because it's not that simple.
It's a production system, supporting millions of users, and I doubt that I'm the only one with such a use case. The reality is, you pick one system that does what you need it to. If LibreNMS did most of it, but not all, the response I would get is not "well, let's just run LibreNMS AND Nagios. Oh, also pay for X monitoring platform too". Instead, we would pick the product that can handle all of our needs.
The smart thing to do is to make sure that our product will be the one that people pick because it can handle these things, otherwise other people in my situation will choose a different product entirely. Saying "Well, you can just spend another 6 months to create a new Proof-of-concept, learn how to eliminate all single points of failure to handle n servers down for 24/7 fault tolerant monitoring, architect and design how to deploy it into the existing infrastructure, create all the pretty powerpoints and presentations to appease management, document everything, deploy to staging, get buy-in from everyone, and finally deploy to production" - this isn't a realistic solution.
Yeah. It's Amazon's monitoring platform for their own services, but it's fairly complicated, doesn't support non-AWS services, etc. Best practice is to ingest its data into your own monitoring system to use it alongside your monitoring of everything else.
Yeah, exactly. Having a device that just displays as "up", but has no ICMP or SNMP, attaching service checks to it, and then getting service check alerts as normal is the easier, and seemingly best, way to move forward with this.
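To make that idea concrete, here is a rough sketch of what such a device could look like. This is plain Python for illustration only, not LibreNMS code: the `Device` and `ServiceCheck` classes, their field names, and the `alerts` helper are all hypothetical, and the probes are stubbed with lambdas instead of real Nagios plugins.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ServiceCheck:
    """A Nagios-style check: a name plus a callable returning True when healthy."""
    name: str
    probe: Callable[[], bool]


@dataclass
class Device:
    """Hypothetical device record; field names are illustrative, not LibreNMS's schema."""
    hostname: str
    snmp_enabled: bool = True
    icmp_enabled: bool = True
    services: List[ServiceCheck] = field(default_factory=list)

    def failed_services(self) -> List[str]:
        # Run every attached check and collect the names of the failing ones.
        return [s.name for s in self.services if not s.probe()]

    def status(self) -> str:
        # With no SNMP and no ICMP there is nothing to poll, so the device
        # itself is always reported as "up"; only its services can alert.
        if not self.snmp_enabled and not self.icmp_enabled:
            return "up"
        raise NotImplementedError("normal SNMP/ICMP polling is out of scope here")


def alerts(device: Device) -> List[str]:
    """Build alert messages that name the real device, not a placeholder host."""
    return [f"{device.hostname} has a {name} service down"
            for name in device.failed_services()]


# Stubbed probes: mysql is failing, https is healthy (assumed values for the demo).
db1 = Device("db1", snmp_enabled=False, icmp_enabled=False,
             services=[ServiceCheck("mysql", lambda: False),
                       ServiceCheck("https", lambda: True)])
print(db1.status())
print(alerts(db1))
```

The point of the sketch is only that the alert text carries the hostname of the thing that is actually broken, which is what you lose when the checks hang off some other server.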
I don't think we need to debate whether we should support more than SNMP; long term we should, but we are obviously not in a position to do so right now.
Sure, but that PR is a good step forward, and something I can build on for more, is what I'm trying to say. So if we make sure that it is done in a way that supports future code additions, I can build on it.
The thing is, some choices are practically permanent because of the amount of effort it takes to change them without breaking everything.
The main people that pay that price are the maintainers. That is why laf and I are resistant to a quick fix.
The solution needs to be thought out, and this usually involves multiple code and UI iterations.