Disable SNMP check

Gorian · 16 September 2017 04:37

Basically, while I get the idea of wanting as many things as possible to be done without user interaction, the idea that I have to use a different service to monitor a subset of my infrastructure would become a major downside to LibreNMS compared to other products.

murrant · 16 September 2017 04:38

So, you just want to set up Nagios for them without Nagios?

Gorian · 16 September 2017 04:39

But, why should I have to run LibreNMS AND Nagios in order to monitor all of my infrastructure? Why can’t I just do it all within LibreNMS? Otherwise, people would find a product that does it all, instead of having to run multiple products at the same time.

murrant · 16 September 2017 04:40

That’s fine, I was just trying to clarify, and I was being a bit cheeky.

Gorian · 16 September 2017 04:43

Fair enough

I get what you are saying, but I have hundreds of devices that will be auto-discovered and set up, and tens of services that can’t be monitored by ICMP or SNMP. I’d love to just be able to do it all within LibreNMS and not have to sift through 20 “down” servers when trying to find devices that really are down. And if we create separate OSes rather than just a “ping” OS, then in the future we could even accept an instance ID as an alternative to a hostname when adding a host, if AWS integration is enabled, which would be in keeping with the core principles, no?

florianbeer · 16 September 2017 08:04

Wouldn’t it be an option for you to add the service checks to another device? E.g. the server that uses the database also has the Nagios checks for MySQL.

laf · 16 September 2017 20:58

For me it’s not about having multiple ICMP-only OSes; it’s about using the normal OSes but allowing people to add them without having SNMP enabled. Why would you want to do that? Personally, no idea. Why would you want to add any non-SNMP devices to an SNMP-centric system? But people are asking for it, so if we do allow non-SNMP, don’t restrict it to a single OS called ping; allow users to still pick an OS so the usual settings from the OS definition show.

laf · 16 September 2017 20:59

To confirm, I’m not actually overly bothered about this option; it’s of no use or benefit to me. But I can already envisage that if you do a ping-only OS, you will instantly get people asking to set the logo and other text options. At that point, just selecting the correct OS fixes all of that.

Gorian · 16 September 2017 21:13

Not really.

It will get confusing if you have one server that has 4+ different MySQL checks.
If I get an alert email for the MySQL check monitoring DB1, but the alert email says “generic-placeholder-server has a mysql service down”, I have to go track down what’s actually down.
See my reply above. How will you attach, say, 10 instances’ worth of CloudWatch data to a single device? Even if possible, that’s just WAY too cluttered.

All in all, it’s much better to just add a separate “device” for each instance. That gives clean alerts and makes it easy to see the status of specific servers or services, and it is future-proof for new LibreNMS features, like integrating CloudWatch (or something else). It will be better to do it the “right way” now, rather than half-ass it and have to redo it later.

@laf - as to why I would want to: say that I have 500 devices that ARE monitored by this SNMP-centric system, which has all my alerting set up and everything just the way I want it, with AD integration, etc. Then I have 10 devices that are production, mission critical, but have no SNMP or ICMP. Wouldn’t it make sense to want to just add them to your existing monitoring service, instead of architecting, designing, testing, staging, presenting, demoing, and then moving to production a SECOND monitoring system, which will also cost more money, just to monitor those 10 servers? Why not just add them to your existing monitoring system instead?

murrant · 17 September 2017 23:23

I can still see this happening with high frequency:

I added a device, and set the OS, where are all my graphs?

Gorian · 18 September 2017 02:05

Well, ideally it’d be an “opt-in” sort of thing, not an opt-out, i.e. you’d have to go out of your way to both change the OS and disable SNMP, and we can even have some sort of warning if you want: “Warning: This feature may have unintended side-effects, may not graph what you want, yadda yadda”.

florianbeer · 18 September 2017 07:28

Well, but LibreNMS is an SNMP monitoring system and that’s what we all use it for. If you want to integrate a proprietary protocol (is CloudWatch an Amazon product? I’ve never heard of it), then it would mean a pretty big rewrite. Nothing is stopping you from doing it, since LibreNMS is open source, but it still deviates quite a bit from what it does right now.

Feels a bit like wanting to have your microwave be able to cut potatoes all of a sudden.

aldemir_a · 18 September 2017 14:27

I beg to differ. Many people (like me) use LibreNMS as their main monitoring platform, and I don’t enjoy adding another program to monitor my infrastructure. So while not ideal, we should introduce WMI support, ping-only devices, etc. SNMP is the main thing for LibreNMS, which is fine, but in the real world not every device uses SNMP, and we need a good program like LibreNMS to support these oddballs.
Maybe ping-only devices should have their own submenu or something, I don’t know. The best way is to discuss it here and try to come up with the best solution.

Just my two cents

YukonRob · 18 September 2017 15:17

Corner case here, but with the new ‘connected world’ this is a growing area. I look after all of the technology at a mine, and while ensuring that the phones and internet work at locations dotted over 200 km² is my primary concern, I also need to monitor all of the other infrastructure (weather stations, PLCs, treatment plants, security cameras, fuel cardlocks, etc.) for network connectivity. A majority of these do not have SNMP (or it is prohibitively hard to set up), but they do respond to pings. Having one monitoring platform would allow me to quickly and easily identify the locations of either communication or power failures, because most of these remote sites only alarm via email, which is only useful if they can communicate.
I have done some of this by using services, but having each one as its own device would make troubleshooting much easier.

laf · 18 September 2017 17:13

You can say that about anything we provide

Although I don’t actually think we will; people can force-add devices with invalid SNMP creds and we rarely see questions about that.

Gorian · 18 September 2017 21:15

Right, but moreover, it’s a MONITORING platform. It seems to me that we aim to compete in the same space as Nagios, Observium, Zabbix, etc. In a nice, idealistic world, I’d love to have an environment where everything is hardware in my own datacenter, everything has SNMP, and it’s all monitored at the push of a button. If I were the CTO/CIO of the company I work for, I might push for that. But as it stands, I’m the DevOps Engineer in charge of infrastructure that was handed to me, and trying to move a cloud product with 90% of its infrastructure in AWS OUT of it is like pushing a boulder uphill. I can’t just deploy whatever I want, nor however many systems I want. I can’t just set up a Nagios/Zabbix/whatever else BESIDE LibreNMS, because it’s not that simple.

It’s a production system, supporting millions of users, and I doubt that I’m the only one with such a use case. The reality is, you pick one system that does what you need it to. If LibreNMS did most of it, but not all, the response I would get is not “well, let’s just run LibreNMS AND Nagios. Oh, also pay for X monitoring platform too”. Instead, we would pick the product that can handle all of our needs.

The smart thing to do is to make sure that our product will be the one that people pick because it can handle these things; otherwise other people in my situation will choose a different product entirely. Saying “Well, you can just spend another 6 months to create a new proof of concept, learn how to eliminate all single points of failure to handle n servers down for 24/7 fault-tolerant monitoring, architect and design how to deploy it into the existing infrastructure, create all the pretty powerpoints and presentations to appease management, document everything, deploy to staging, get buy-in from everyone, and finally deploy to production” - this isn’t a realistic solution.

Yeah. It’s Amazon’s monitoring platform for their own services, but it is fairly complicated, doesn’t support non-AWS services, etc. Best practice is to ingest its data into your own monitoring system so you can use it alongside your monitoring of everything else.

Gorian · 18 September 2017 21:16

Yeah, exactly. Having a device that just displays as “up” but has no ICMP or SNMP, attaching service checks to it, and then getting service check alerts as normal is the easiest, and seemingly best, way to move forward with this.
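A minimal sketch of that idea, in Python purely for illustration (LibreNMS itself is written in PHP, and every name here, including `Device`, `ServiceCheck`, `snmp_disabled`, and `icmp_disabled`, is hypothetical rather than taken from its codebase): a device with both probes disabled is shown as “up”, and service alerts carry the hostname of the device the check is attached to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceCheck:
    """Hypothetical service check result (illustrative names only)."""
    name: str
    ok: bool


@dataclass
class Device:
    """Hypothetical device record; not LibreNMS's actual schema."""
    hostname: str
    snmp_disabled: bool = False
    icmp_disabled: bool = False
    services: List[ServiceCheck] = field(default_factory=list)

    def is_up(self, ping_ok: bool = False, snmp_ok: bool = False) -> bool:
        # With both probes disabled there is nothing to poll, so the
        # device is simply displayed as "up".
        if self.snmp_disabled and self.icmp_disabled:
            return True
        if self.icmp_disabled:
            return snmp_ok
        if self.snmp_disabled:
            return ping_ok
        return ping_ok and snmp_ok

    def alerts(self) -> List[str]:
        # Each alert names the device the check is attached to, so the
        # message points at the real server, not a placeholder host.
        return [
            f"{self.hostname}: service {s.name} is down"
            for s in self.services
            if not s.ok
        ]


# Usage: one "device" per instance, with service checks attached to it.
db1 = Device(
    "db1",
    snmp_disabled=True,
    icmp_disabled=True,
    services=[ServiceCheck("mysql", ok=False)],
)
db1_alerts = db1.alerts()
```

With one device per instance, the alert text names `db1` directly instead of a shared placeholder server.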

laf · 18 September 2017 22:51

I don’t think we need to debate whether we should support more than SNMP. Long term we should, but we are obviously not in a position to do so right now.

Gorian · 18 September 2017 23:02

Sure, but what I’m trying to say is that that PR is a good step forward, and something I can build on for more. So if we make sure that it is done in a way that supports future code additions, I can build on it.

murrant · 19 September 2017 00:26

The thing is, some choices are practically permanent because of the amount of effort it takes to change them without breaking everything.

The main people that pay that price are the maintainers. That is why laf and I are resistant to a quick fix.

The solution needs to be thought out, and this usually involves multiple code and UI iterations.