I have a question about how Nagios service checks get marked as down. From what I’ve been able to tell, LibreNMS looks for “OK:” to show green/good, but I am testing the “check_rbl” plugin, which comes back with an OK (notice there’s no colon after OK), and LibreNMS is tagging it as down.
I ran check-services.php with the -d diagnostic flag and it comes back with an OK, but the only difference on the results line is that this OK doesn’t have a colon immediately after it. I was wondering if that’s what is triggering the down?
I’m just trying to see if it’s something in LibreNMS or if it’s something I need to take up with the guy who wrote the check_rbl plugin. I’ve been using this on Nagios for quite a while without any issue, but as I’ve been porting my Nagios checks over to Libre, this one popped up as a problem child.
I think it’s the actual exit status of the script that we check, i.e. run check_rbl with the params used, then do echo $? and see what is output.
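To illustrate the exit-status idea above: Nagios plugins conventionally report their state through the process exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), so the text can say OK while the exit code says otherwise. The following is only a rough sketch, assuming the exit code is what decides the state; `run_plugin` and the stand-in command are hypothetical names made up for the example, not anything from LibreNMS or check_rbl.

```python
import subprocess
import sys

# Conventional Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv):
    """Run a plugin command; return (exit_code, state, first_output_line).

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    # Any exit code outside 0-3 is treated as UNKNOWN here (an assumption).
    state = NAGIOS_STATES.get(proc.returncode, "UNKNOWN")
    lines = proc.stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    return proc.returncode, state, first_line


if __name__ == "__main__":
    # Stand-in for a real plugin: prints an "OK" message but exits with 2.
    # Replace this list with the real plugin path and its parameters.
    fake_plugin = [
        sys.executable,
        "-c",
        "import sys; print('OK - demo output'); sys.exit(2)",
    ]
    code, state, text = run_plugin(fake_plugin)
    print(f"exit={code} state={state} output={text!r}")
```

If the text says OK but the exit code is non-zero, that mismatch would explain a down. Running the real plugin as the same user the cron job uses would probably make the comparison fairer.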
You should pastebin the output of ./check-services.php -d
The funny thing is, the results are the same whichever way I run the plugin: it comes back OK. Due to the sensitivity of the IP addresses and our company information, I’ve blanked out that data in this screenshot:
Not sure what else to suggest. You could try disabling that cron entry, running the update, and seeing if it stays green. If it doesn’t, something else is updating the db.
Well, I know that I’ve run the commands specified in the installation document, because I missed them initially and posted about it a week or two ago.
I just turned off SELinux and rebooted to see if that would make any difference, but so far, nothing.
I also moved the .ini file into the /opt/librenms folder in case there’s some permissions issue, but again, nothing. It is still erroring out, but when I run ./check-services.php -d manually, it comes back okay.
I also tried chmod 777 on the ini file to make sure the world had access to it in case that was the problem, but it’s still throwing the same error.
Surprise… I got it working.
It seems it’s something to do with the check_rbl plugin. The latest version seems to have an issue with the way LibreNMS polls the service, because as soon as I put in a much older version of the plugin, it worked fine.