Hi Guys,
I have a lot of service checks using Nagios plugins. As of today I need to set an alert per service check with a statement like %services.service_ip = "x.x.x.x", which means I have to prepare a separate alert for each service with the relevant IP.
Is it possible to use one service check alert for all service checks?
Not sure what you mean by that. ^^^
But you could use this alert rule to check if a service is not working.
So if I use this service check status, can one alert rule cover all service check failures, for example if I have fping services to 100 different destinations?
I've set the following, but it doesn't trigger the alert:
You didn't type in the rule correctly. Try it again:
%services.service_status != "0"
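To illustrate why a single rule on the status can cover every check, here is a minimal Python sketch. It assumes that service_status holds the Nagios plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the rows and the rule_matches helper are hypothetical and only mimic the rule above, they are not LibreNMS code.

```python
# Hypothetical rows using the column names mentioned in this thread.
# The status values are illustrative and assume Nagios exit codes.
services = [
    {"service_id": 1, "service_ip": "8.8.8.8", "service_status": 0},
    {"service_id": 2, "service_ip": "8.8.4.4", "service_status": 2},
    {"service_id": 3, "service_ip": "10.106.10.1", "service_status": 0},
]


def rule_matches(service):
    """Mimic the rule %services.service_status != "0" for one row."""
    return str(service["service_status"]) != "0"


# One rule is evaluated against every service, so no per-IP rule is needed.
failing = [s["service_ip"] for s in services if rule_matches(s)]
print(failing)
```

Because the condition only looks at the status, adding another fping destination would not require another rule.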
I tried to set "0".
It gives me the following, with the quotes doubled:
%services.service_status != ""0""
Okay, odd. Try this: go into the alerts collection and use the alert rule that is in there. I know for sure it works.
Select the rule named “service up/down”; you can rename it to whatever you like.
I set this one, and it still doesn't work.
I'm not sure then, because it works just fine for me with service checks. Also, did you try the alert that was in the alert collection?
I used the alert from the collection. It had an additional macro command, which I removed, but I also tried it with the macro and it didn't work. My checks are based on fping; should this service check status pick that up?
Maybe I can use this trigger instead: services.service_message?
But in that case, what variable should I use?
I'm using fping service checks too, and it alerts on them.
I see that the service shows red, but it doesn't trigger the alert.
Run service debug on the service that has failed
and post the output.
https://docs.librenms.org/#Extensions/Services/#debug
DEBUG!
SQL[SELECT * FROM devices AS D, services AS S WHERE S.device_id = D.device_id ORDER by D.device_id DESC]
Nagios Service - 1
Request: /usr/lib64/nagios/plugins/check_fping -H 8.8.8.8 -T 1000 -i 1000 -n 5
Perf Data - DS: loss, Value: 0, UOM: %
Perf Data - DS: rta, Value: 0.001950, UOM: s
Response: FPING OK - 8.8.8.8 (loss=0%, rta=1.950000 ms)
Service DS: {
"loss": "%",
"rta": "s"
}
RRD[update /opt/librenms/rrd/sv3-librenms01.pan.local/services-1.rrd N:0:0.001950]
Sending sv3-librenms01_pan_local.services.services.1.loss 0 1515083968
Sending sv3-librenms01_pan_local.services.services.1.rta 0.001950 1515083968
SQL[UPDATE services set service_message='FPING OK - 8.8.8.8 (loss=0%, rta=1.950000 ms)' WHERE service_id='1']
Nagios Service - 2
Request: /usr/lib64/nagios/plugins/check_fping -H 8.8.4.4 -T 1000 -i 1000 -n 5
Perf Data - DS: loss, Value: 100, UOM: %
Response: FPING CRITICAL - 8.8.4.4 (loss=100% )
Service DS: {
"loss": "%"
}
RRD[update /opt/librenms/rrd/sv3-librenms01.pan.local/services-2.rrd N:100]
Sending sv3-librenms01_pan_local.services.services.2.loss 100 1515083974
Nagios Service - 3
Request: /usr/lib64/nagios/plugins/check_fping -H 10.106.10.1 -T 1000 -i 1000 -n 5
Perf Data - DS: loss, Value: 0, UOM: %
Perf Data - DS: rta, Value: 0.073200, UOM: s
Response: FPING OK - 10.106.10.1 (loss=0%, rta=73.200000 ms)
Service DS: {
"loss": "%",
"rta": "s"
}
RRD[update /opt/librenms/rrd/sv3-librenms01.pan.local/services-3.rrd N:0:0.073200]
Sending sv3-librenms01_pan_local.services.services.3.loss 0 1515083978
Sending sv3-librenms01_pan_local.services.services.3.rta 0.073200 1515083978
SQL[UPDATE services set service_message='FPING OK - 10.106.10.1 (loss=0%, rta=73.200000 ms)' WHERE service_id='3']
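As a rough aid for reading output like the above, the sketch below pulls the state out of each "Response:" line and maps it to the standard Nagios plugin exit code. The parse_response helper and the regular expression are hypothetical and written only against the check_fping lines shown in this thread; the assumption that service_status corresponds to this exit code is not confirmed here.

```python
import re

# Standard Nagios plugin exit codes.
NAGIOS_STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

# Matches lines such as:
#   Response: FPING CRITICAL - 8.8.4.4 (loss=100% )
RESPONSE_RE = re.compile(r"^Response: FPING (\w+) - (\S+) \(loss=(\d+)%")


def parse_response(line):
    """Return host, state, exit code and loss for one Response line, or None."""
    match = RESPONSE_RE.match(line.strip())
    if match is None:
        return None
    state, host, loss = match.groups()
    return {
        "host": host,
        "state": state,
        # Unrecognised state names are treated as UNKNOWN (3).
        "status": NAGIOS_STATES.get(state, 3),
        "loss": int(loss),
    }


lines = [
    "Response: FPING OK - 8.8.8.8 (loss=0%, rta=1.950000 ms)",
    "Response: FPING CRITICAL - 8.8.4.4 (loss=100% )",
]
for line in lines:
    print(parse_response(line))
```

Any parsed line whose status is not 0 is one that a rule on the service status would be expected to match.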
amaizenshtein:
Nagios Service - 2
Request: /usr/lib64/nagios/plugins/check_fping -H 8.8.4.4 -T 1000 -i 1000 -n 5
Perf Data - DS: loss, Value: 100, UOM: %
Response: FPING CRITICAL - 8.8.4.4 (loss=100% )
Service DS: {
"loss": "%"
Status = Critical so it should alert on the rule.
I just tested mine with this alert rule and it works.
[Screenshots: Alert Rule, The Alert, The Service Check]
It works on other LibreNMS servers. Let me restart the server.