I created a Nagios plugin which returns the Nagios alerting scheme (0: ok, 1: warn and 2: crit).
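For reference, the state is carried by the plugin's exit code, following the standard Nagios convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal shell sketch of that mapping. The function name `nagios_state_name` is made up for illustration and is not part of Nagios or LibreNMS, and this is not my actual plugin code:

```shell
#!/bin/sh
# Hypothetical helper (illustration only): translate a Nagios plugin
# exit code into the state name it stands for.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3, and anything unexpected, is treated as UNKNOWN
    esac
}

# Example: a plugin that exits with 2 is reporting CRITICAL.
nagios_state_name 2
```

Under this convention a status of 3 means the state is unknown rather than warning or critical.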
Then I created 2 services reusing the same script, passing an IP as a parameter.
Finally I created an alert rule like:
services.service_status != 0 AND macros.device_up = 1 AND services.service_type = "bgp_peers_mkt.php"
Then my template looks like:
{{ $alert->title }}
Rule: @if ($alert->name)
{{ $alert->name }}
@else
{{ $alert->rule }}
@endif
@if ($alert->state == 0)
Time elapsed: {{ $alert->elapsed }}
@endif
Timestamp: {{ $alert->timestamp }}
Unique-ID: {{ $alert->uid }}
@if ($alert->faults)
Faults:
@foreach ($alert->faults as $key => $value)
{{ $key }}:
Severidad: @if ($value['service_status']==2) Critical @else Warning @endif
Description: {{ $value['service_desc'] }} - ( {{ $value['service_ip'] }} )
Message: {{ $value['service_message'] }}
@endforeach
@endif
OK, now the problem is that when a service goes from critical back to normal, I keep receiving notifications that the service is critical.
After analyzing the output of both the test-template.php and check-services.php scripts, I noticed a difference.
In the test-template.php output I have:
[service_status] => 3
[service_message] => Service not yet checked
but the check-services.php output shows:
Nagios Service - 38
Request: '/usr/lib64/nagios/plugins/check_bgp_peers_mkt.php' '-H' '172.17.50.1'
Perf Data - None.
Response: OK - bgp status is ok (established) on 4 peers
So, why am I getting this difference?
I'm sure it was working OK; I have tested it a lot.
Is it due to the database? What can I do?
btw:
There is no permission problem.
check-services.php -d shows everything OK.