Unable to get fast ping checking working, not sure where I am going wrong with the instructions

I followed the documentation for fast ping checking (1 minute), but so far it does not appear to be working for me. Devices are only detected as down by the ICMP check that runs with the regular polling cycle.

Here are my config.php changes:

$config['fping'] = "/usr/sbin/fping";
$config['ping_rrd_step'] = 60;
$config['fping_options']['retries'] = 2;
$config['fping_options']['timeout'] = 500;
$config['fping_options']['interval'] = 500;

I ran the “./scripts/rrdstep.php -h all” command.

Here is my /etc/cron.d/librenms file:

*    *    * * *   librenms    /opt/librenms/ping.php >> /dev/null 2>&1
33   */6  * * *   librenms    /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/5  *    * * *   librenms    /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5  *    * * *   librenms    /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
*    *    * * *   librenms    /opt/librenms/alerts.php >> /dev/null 2>&1
*/5  *    * * *   librenms    /opt/librenms/poll-billing.php >> /dev/null 2>&1
01   *    * * *   librenms    /opt/librenms/billing-calculate.php >> /dev/null 2>&1
*/5  *    * * *   librenms    /opt/librenms/check-services.php >> /dev/null 2>&1
*/5  *    * * *   librenms    /opt/librenms/html/plugins/Weathermap/map-poller.php >> /dev/null 2>&1
15   0    * * *   librenms    /opt/librenms/daily.sh >> /dev/null 2>&1

I’m not sure whether I am going wrong with the instructions or whether something else is wrong with my install. The validate script also reports the following issue.

====================================
Component | Version
--------- | -------
LibreNMS  | 1.51-70-g83522c6
DB Schema | 2019_02_10_220000_add_dates_to_fdb (132)
PHP       | 7.2.17
MySQL     | 5.5.60-MariaDB
RRDTool   | 1.4.8
SNMP      | NET-SNMP 5.7.2
====================================

[OK]    Composer Version: 1.8.5
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct
[FAIL]  Some folders have incorrect file permissions, this may cause issues.
        [FIX]:
        sudo chown -R librenms:librenms /opt/librenms
        sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
        sudo chmod -R ug=rwX /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
        Files:
         /opt/librenms/bootstrap/cache/packages.php

These permission issues pop up for me once in a while. I run the fixes, then run the validate script again, and it comes back with everything “OK”. If I run the script again a few days later, it tells me that I have the same permission issues again.

Try running ping.php by hand with the -d option.

The permissions issue is a known one; it is not the same files. They are cache/session files created by the web server.

I ran /opt/librenms/ping.php with the -d option. The output below is when everything is running normally and nothing is down.

SQL[select `devices`.`device_id`, `hostname`, `status`, `status_reason`, `last_ping`, `last_ping_timetaken`, `max_depth` from `devices` left join `devices_attribs` on `devices`.`device_id` = `devices_attribs`.`device_id` and `devices_attribs`.`attrib_type` = ? where `disabled` = ? and (`devices_attribs`.`attrib_value` is null or `devices_attribs`.`attrib_value` != ?) order by `max_depth` asc ["override_icmp_disable",0,"true"] 15.55ms]

'fping' '-f' '-' '-e' '-t' '500' '-r' '2'
dbserver72.name.sanitized.net is alive (0.29 ms)
SQL[update `devices` set `last_ping` = ?, `last_ping_timetaken` = ? where `device_id` = ? ["2019-05-23 09:30:37","0.29",28] 1.28ms]

RRD[update /opt/librenms/rrd/dbserver72.name.sanitized.net/ping-perf.rrd N:0.29]
Recorded data for dbserver72.name.sanitized.net (tier 0)
esxi2.name.sanitized.net is alive (0.92 ms)
SQL[update `devices` set `last_ping` = ?, `last_ping_timetaken` = ? where `device_id` = ? ["2019-05-23 09:30:37","0.92",10] 11.56ms]

RRD[update /opt/librenms/rrd/esxi2.name.sanitized.net/ping-perf.rrd N:0.92]
Recorded data for esxi2.name.sanitized.net (tier 0)
192.168.1.1 is alive (3.02 ms)
SQL[update `devices` set `last_ping` = ?, `last_ping_timetaken` = ? where `device_id` = ? ["2019-05-23 09:30:37","3.02",46] 11.61ms]

RRD[update /opt/librenms/rrd/librenms.name.sanitized.net/ping-perf.rrd N:0.05]
Recorded data for librenms.name.sanitized.net (tier 0)
Pinged 23 devices in 0.70s

The beginning and end of the output are above. I wasn’t sure whether you wanted the entire output, but I didn’t see anything that looked like an error.

I waited 1 minute after a polling cycle and shut a host down. Then I ran ping.php again manually. The ping.php -d output shows that the host is unreachable, and it generates an event in the Eventlog. I have an alert based on that event, and it works.

So it appears that ping.php is not being run on its own through cron. When I run it manually, it works as I think it should, but for some reason it isn’t being triggered by the cron job.
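
One way to test that theory is to run ping.php by hand under an environment as bare as the one cron provides. The following is only a minimal sketch, not part of LibreNMS: run_like_cron is a hypothetical helper name, and the PATH of /usr/bin:/bin is an assumption about cron's default, which varies by distribution. It may be relevant that the debug output above shows fping invoked by its bare name, so it has to be found through PATH; if /usr/sbin is missing from cron's PATH, that could explain the behaviour.

```shell
#!/bin/sh
# Run a command with a stripped-down environment similar to what cron provides.
# ASSUMPTION: PATH=/usr/bin:/bin is a common cron default, but it is not
# universal; check the crontab documentation for your distribution.
# run_like_cron is a hypothetical helper name used only for this sketch.
run_like_cron() {
    # env -i clears every inherited variable, then sets only the ones listed.
    env -i HOME="${HOME:-/}" SHELL=/bin/sh PATH=/usr/bin:/bin "$@"
}

# Example, to be run as the librenms user (left commented out on purpose):
# run_like_cron /opt/librenms/ping.php -d
```

If the script fails this way but works from a normal login shell, the difference is most likely environmental (PATH or SHELL) rather than a problem with ping.php itself.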

I added the below two lines to the cron config file (/etc/cron.d/librenms) and all works well now.

SHELL=/bin/bash
PATH=/opt/librenms:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Fix found in the thread “Cron job for fast-ping (ping.php) don't work correct”.
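
To see why the PATH line matters on a given system, it can help to check which binaries resolve under each PATH value. The sketch below is hedged: resolves_in_path is a hypothetical helper name, /usr/bin:/bin is an assumed cron default, and php and fping are simply the two commands this setup appears to depend on.

```shell
#!/bin/sh
# Report whether a command name can be found using only the given PATH value.
# Prints the resolved location on success; returns non-zero if not found.
# resolves_in_path is a hypothetical helper name used only for this sketch.
resolves_in_path() {
    search_path=$1
    cmd=$2
    # Start a clean shell whose only variable is PATH, then look the command up.
    env -i PATH="$search_path" /bin/sh -c 'command -v "$1"' sh "$cmd"
}

# PATH value taken from the two lines added to the cron file.
NEW_PATH=/opt/librenms:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

# Compare the ASSUMED cron default (/usr/bin:/bin) with the new PATH.
for cmd in php fping; do
    resolves_in_path /usr/bin:/bin "$cmd" >/dev/null \
        || echo "$cmd: not found in /usr/bin:/bin"
    resolves_in_path "$NEW_PATH" "$cmd" >/dev/null \
        || echo "$cmd: not found in the new PATH"
done
```

A command reported as missing under /usr/bin:/bin but not under the new PATH is a likely candidate for what the cron job could not find.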

Thanks for steering me in the right direction!

Thank you. Who’s the man? You the man! 🙂