-NAN values in ping graphs

I’m getting -nan values in my ping graphs. I’m running a distributed poller setup with rrdtool_version 1.7.0 and fast ping checks every 30s. validate.php passes on both the poller and the main install, and I’ve confirmed that the graphs draw for some devices and fail for others.

Also potentially unrelated, but I noticed that if I add the device from the poller (via python3 snmp-scan.py to auto-scan subnets), it’ll only create the files in the /opt/librenms/rrd directory of the poller. However, if I add them on the main server, it’ll create the files on the main server’s rrd folder. It’s an issue because graphing fails when data isn’t on /opt/librenms/rrd of the main install. In distributed polling, is data written directly to the main server’s /opt/librenms/rrd directory or is it flushed from the poller’s /opt/librenms/rrd at some point in time? Is that expected functionality or is there something I can do to resolve this?

I confirmed that all the relevant directories are in /opt/librenms/rrd since I force-added on the main server. However, I’m seeing some ping graphs with nan entries.

This is the rrd command and output:

RRDTool Command

rrdtool graph /tmp/bRttCz6eiooYKFz4  -l 0 -E --start 1597810453 --end 1597896853 --width 1488 --height 300 -c BACK#EEEEEE00 -c SHADEA#EEEEEE00 -c SHADEB#EEEEEE00 -c CANVAS#FFFFFF00 -c GRID#a5a5a5 -c MGRID#FF9999 -c FRAME#5e5e5e -c ARROW#5e5e5e -R normal -c FONT#000000 --font LEGEND:8:DejaVuSansMono --font AXIS:7:DejaVuSansMono --font-render-mode normal DEF:ping=100.64.1.3/ping-perf.rrd:ping:AVERAGE 'COMMENT:Milliseconds      Cur      Min     Max     Avg\n' LINE1.25:ping#36393d:Ping GPRINT:ping:LAST:%14.2lf  GPRINT:ping:AVERAGE:%6.2lf GPRINT:ping:MAX:%6.2lf  'GPRINT:ping:AVERAGE:%6.2lf\n' --daemon localhost:42217
RRDTool Output

1569x369
OK u:0.04 s:0.01 r:0.05

Any idea what might be going on here?
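
Since the graph command itself returns OK, it may help to look at the stored data rather than the rendered graph. Below is a minimal sketch, assuming you capture the text output of `rrdtool fetch` for the affected file (for example with the AVERAGE consolidation function over the same start and end times as the graph). The function name `nan_share` and the sample values are made up for illustration; only the general "timestamp: value" row layout is taken from rrdtool's fetch output.

```python
import math


def nan_share(fetch_output: str) -> float:
    """Return the fraction of data rows whose first value is NaN.

    `fetch_output` is assumed to be the plain text printed by `rrdtool fetch`.
    """
    total = 0
    missing = 0
    for line in fetch_output.splitlines():
        # Data rows look like "1597810500: 1.2345000000e+01".
        # The header row (data source names) and blank lines have no numeric timestamp.
        timestamp, sep, values = line.partition(":")
        if not sep or not timestamp.strip().isdigit():
            continue
        fields = values.split()
        if not fields:
            continue
        total += 1
        # float() parses both "nan" and "-nan", which rrdtool prints for unknown values.
        if math.isnan(float(fields[0])):
            missing += 1
    return missing / total if total else 0.0


# Illustrative sample only, not real data from the affected device.
sample = """\
                          ping

1597810500: 1.2345000000e+01
1597810800: -nan
1597811100: -nan
1597811400: 9.8700000000e+00
"""

if __name__ == "__main__":
    print(f"share of unknown rows: {nan_share(sample):.2f}")
```

A share close to 1.0 would suggest the file is receiving no usable updates at all, while a share between 0 and 1 points at intermittent gaps.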

UPDATE

I went ahead and deleted the existing ping-perf.rrd files for the problematic devices and re-ran polling for the same devices. It seemed to resolve the issue (for now), but I’d still like some insight on how the ping-perf.rrd files got into a bad state in the first place.

I am experiencing this as well. Installed LibreNMS via docker using the compose file on the official repo. Fast ping is enabled at 60s. Some devices work, some don’t.

EDIT:
Below are the graphs of the pings of two devices over the same period. One populated correctly, the other missing most of its history. Both devices have 100% uptime over this period, confirmed by a separate ping test.
To recreate:

  1. Run official docker-compose
  2. Enable fast ping and set period to 30s
  3. Update ping_rrd_step in config.php
  4. Add devices

Anyone know of a way to troubleshoot this?
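
One thing that may be worth checking, as a guess rather than a confirmed diagnosis: an RRD file keeps the step it was created with, and any update arriving later than the data source's heartbeat is stored as unknown. If some ping-perf.rrd files were created before the fast ping interval was changed, their step may no longer match. The sketch below parses the text printed by `rrdtool info` for a file and compares it with the ping interval. The function names and the sample text are hypothetical.

```python
import re


def rrd_step_and_heartbeats(info_output: str):
    """Extract the step and each data source's minimal_heartbeat from `rrdtool info` text."""
    step = None
    heartbeats = {}
    for raw in info_output.splitlines():
        line = raw.strip()
        m = re.match(r"^step = (\d+)$", line)
        if m:
            step = int(m.group(1))
            continue
        m = re.match(r"^ds\[(.+?)\]\.minimal_heartbeat = (\d+)$", line)
        if m:
            heartbeats[m.group(1)] = int(m.group(2))
    return step, heartbeats


def check_file(info_output: str, ping_interval: int):
    """Return a list of human-readable warnings for one RRD file."""
    step, heartbeats = rrd_step_and_heartbeats(info_output)
    warnings = []
    if step is not None and step != ping_interval:
        warnings.append(f"step is {step}s but pings run every {ping_interval}s")
    for name, heartbeat in heartbeats.items():
        # Updates spaced further apart than the heartbeat are recorded as unknown (NaN).
        if ping_interval > heartbeat:
            warnings.append(
                f"ds '{name}' heartbeat {heartbeat}s is shorter than the {ping_interval}s interval"
            )
    return warnings


# Illustrative sample only; real `rrdtool info` output contains many more lines.
sample_info = """\
rrd_version = "0003"
step = 30
ds[ping].index = 0
ds[ping].type = "GAUGE"
ds[ping].minimal_heartbeat = 60
"""

if __name__ == "__main__":
    for warning in check_file(sample_info, ping_interval=300):
        print(warning)
```

Running this against a working device and a failing one, and comparing the results, would show whether the two files were created with different settings.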

EDIT:
After having polling running for a few hours, all devices show 100% uptime despite graphs with empty areas. I guess this would indicate polling is fine and RRD is the issue. Also interesting is there is a device present with 600+ ms ping on the graph which should have triggered an alarm but did not :thinking:


====================================

Component Version
LibreNMS 21.8.0-26-g4f50c3c05
DB Schema 2021_08_26_093522_config_value_to_medium_text (217)
PHP 7.3.29-1+ubuntu18.04.1+deb.sury.org+1
Python 3.6.9
MySQL 10.5.11-MariaDB-1:10.5.11+maria~bionic
RRDTool 1.7.0
SNMP NET-SNMP 5.7.3
====================================

[OK] Composer Version: 2.1.6
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct

Hi all. I have the same problem with ping graphs and poller graphs.
How can I fix this?



Thanks
