Facing a weird issue here and hoping to get help. My rrdtool is only storing NaN for any devices I have added in the last few months; devices added prior to that (almost 80) are still working fine to this day. I have looked at permissions and config and I cannot explain how this is happening. When I go into RRD, go to a device I recently added which I know is pushing 5+ gb of traffic, and run rrdtool dump to get an XML, I see this on all the graphs…
NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
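To put a number on "all NaN" rather than eyeballing the XML, something like the sketch below could count NaN versus numeric values in the output of rrdtool dump. It assumes the usual `<row><v>…</v></row>` layout of the dump; `count_rows` and `SAMPLE` are made-up names for illustration, not part of rrdtool or LibreNMS.

```python
import math
import xml.etree.ElementTree as ET


def count_rows(xml_text):
    """Return (nan_values, numeric_values) over every <v> cell in a dump."""
    root = ET.fromstring(xml_text)
    nan_count = 0
    num_count = 0
    for cell in root.iter("v"):
        value = float(cell.text.strip())  # float() accepts "NaN" as well
        if math.isnan(value):
            nan_count += 1
        else:
            num_count += 1
    return nan_count, num_count


# Hand-written stand-in for a real dump (assumption: same element names).
SAMPLE = """<rrd>
  <rra>
    <database>
      <row><v>NaN</v><v>NaN</v></row>
      <row><v>1.2500000000e+03</v><v>NaN</v></row>
    </database>
  </rra>
</rrd>"""

if __name__ == "__main__":
    print(count_rows(SAMPLE))
```

If a working device and a broken device are compared this way, a broken one would be expected to show zero numeric values.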
[root@librenms librenms]# su librenms
bash-4.2$ ./validate.php
When I look at the debug output for one of the devices not graphing, it looks like it is getting the data properly. I will post a small snippet here without any identifying info; if someone needs more info I can share it privately.
I too am at a loss. I was really hoping to avoid a fresh install because of the time required to get my 90+ devices loaded back into a clean install, but it sounds like that’s where I am. Is there any easy way to export/import device IP and community info? Otherwise I will just plan on spending a day building a new image.
I’m not really sure a re-install would be of any use, but if you do, you may as well just dump the db and re-import it, and copy the rrd + config.php. None of that should affect the rrd files being updated.
I have very much the same problem and it is blocking deployment.
In my instance it seemed closely associated with the installation of the check_mk agent. That may be coincidence.
This problem seems to appear in some form in other systems (apart from librenms) that run rrdtool.
This is running in Ubuntu 18.04 - the flavour may be important - on a RHEV vm. I suggest that we
I have checked with tcpdump to see data flowing into librenms, and I have checked by running the scripts manually on localhost. Validate is uninformative. I have tried wiping the cache so it can be rebuilt, but the data comes back (per the rrdtool dump) as NaN.
I definitely need some help with this.
Reinstalling from scratch when (not if) it breaks again isn’t the best option. One thing that occurs to me is that the database is partly involved in doing this work. I may remove some (or all) devices and see if it can be resync’d.
So far I have had no luck, but I have noticed that I cannot add a device using check_mk alone, without snmp.
In other words, if I add a device with the intention of getting data solely from the unix_agent, it is not possible to enable the agent for the device. This is probably a separate question, but it came up here because I started out from the assumption that it was somehow getting confused by having both running.
Not working yet. It is applications that are broken for this installation, the system health obtained through snmp has worked… so far.
Definitely, if I enable the agent but disable snmp I get nothing. This could be a red herring, but that connection is unusual.
NaN comes from data returned as a string when numbers are expected, or divide-by-zero… and it is instructive that a “fresh install” didn’t have the problem.
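As a rough illustration of the "string where a number is expected" case: rrdtool update accepts U to mean "unknown", and unknown values show up as NaN in a dump. A wrapper along these lines (a sketch only; `to_rrd_value` is a hypothetical helper, not LibreNMS code) shows how unparseable output would end up as NaN. Bear in mind rrdtool also records unknown when updates arrive later than the data source heartbeat, so this is only one possible cause.

```python
import math


def to_rrd_value(raw):
    """Return a numeric string for rrdtool update, or "U" for unknown."""
    try:
        value = float(str(raw).strip())
    except ValueError:
        # Anything that is not a plain number becomes "unknown".
        return "U"
    if math.isnan(value) or math.isinf(value):
        # e.g. the result of a bad division upstream
        return "U"
    return repr(value)


if __name__ == "__main__":
    for sample in ["42", " 3.5 ", "n/a", "", None, "nan"]:
        print(repr(sample), "->", to_rrd_value(sample))
```

If every value from the mysql script came out as "U" here while the system health values did not, that would fit the pattern of apps being broken while snmp data works.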
Clearing individual (broken) servers from the system did not fix this. Going with an snmpd-only configuration now on a “new” server. SNMPD did “discover” the mysql in place there. I have to wait some time to see if it gets data for it.
Nope
I have a notion that there is another way to get NaN: if strings are being parsed and the data being returned isn’t in a compatible character set. The librenms mariadb is utf8mb4 - not sure what happens in the data for rrdtool - but this is stubborn as.
Trying a hexdump on the rrd entries leads me to believe this isn’t an easy problem. I cannot (yet) see a problem with the file that returns the NaN entries on dump. However, the NaN also appears in that rrd dump on an unrelated server that has a separate installation of rrdtool.
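One way to test the character-set idea would be to capture the raw bytes the script returns and flag any line containing bytes outside plain ASCII. This is only a sketch: `non_ascii_lines` is a made-up helper, and the sample bytes are invented rather than real agent output.

```python
def non_ascii_lines(data: bytes):
    """Return (line_number, raw_line) for each line with a byte above 0x7F."""
    flagged = []
    for number, line in enumerate(data.splitlines(), start=1):
        if any(byte > 0x7F for byte in line):
            flagged.append((number, line))
    return flagged


if __name__ == "__main__":
    # Invented sample: the second line hides a UTF-8 non-breaking space.
    sample = b"first:1\nsecond:\xc2\xa02\nthird:3\n"
    for number, line in non_ascii_lines(sample):
        print(number, line)
```

An empty result would suggest the encoding is not the culprit, at least for that capture.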
Fundamentally no version of this gets mysql information back into the system.
Scraped out the server and reinstalled. System cannot see mysql data, but detects that the app is installed, on both the check_mk and snmpd installations. Both scripts run without trouble locally and return valid data. The System data is returned OK, app data seems to be toast.
Can I install something else that has some apps on it? We’ll try the postgres scripted installation… and we see that it and its associated ntp client are both running. This problem actually appears to be quite specific. It is mysql, and since I added the postgres server after the two mysql boxes, we know it is limited to that display.