RRD File Not Updating

Good afternoon,

We currently have some services configured via a Nagios plugin that graphs span loss on each ‘side’ (span) connected to the ROADM node. This is working great, but I noticed that on one particular device the graphs stopped updating after we added a new side to our ROADM node a few months ago. All graphing has now stopped for the service configured on that device, and it appears that the RRD file is not actually being updated. To start, here is the output of validate.php:


./validate.php
====================================
Component | Version
--------- | -------
LibreNMS  | 22.3.0-10-gcc7345d54
DB Schema | 2022_02_03_164059_increase_auth_id_length (235)
PHP       | 7.4.28
Python    | 3.8.13
MySQL     | 10.5.15-MariaDB-1:10.5.15+maria~bionic
RRDTool   | 1.7.0
SNMP      | 5.7.3
====================================

[OK]    Composer Version: 2.2.9
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct

If I replicate the service config for that device, the copy runs and updates its new RRD file as expected. The following is the debug output of the ./check-services.php command for the node in question (note: Service - 54 is the ‘old’ service, while Service - 74 is the ‘new’ service):

Nagios Service - 54  
Request:  '/usr/lib/nagios/plugins/check_ons_span_loss' '-H' '10.67.11.11'  
Perf Data - DS: A, Value: 13.2, UOM:  
Perf Data - DS: B, Value: 14.9, UOM:  
Perf Data - DS: C, Value: 15.9, UOM:  
Perf Data - DS: D, Value: 12.5, UOM:  
Response: OK  
Service DS: {
    "A": "",
    "B": "",
    "C": "",
    "D": ""
}  
RRD[last 10.67.11.11/services-54.rrd  --daemon unix:/var/run/rrdcached.sock]  
RRD[update 10.67.11.11/services-54.rrd N:13.2:14.9:15.9:12.5 --daemon unix:/var/run/rrdcached.sock]  

Nagios Service - 74  
Request:  '/usr/lib/nagios/plugins/check_ons_span_loss' '-H' '10.67.11.11'  
Perf Data - DS: A, Value: 13.2, UOM:  
Perf Data - DS: B, Value: 14.9, UOM:  
Perf Data - DS: C, Value: 15.9, UOM:  
Perf Data - DS: D, Value: 12.5, UOM:  
Response: OK  
Service DS: {
    "A": "",
    "B": "",
    "C": "",
    "D": ""
}  
RRD[last 10.67.11.11/services-74.rrd  --daemon unix:/var/run/rrdcached.sock]  
RRD[update 10.67.11.11/services-74.rrd N:13.2:14.9:15.9:12.5 --daemon unix:/var/run/rrdcached.sock]  

As you can see, both services run and appear to update their associated RRD files, but when I check the modification times, the old service's RRD file does not appear to be changing:

-rw-rw-r-- 1 librenms librenms  510200 Mar  3 13:46 services-54.rrd
-rw-r--r-- 1 librenms librenms  679664 Mar 25 13:17 services-74.rrd

We had previously been graphing on sides A, B, and C, before adding side D, which is when the graphing stopped. While re-creating the service does appear to fix our issue, the point of this service addition is to have the ability to maintain historical graphing for each side. I’ll admit I’m fairly ignorant when it comes to the RRDtool, so I may just be missing something. Is it possible the service graphing stopped due to adding a new field, which is side ‘D’? I did notice that we’re having issues with RRDtune updating a few interfaces that are over 100G as well, so is it possible I have something misconfigured? If there is other command output that would be helpful, I would be happy to get it. Any help would be appreciated!

FYI, rrdcached will not immediately flush the changes to disk.
So checking the last modified time is not as interesting as you might think.

That being said, it is clear that LibreNMS sends the command to rrdcached to update the rrd file. So, check rrdcached.
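For anyone following along, one way to "check rrdcached" is to ask the daemon to flush the file and then look at its reply and the file's modification time. The sketch below is a minimal example, not LibreNMS code: it assumes the socket path shown in the debug output (/var/run/rrdcached.sock), assumes the relative RRD path is resolved against rrdcached's base directory, and uses hypothetical helper names (parse_status, flush_rrd). It relies on the rrdcached text protocol, where the first reply line is a status code followed by a message and a negative code means an error.

```python
import socket
from typing import Tuple


def parse_status(line: str) -> Tuple[int, str]:
    """Split an rrdcached reply line into (status code, message).

    A negative status code indicates an error.
    """
    code, _, message = line.strip().partition(" ")
    return int(code), message


def flush_rrd(rrd_path: str,
              sock_path: str = "/var/run/rrdcached.sock") -> Tuple[int, str]:
    """Send 'FLUSH <file>' to rrdcached and return the parsed first reply line.

    sock_path is an assumption taken from the debug output in this thread.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        sock.sendall(f"FLUSH {rrd_path}\n".encode())
        with sock.makefile("r") as reader:
            reply = reader.readline()
    return parse_status(reply)
```

Running flush_rrd("10.67.11.11/services-54.rrd") as a user with access to the socket and then re-checking the file's timestamp should show whether pending updates ever reach the disk. The rrdcached log (syslog or journal, depending on how it is started) is the other place to look for rejected updates.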

Other than that, nothing looks wrong in the information you have shown, apart from the 54 file having last been modified 22 days ago.

One thing you could try is to delete the 54 rrd file (or move it) and let LibreNMS create a new rrd file for it.

Hey murrant,

Thanks for providing insight, that makes sense. I’ve moved the 54 rrd file, and it looks like LibreNMS created a new 54 rrd file and is now populating the graphs for sides A, B, and C; however, side D is still non-existent in the service graphs for that device/rrd file. Of course the historical graphing has gone away, but that was expected after moving the original file. Following up on my RRD ignorance, how does one go about checking rrdcached to see if there are issues? I’m going through the documentation on the RRDTool to get myself educated, but I’m hoping there’s something simple I’m missing here that could be pointed out.

I did a little digging and used the rrdtool fetch command to verify the services-54.rrd file is updating as expected. Here are the results:

rrdtool fetch services-54.rrd AVERAGE
                              A                   B                   C                   D

1648524600: 1.3300000000e+01 1.4900000000e+01 1.5800000000e+01 1.2500000000e+01
1648524900: 1.3222333333e+01 1.4822333333e+01 1.5877666667e+01 1.2500000000e+01

Several lines of that were omitted due to length, but it does appear the newly created 54 rrd file is updating with the additional D side values as intended. The graph for that side, however, is still not present in LibreNMS. I’ve also reviewed the old services-54.rrd file and verified that it was not recording any values and does not contain the newly added side D data source.

rrdtool fetch services-54.rrd.old AVERAGE
                              A                   B                   C

1648443000: -nan -nan -nan
1648443300: -nan -nan -nan
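The comparison between the two fetch outputs can also be scripted. The following is a rough sketch that parses `rrdtool fetch` text (as pasted above) and reports which data sources contain nothing but NaN; the function names are made up for illustration, and it assumes the output layout shown in this thread (a header line of data source names followed by `timestamp: value ...` rows).

```python
import math
from typing import Dict, List


def parse_fetch(output: str) -> Dict[str, List[float]]:
    """Parse `rrdtool fetch` text into {data source name: [values]}."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    names = lines[0].split()          # header row, e.g. "A B C D"
    series: Dict[str, List[float]] = {name: [] for name in names}
    for ln in lines[1:]:
        _, _, values = ln.partition(":")   # drop the timestamp
        for name, raw in zip(names, values.split()):
            series[name].append(float(raw))  # float() accepts "-nan"
    return series


def stale_sources(series: Dict[str, List[float]]) -> List[str]:
    """Return data sources whose rows are all NaN (no data was stored)."""
    return [name for name, vals in series.items()
            if vals and all(math.isnan(v) for v in vals)]
```

Feeding it the output of the old file would list A, B and C as stale, while the new file would return an empty list.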

I’m still muddling my way through the RRDtool guide, but for some reason the addition of a new data source appears to be hosing up the ability to write any data to the existing service rrd file. I just wanted to update the post with the new discovery; I still have to dig into the rrdcached portion of this. Any additional thoughts or insight here would be greatly appreciated.
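One plausible explanation, offered as a guess rather than a confirmed diagnosis: the old file was created with three data sources (A, B, C), while the update now carries four values (N:13.2:14.9:15.9:12.5), and rrdtool generally rejects an update whose value count does not match the file's data sources. The sketch below compares the data sources listed by `rrdtool info` with the ones the plugin reports, and builds (without running) an `rrdtool tune` command that would add the missing ones. The helper names are hypothetical, and the GAUGE type and 600-second heartbeat are assumptions that should be checked against the existing `ds[...]` entries in the file. Adding a data source through `tune` with a `DS:` spec needs rrdtool 1.5 or newer.

```python
import re
from typing import List


def ds_names_from_info(info_output: str) -> List[str]:
    """Extract data source names from `rrdtool info` text, in index order."""
    found = []
    pattern = r"^ds\[([^\]]+)\]\.index = (\d+)"
    for match in re.finditer(pattern, info_output, re.MULTILINE):
        found.append((int(match.group(2)), match.group(1)))
    return [name for _, name in sorted(found)]


def missing_sources(rrd_names: List[str], plugin_names: List[str]) -> List[str]:
    """Data sources the plugin reports that the RRD file does not contain."""
    return [name for name in plugin_names if name not in rrd_names]


def tune_command(rrd_file: str, names: List[str],
                 heartbeat: int = 600) -> List[str]:
    """Build an `rrdtool tune` argument list that adds the given sources.

    GAUGE and the heartbeat value are assumptions, not read from the file.
    """
    cmd = ["rrdtool", "tune", rrd_file]
    cmd += [f"DS:{name}:GAUGE:{heartbeat}:U:U" for name in names]
    return cmd
```

If this is the cause, running the resulting command on a copy of services-54.rrd.old (with the poller not writing to it) may keep the A, B and C history while adding D, instead of starting over with an empty file.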
