I’m in the process of migrating a LibreNMS instance to a new server, and I’m testing the polling of some devices to see how they will behave.
I am going to use three servers. Two of them will have a polling interval of 5 minutes, and the last one will have a polling interval of 10 minutes; it will poll some slow devices that have a lot of ports.
On this server I was polling a Palo Alto box (PA-3050), without polling the ports, at a polling interval of 5 minutes, and it was working fine. I then changed the polling interval to 10 minutes and now I have gaps in the graphs. I already changed the rrd step in config.php and ran the rrdstep.php script, but I still have gaps.
What can I do to get rid of those gaps? The device is being polled in less than five minutes.
Are you sure polling is completing within 10 minutes in the first place?
You should also set the heartbeat value. Have you done that? If so, I’m guessing it’s set to 1200 seconds, which in part is the length of time between those gaps. I’ve never tested a longer time between polls, as most people want to go the other way.
TL;DR - rrdtool was doing exactly what it was told and we were graphing the space between the polling and heartbeat (dead air).
This solution also fixes NaN or empty graphs. We had an issue where a blind adjustment from the default 300 seconds caused empty graphs, yet rrdtool was running flawlessly (running the rrdtool command inside ~/rrd/ returned OK with the output values). The timing and heartbeat, in our case, were off by exactly enough that the gaps were the whole graph. All that said, what kept us digging was that we were also using InfluxDB -> Grafana, and our dashboards there were all still fully functional. We updated the heartbeat to be double the step value and then ran the ./scripts/rrdstep.php -h all command.
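For anyone trying to picture why a mismatched heartbeat produces gaps, here is a rough Python sketch of the rule rrdtool applies: if more seconds pass between two updates than the heartbeat allows, the interval is treated as unknown (NaN). The poll timestamps, the 600-second heartbeat in the first call, and the helper names are made up for illustration; this is not LibreNMS or rrdtool code.

```python
def suggested_heartbeat(step: int) -> int:
    """Heartbeat as double the step, the ratio that worked in this thread."""
    return 2 * step


def unknown_intervals(update_times, heartbeat):
    """Return (previous, current) update pairs whose spacing exceeds the heartbeat.

    rrdtool treats a data source as unknown when more than `heartbeat`
    seconds pass between two updates, which shows up as a gap in the graph.
    """
    gaps = []
    for prev, cur in zip(update_times, update_times[1:]):
        if cur - prev > heartbeat:
            gaps.append((prev, cur))
    return gaps


# Hypothetical poll times (seconds) for a 10-minute poller with a little jitter.
polls = [0, 600, 1205, 1800, 2410]

# Assumed scenario: step raised to 600 but heartbeat left at 600.
print(unknown_intervals(polls, heartbeat=600))

# Heartbeat doubled relative to the step.
print(unknown_intervals(polls, heartbeat=suggested_heartbeat(600)))
```

With the heartbeat equal to the step, any poll that lands even a few seconds late falls outside the window, so two of the four intervals come back as unknown; with the heartbeat doubled, none do.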