Hi, for some reason we are getting weird spikes in port usage on the LibreNMS graphs. So far I have noticed it on the Cisco 3650 and 3850 switches. Each port is only a 1 Gbps interface, with a 2 Gbps port channel to a Cisco 3600.
The individual ports on a 3650 look like this (1 Gbps max).
Hi @Kevin_Krumm, I enabled RRD tuning and still see some traffic spikes on the graphs for the Cisco 3850s and 3650s. For some reason, when a port is in trunk mode to another Cisco switch (e.g. a 3560), LibreNMS shows spikes on the 3850 or 3650 port but not on the switch at the other end of the trunk. I see the same spikes toward all the trunked switches, but only on the 3850 side.
3850 Trunk Port
@Kevin_Krumm I did run both.
How do I use removespikes.php? I’m not familiar with what values are needed.
I was just trying to figure out if it was actually a broadcast storm, but it doesn’t seem so if it’s not showing on the switch at the other end.
Any chance you’ve had some luck with this? I’m seeing the exact same issue even with RRDTune enabled globally, and I am also struggling with the removespikes.php script. It seems to be happening across a wide range of versions of the 3850 for me.
You should make sure your polls are finishing, as this was our issue. I can’t remember all the tuning I tried, but here’s a quick 60-second bash script I used to fix devices:
$ cat fix_em.sh
#!/bin/sh
# Run removespikes.php against every RRD file for one device.
for i in /opt/librenms/rrd/my-brokendevice/*
do
    ./removespikes.php -M=variance -A=nan -R="$i"
done
Feel free to put your device in it and mess with the options as necessary. (I was too lazy to use argv)
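For anyone who wants the argv version, here is a rough sketch of the same loop that takes the device's RRD directory as an argument. It keeps the -M=variance, -A=nan and -R options from the script above. The function name fix_device and the REMOVESPIKES override variable are made up for illustration, and limiting the loop to *.rrd files is my assumption, not something the original script did.

```shell
#!/bin/sh
# Sketch only: run removespikes.php over every .rrd file in one device directory.
# REMOVESPIKES is a hypothetical override; it defaults to the script in the
# current directory, as in the original loop.
REMOVESPIKES=${REMOVESPIKES:-./removespikes.php}

# fix_device is an illustrative name, not part of LibreNMS.
fix_device() {
    rrd_dir=$1
    if [ ! -d "$rrd_dir" ]; then
        echo "not a directory: $rrd_dir" >&2
        return 1
    fi
    for f in "$rrd_dir"/*.rrd; do
        [ -e "$f" ] || continue   # the glob matched nothing
        "$REMOVESPIKES" -M=variance -A=nan -R="$f"
    done
}

# Only act when a directory is passed on the command line.
if [ "$#" -ge 1 ]; then
    fix_device "$1"
fi
```

Run it from the directory that holds removespikes.php, for example: sh fix_em.sh /opt/librenms/rrd/my-brokendevice. It is worth copying the RRD files somewhere safe first, since the script rewrites them.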
Ok, thank you. Ya, I’m seeing the same thing (only 3850’s and 3650’s) and haven’t been able to get the removespikes script to work either. I’ll post back if I find the magic.
I thought I was seeing gaps, but if I zoom in (shorten the time interval) it appears to be OK on mine. I think the spike is so large that the rest of the graph ends up with low spots that look like gaps.
Gaps like that are usually a performance issue: either the device you are monitoring is slow to respond (SNMP timing out) or your network monitoring server is struggling to keep up. You will need to troubleshoot further: check the performance docs, run a debug on that device, and see if you have any timeouts.
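One rough way to check for slow SNMP responses is to time a walk of the interface table by hand and compare it with the polling interval (300 seconds on a default install). The helper below is only a sketch: timed_run is a made-up name, and the community string and hostname in the example are placeholders.

```shell
#!/bin/sh
# Sketch only: run a command, report how many whole seconds it took, and
# return non-zero if it failed or took longer than the given limit.
# timed_run is a hypothetical helper, not a LibreNMS tool.
timed_run() {
    limit=$1
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed=${elapsed}s limit=${limit}s exit=${status}"
    [ "$status" -eq 0 ] && [ "$elapsed" -le "$limit" ]
}

# Example (placeholder community string and hostname):
#   timed_run 300 snmpbulkwalk -v2c -c public my-brokendevice IF-MIB::ifXTable
```

If the walk fails or takes a large share of the polling interval, that points at the device or the path to it rather than at the graphs.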