Continuing the discussion from Very spiky graph on one host:
I also see such spikes, and they ruin the port utilisation graphs. As I have never seen such spikes on other NMSs, I suggest the following approach:
- Validate in/out octet values before storing them (the resulting rate cannot be above the interface speed)
- Handle “coming back after an outage” situations in one of two ways:
For those who want to recover ASAP and are fine with some inaccuracy:
- Spread the octets from the newly read counter value evenly across the last polling intervals (x) where no data was received, and use only that 1/x share of the octets for the current interval as an approximation.
Or, for those who need correct graphs or nothing at all:
- Keep the interface or the whole device marked as unreachable for one additional polling period; store the new octet counter value but do not graph it. Use that stored counter value to calculate the correct byte count from the next polling period onward.
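To make the suggestions above concrete, here is a rough Python sketch of the validation step and both recovery options. It is only an illustration, not how any particular NMS implements this: the names `Sample`, `octet_delta`, `validated_rate`, `spread_after_outage` and `strict_delta` are made up for this post, and I assume x is the number of polling intervals elapsed since the last good sample, counting the current one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Sample:
    timestamp: float  # poll time in seconds
    octets: int       # raw interface octet counter


def octet_delta(prev: int, curr: int, counter_bits: int = 64) -> int:
    """Counter difference, tolerating a single counter wrap."""
    return (curr - prev) % (2 ** counter_bits)


def validated_rate(prev: Sample, curr: Sample, speed_bps: float,
                   counter_bits: int = 64) -> Optional[float]:
    """Return the rate in bits/s, or None if it exceeds the interface speed."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return None
    bps = octet_delta(prev.octets, curr.octets, counter_bits) * 8 / elapsed
    return bps if bps <= speed_bps else None


def spread_after_outage(prev: Sample, curr: Sample, poll_interval: float,
                        counter_bits: int = 64) -> float:
    """Option 1: attribute only a 1/x share of the delta to the current interval.

    Assumption: x is the number of polling intervals since the last good sample.
    """
    x = max(1, round((curr.timestamp - prev.timestamp) / poll_interval))
    return octet_delta(prev.octets, curr.octets, counter_bits) / x


def strict_delta(baseline: Optional[Sample], curr: Sample, was_down: bool,
                 counter_bits: int = 64) -> Tuple[Sample, Optional[int]]:
    """Option 2: after an outage, store the counter but graph nothing.

    Returns (new_baseline, octets_to_graph); octets_to_graph is None for the
    first poll after an outage, so the next poll starts from a clean baseline.
    """
    if was_down or baseline is None:
        return curr, None
    return curr, octet_delta(baseline.octets, curr.octets, counter_bits)
```

With a 300 s polling interval and a device that was away for 1500 s, `spread_after_outage` would graph one fifth of the accumulated octets for the current interval. The speed check should also catch most counter resets after a reboot, since the wrapped difference usually comes out far above what the link can carry.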
BR