Trying to monitor SFP degradation on fiber connections

Hi Y’all,

I’m trying to monitor the degradation of the SFPs used in our backbone connections. I’ve been looking into the collection of alert rules for sensor_limit_low, but that’s not enough: I’d like to know when the degradation starts (not when the device thinks it’s time to exchange the SFP).

I’ve been tinkering along the edges of curve sketching on the dBm values of the corresponding RRDs, but didn’t find anything inside LibreNMS that could aid me on that trip. Before pulling out rrdtool and trying to bring the outcome of a derivative back into LibreNMS, I’d like to ask the community:

How are you monitoring the degradation on your fiber connections? Is it more than checking on sensor_limit_low (to see if the attenuation crosses a certain value)? Any recommendations on how to spot the beginning of the degradation?

Thanks.


validate.php

Component Version
LibreNMS 22.2.2
DB Schema 2021_12_02_113537_ports_stp_designated_cost_change_to_int (234)
PHP 7.4.6
Python 3.6.15
MySQL 10.5.15-MariaDB
RRDTool 1.7.2
SNMP 5.7.3

====================================

[OK] Composer Version: 2.2.9
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[WARN] Your local git contains modified files, this could prevent automatic updates.
[FIX]:
You can fix this with ./scripts/github-remove
Modified Files:
html/.htaccess

Not sure this is something Libre would be capable of. You would be looking for an increase in the rate of change and then triggering on a threshold. The question then becomes what rate of change to set to trigger an alarm, and I don’t really have an answer for that. I imagine different failure modes will result in different RoC.

(The rate of change in a somewhat simple graph would be the first derivative of the function that most closely describes the values in the RRD. I figured that is something I’d have to do outside LibreNMS. And trying to find a function that best describes a graph, I didn’t wanna go into that yet …)
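A minimal sketch of the rate-of-change idea, assuming the dBm readings have already been exported from the RRD as (timestamp, value) pairs. Instead of fitting an arbitrary function, it fits a straight line by least squares over a window and reports the slope in dB per day. The function name and the choice of a linear fit are illustrative assumptions, not anything LibreNMS provides.

```python
from typing import Sequence


def slope_db_per_day(timestamps: Sequence[float], dbm: Sequence[float]) -> float:
    """Least-squares slope of the dBm readings over time, in dB per day.

    timestamps are Unix epoch seconds; dbm are the matching power readings.
    A negative result means the received power is falling.
    """
    n = len(timestamps)
    if n != len(dbm) or n < 2:
        raise ValueError("need at least two (timestamp, dBm) pairs of equal length")

    mean_t = sum(timestamps) / n
    mean_y = sum(dbm) / n

    # Variance of the time axis; zero means all samples share one timestamp.
    var_t = sum((t - mean_t) ** 2 for t in timestamps)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")

    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(timestamps, dbm))

    # cov / var_t is dB per second; scale to dB per day for readability.
    return cov / var_t * 86400.0
```

Comparing the slope of a recent window (say, the last week) against the slope of a longer baseline would be one way to notice when the decline starts to accelerate. Which threshold to alarm on remains an open question, as noted above.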

For alerting, this is easy to do with custom SQL alert rules that compare values.
I think it is fair to assume that a signal drop of, for example, 3 dB is a lot.
In SQL, sensor_current and sensor_prev are your best friends. Combine that with the graphs and you’ll have a pretty decent tool to work with.

Alternatively, a simple external bash script running in the background could copy the sensor_current value to a file or another SQL table with a timestamp. It can easily take, say, the last 20 values, calculate an average, and check whether the newest reading is lower than that average by, for example, 3 dB. Easy to do, I think.
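The rolling-average check described above could look roughly like this. It is written in Python rather than bash, and how the readings get collected (file, SQL table, API) is left out. The class name, the window of 20, and the 3 dB threshold are assumptions taken from the example figures in this thread.

```python
from collections import deque
from statistics import mean


class DropDetector:
    """Flags a reading that falls too far below the average of recent readings."""

    def __init__(self, window: int = 20, max_drop_db: float = 3.0) -> None:
        # Keeps only the last `window` readings; older ones fall off automatically.
        self.history = deque(maxlen=window)
        self.max_drop_db = max_drop_db

    def update(self, reading_dbm: float) -> bool:
        """Record a reading; return True if it is more than max_drop_db
        below the mean of the previous `window` readings."""
        alarm = False
        # Only judge once a full window of earlier readings is available.
        if len(self.history) == self.history.maxlen:
            alarm = mean(self.history) - reading_dbm > self.max_drop_db
        self.history.append(reading_dbm)
        return alarm
```

Note that a slow decline drags the average down with it, so this catches sudden drops better than gradual degradation. A longer window or a fixed reference value taken at installation time would be more sensitive to slow drift.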

For SQL alerting:

select sum(sensor_prev - sensor_current) as sig_diff from sensors where sensor_id = 7700 having sig_diff > $(value_change_to_be_interested);

Try it. Should work well.

3 dB is a good point. I’m gonna go with that, plus a drop in speed of the port channel. Gonna come back with the outcome.

On how to measure and evaluate the long-term development of a degradation: you suggest copying the values to a text file and running a comparison on the raw data. I thought about the same, but doing that within LibreNMS and using the data stored in the corresponding RRD, as it is already stored (and consolidated) raw data.
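One way to reuse the data already in the RRD is to dump it with `rrdtool fetch <file> AVERAGE` and parse the text output. The sketch below only handles the parsing step and assumes the usual output layout: a header line with the data source names, a blank line, then `timestamp: value` rows with `nan` or `-nan` for unknown samples. The function name is hypothetical, and only the first data source column is read.

```python
import math
from typing import List, Tuple


def parse_rrdtool_fetch(text: str) -> List[Tuple[int, float]]:
    """Turn `rrdtool fetch` text output into (timestamp, value) pairs.

    Only the first data source column is used; unknown (NaN) samples are skipped.
    """
    samples: List[Tuple[int, float]] = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line with DS names, or a blank line
        ts_part, _, value_part = line.partition(":")
        ts_part = ts_part.strip()
        fields = value_part.split()
        if not ts_part.isdigit() or not fields:
            continue  # not a data row
        value = float(fields[0])  # float() also accepts "nan" and "-nan"
        if math.isnan(value):
            continue  # no data was stored for this interval
        samples.append((int(ts_part), value))
    return samples
```

The resulting list can be fed into whatever trend or threshold check is used afterwards. Keep in mind that older RRD archives hold consolidated averages at a coarser step, so short dips may no longer be visible there.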

Thank y’all for the hints and ideas, gotta ponder and test a bit. Will be back.
