Hi all,
I’m having a hard time wrapping my head around how to craft a particularly complex/weird alert rule, if it’s even possible.
We have a number of Ubiquiti AirFiber devices deployed across our network and I’d like to be able to craft a rule that will alert us when a given device is over 85% of available capacity. Unfortunately, the devices do not appear to self-report a utilization figure through SNMP. That would be too easy, I suppose.
The issue I’m running into is that available capacity is dynamic depending on environmental conditions, so I can’t just compare throughput to a fixed number. I need to compare two values from completely separate tables within LibreNMS, and I’m not sure whether that’s even doable.
For a given device, I need to compare eth0 ifInOctets_rate*8 with the AirFiber TX rate, and vice versa (ifOutOctets_rate*8 with the RX rate). The AirFiber devices report traffic throughput on an interface named eth0, which has a defined “speed” of 10 Mbit/s. Because the connection is asymmetrical, I can’t even go in and manually tweak the “speed”, as that would prevent us from being notified about congestion in the lower-bandwidth direction.
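To make the arithmetic concrete, here’s a minimal Python sketch of the comparison I’m after. The function name and the sample numbers are made up for illustration; only the 85% threshold and the octets-to-bits conversion come from the description above.

```python
def utilization_percent(octets_per_sec: float, capacity_bits_per_sec: float) -> float:
    """Return throughput as a percentage of the currently reported capacity."""
    if capacity_bits_per_sec <= 0:
        # No usable capacity reported (link down or sensor missing);
        # returning 0 here is an arbitrary choice for this sketch.
        return 0.0
    # Interface rates are in octets/s, sensor rates in bits/s, hence the * 8.
    return octets_per_sec * 8 / capacity_bits_per_sec * 100


# Hypothetical sample values, not taken from a real device.
in_octets_rate = 90_000_000   # eth0 ifInOctets_rate, octets/s
tx_capacity = 800_000_000     # airos-tx rate sensor, bits/s

tx_util = utilization_percent(in_octets_rate, tx_capacity)
print(f"TX utilization: {tx_util:.1f}%")
print("alert" if tx_util > 85 else "ok")
```

The same function would be applied a second time with ifOutOctets_rate and the RX rate, so that each direction is checked against its own capacity.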
I can retrieve the TX/RX rates as follows:
wireless_sensors.sensor_type = "airos-tx" && wireless_sensors.sensor_class = "rate" provides TX rate in bits/s
wireless_sensors.sensor_type = "airos-rx" && wireless_sensors.sensor_class = "rate" provides RX rate in bits/s
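For what it’s worth, here is a rough sketch of the kind of cross-table lookup I have in mind, run against an in-memory SQLite database rather than the real LibreNMS database. The cut-down schema is an assumption: I’m guessing that both tables share a device_id column, that the interface name is in ifName, and that the sensor reading is in sensor_current. The rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ports (device_id INTEGER, ifName TEXT,
                    ifInOctets_rate INTEGER, ifOutOctets_rate INTEGER);
CREATE TABLE wireless_sensors (device_id INTEGER, sensor_class TEXT,
                               sensor_type TEXT, sensor_current REAL);
-- Hypothetical sample rows for one device.
INSERT INTO ports VALUES (1, 'eth0', 90000000, 20000000);
INSERT INTO wireless_sensors VALUES (1, 'rate', 'airos-tx', 800000000);
INSERT INTO wireless_sensors VALUES (1, 'rate', 'airos-rx', 400000000);
""")

# Join the eth0 row to the TX and RX rate sensors of the same device,
# converting octets/s to bits/s before dividing by the reported capacity.
QUERY = """
SELECT p.device_id,
       p.ifInOctets_rate  * 8.0 / tx.sensor_current * 100 AS tx_util,
       p.ifOutOctets_rate * 8.0 / rx.sensor_current * 100 AS rx_util
FROM ports p
JOIN wireless_sensors tx ON tx.device_id = p.device_id
     AND tx.sensor_class = 'rate' AND tx.sensor_type = 'airos-tx'
JOIN wireless_sensors rx ON rx.device_id = p.device_id
     AND rx.sensor_class = 'rate' AND rx.sensor_type = 'airos-rx'
WHERE p.ifName = 'eth0'
"""

rows = conn.execute(QUERY).fetchall()
for device_id, tx_util, rx_util in rows:
    print(device_id, round(tx_util, 1), round(rx_util, 1))
```

This only illustrates the join I think is needed; whether the alert-rule builder or a macro can express the same thing is exactly what I can’t work out.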
I don’t seem to be able to retrieve the rate with a single line item (something along the lines of %wireless_sensors.airos-tx.rate?) because of the way wireless sensor information is stored. Here’s where I run into a wall: my understanding of how the macro/alert system works is tenuous at best, so I don’t know where to go next, or whether what I’m trying to do is even achievable.
Could someone help me create a macro or series of macros that returns the current percentage utilization of this AirFiber link? If I can get to the point where I can retrieve the current capacity of the link and compare it to throughput on eth0, I should be able to get the rest of the way myself.
A sanitized poller run for this device can be found here: https://pastebin.com/LEbty9b2 in case that’s of any use.
Thanks in advance for the help. If I’m entirely on my own, or this would require major changes to a polling module, please feel free to tell me to shove off; I’m just at a loss here.