Improve Wireless collection and graphing code


#1

I’ve noticed the WiFi code is a bit all over the place, and could do with standardising and sanitising.

I’ve been working on a PR to bring WiFi statistics for Mikrotiks to the ports page, as this is probably where they are best suited. I want to make it fairly sane, and non-vendor-specific where possible.

The problem I have reached is that I don’t know much about WiFi, or which metrics are important (and in some cases, what they even mean!)

Things that the Mikrotiks export that may be useful are:

  • Client count
  • Noise Floor
  • TxCCQ
  • Rx/Tx Rate (I’m unsure how this differs from the interface rate, and it is indeed always 0 on the 2 Mikrotiks I have)

My questions are:

  • Are these standard metrics that make sense for every vendor?
  • Are there any more metrics that other vendors export that might be useful?
  • Are there any metrics not being exported by vendors that might be useful?

#2

I’ve actually started on implementation of more wifi code (non-snmp polling) and noticed the same. I was going to tidy up the polling code into something more sane along the way but would love for someone else to do that :slight_smile:

As for metrics, I’m non-wifi as well so can’t say what’s best / standard.


#3

I’d say (as someone who is mostly wireless), that having the wireless stats on the ports page would be poor.

Generally there is a wireless tab under graphs which has all of the stats together.

A few points I’d suggest to begin:

  • Generally there are two distinct classes of wireless devices:
      • Access Points (or master on a point to point)
      • Subscribers (or slave on a point to point)

    (This can be confused by the fact that on high end enterprise point to point links, there is no master/slave; both ends are equal.)

  • As a general rule, on an access point you are interested in:
      • Noise floor
      • Client count
      • GPS / sync status (if available)

  • Then depending on the device (ubnt, cambium, mikrotik, etc) there are different metrics, eg:
      • Ratio of sends to resends
        (on ubnt and mikrotik? known as Tx CCQ and Rx CCQ)
        (on cambium would have to be calculated from raw stats)
      • TDD Frame utilization
        (on cambium either available directly on PMP450s or calculated from stats on ePMP)
        (on ubnt, referred to as airmax capacity)
        (probably something to do with nstream on mikrotik)
      • Error counts (probably mostly applicable to point to point)
        (cambium pmp450 has BER (bit error rate))
        (cambium also seems to have FEC and CRC errors but I haven’t seen any so not sure if real)
        (enterprise gear counts MSE (mean sq error) and BER as well as (Seriously)ErroredSeconds but that’s a different story)
        (ubnt and mikrotik probably have error counts also)

  • On an AP it would be nice to see the spread of:
      • signal levels
      • chain imbalances
      • data rates
        (most likely these would have to be calculated / munged to be representable in a sane manner (eg AP with >100 clients))

  • Client units are interested in:
      • received signal level
      • signal to noise ratio
      • color codes (on some cambiums)
      • Tx and Rx rates
      • Tx CCQ and Rx CCQ
      • Error rates (if applicable)

  • On a point to point link a mix of all of the above needs to be shown for each end.
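
Some of the derived values in the list above are simple arithmetic. The following is only a rough sketch in Python, not LibreNMS code: the function names are made up for illustration, and it assumes signal and noise readings are already available in dBm. It shows signal to noise ratio, chain imbalance, and one way to reduce per-client readings on a busy AP to a few graphable numbers.

```python
from statistics import median


def snr_db(signal_dbm, noise_floor_dbm):
    """Signal to noise ratio in dB: received signal level minus noise floor."""
    return signal_dbm - noise_floor_dbm


def chain_imbalance_db(chain_levels_dbm):
    """Gap in dB between the strongest and the weakest receive chain."""
    return max(chain_levels_dbm) - min(chain_levels_dbm)


def spread(values):
    """Reduce many per-client readings to min / median / max for graphing."""
    ordered = sorted(values)
    return {"min": ordered[0], "median": median(ordered), "max": ordered[-1]}


# Hypothetical readings, for illustration only.
client_signals_dbm = [-70, -60, -80]
summary = spread(client_signals_dbm)
```

Summarising to min / median / max is just one option; percentiles or a histogram would work as well for an AP with more than 100 clients.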

Aside from all of the above there is certain other data which would be nice, eg frequency, link-ids, etc which are not really graphable items and fall more into the infrastructure management domain.


#4

Comment from github:

"Currently the Unifi poller assumes that devices have 2 radios. Some APs however only have one (2.4GHz), and so the poller errors when trying to get the Radio Capacity for those devices.

Notice in the poller output:
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: unifiRadioCuTotal.1

If the relevant portions are removed from /includes/polling/mib/ubnt-unifi-mib.inc.php then the result is accurate for a single radio:
unifiRadioCuTotal.0 = 13
unifiRadioCuSelfRx.0 = 3
unifiRadioCuSelfTx.0 = 8
unifiRadioOtherBss.0 = 0

This also means that the Wireless Clients graph always shows 2 Radios, even though the second one will always have values of 0.

I don’t have an AP with 2 radios to test, but I think the number may be available by looking in unifiRadioTable ( .1.3.6.1.4.1.41112.1.6.1.1)"

https://github.com/librenms/librenms/issues/5696
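
One way to avoid hardcoding two radios is to walk the radio columns and group whatever instances come back by their index. The sketch below is Python rather than the poller's own code, and it only assumes the "name.index = value" text form shown in the poller output quoted above; `group_by_radio` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as "unifiRadioCuTotal.0 = 13".
LINE_RE = re.compile(r"^(?P<name>\w+)\.(?P<index>\d+)\s*=\s*(?P<value>-?\d+)$")


def group_by_radio(walk_lines):
    """Group walked values by radio index instead of assuming indexes 0 and 1."""
    radios = defaultdict(dict)
    for line in walk_lines:
        match = LINE_RE.match(line.strip())
        if match:
            radios[int(match["index"])][match["name"]] = int(match["value"])
    return dict(radios)


# The single-radio output quoted in the comment above.
single_radio_walk = [
    "unifiRadioCuTotal.0 = 13",
    "unifiRadioCuSelfRx.0 = 3",
    "unifiRadioCuSelfTx.0 = 8",
    "unifiRadioOtherBss.0 = 0",
]
radios = group_by_radio(single_radio_walk)
```

With this approach a single-radio AP yields one entry and a three-radio AP yields three, so no request is made for an instance that does not exist.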


#5

Comment from GitHub:

"Hello there everyone. So I was looking at some more things that could potentially be done to add graphing functionality to Mikrotik specific routers. In this case it would be ability to automagically detect and graph out similarly to the Cisco class based QoS (#1851).

Basically if it’s possible to automatically populate via the given OIDs of configured queues on a Mikrotik router. I’m sure it wouldn’t be too difficult to setup as all the OIDs are pretty easy to get. Anyway, let me explain the way Mikrotik RouterOS does this kind of graphing.

There are 2 types (probably 3 if one adds PCQ) of QoS/CoS able configurations on Mikrotik. Simple Queues and Queue Trees. Simple Queues are similar to just regular QoS/CoS as would be configured on a Cisco or Juniper. Queue Trees are more along the lines of Hierarchical QoS. They are both graphable and if possible can I make a request for them to be different windows and not aggregated under LibreNMS. The reasoning for this is because it will allow for easier view of the QoS statistics within a Mikrotik router.

Here (http://pastebin.com/4bhydXkF) is the output from the router. This is how the router exports a Simple Queue OID chart. I don’t know if it’s ok for me to do this in this little section but I’d like to just take one example and show the information that is given per configured line. Here we go:

name=.1.3.6.1.4.1.14988.1.1.2.1.1.2.12
bytes-in=.1.3.6.1.4.1.14988.1.1.2.1.1.8.12
bytes-out=.1.3.6.1.4.1.14988.1.1.2.1.1.9.12
packets-in=.1.3.6.1.4.1.14988.1.1.2.1.1.10.12
packets-out=.1.3.6.1.4.1.14988.1.1.2.1.1.11.12
queues-in=.1.3.6.1.4.1.14988.1.1.2.1.1.12.12
queues-out=.1.3.6.1.4.1.14988.1.1.2.1.1.13.12

Here (http://pastebin.com/HSLH2kL7) is the actual SNMP walk of those values.

As can be seen here, the actual OID for the specific configuration is OID number 12 (that last 12). So that means one can get the name for a “Simple Queue” 1.3.6.1.4.1.14988.1.1.2.1.1.2.[12] and one can get the bytes in for that simple queue at 1.3.6.1.4.1.14988.1.1.2.1.1.8.[12]. As can be seen there are a few things that can be searched upon here for said “Simple Queue.” As long as the last OID matches, one will get the information for this OID. Seems to make sense. Then one can generate the X and Y of the graphs fairly easily. Maybe one graph can have bytes in/out. Another could have packets in/out. From the above I would like to ask to NOT graph the “queues-in” and “queues-out” as it returns a 32 bit counter with the integer of 0. I am not sure what it is supposed to return. So for what it’s worth let’s avoid those.

The next type of QoS/CoS is what Mikrotik calls the Queue Tree. This is equivalent to a Hierarchical QoS configuration on a Cisco or Juniper.

Here (http://pastebin.com/4TwezwUX) is the output from the router. This is how the router exports a Queue Tree OID chart. I will use the first example and format it for easier view as above:

name=.1.3.6.1.4.1.14988.1.1.2.2.1.2.16777225
packet-mark=.1.3.6.1.4.1.14988.1.1.2.2.1.3.16777225
bytes=.1.3.6.1.4.1.14988.1.1.2.2.1.7.16777225
packets=.1.3.6.1.4.1.14988.1.1.2.2.1.6.16777225
queues=.1.3.6.1.4.1.14988.1.1.2.2.1.8.16777225

Here (http://pastebin.com/jvei7pAp) is the actual SNMP walk of those values.

The information here is similar to the Simple Queue outputs from above. The last OID is the identifier of the actual “Queue Tree” queue. It seems that these OIDs start at above value 16,777,216 (2^24). The difference here is that there is only one traffic value, compared to the two traffic values for the “Simple Queue” queue. Here they are only outbound traffic, whereas “Simple Queue” can be both for inbound and outbound. So the bytes and packets signify outbound traffic only. The packet mark OID seen here actually identifies an internal router configuration name that is used to match traffic. If possible it should be listed so that there’s information on which configured marking firewall filter is used. However this should be very similar and straightforward as above for the “Simple Queue”. Again, we can avoid the queues OID as it returns another 0.

Please let me know if more information is needed and if I was not verbose enough with what I have given here. If need be I can give access to the router with this information if more polling is required by someone who chooses to develop this solution.

Thank you everyone :)"

From: https://github.com/librenms/librenms/issues/5553
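
The OID layout described in the comment above (table base, then column number, then queue index) can be sketched as follows. This is a Python illustration, not existing LibreNMS code: the column numbers come from the listings in the comment, the "queues" columns are left out because they were reported to always return 0, and the names `QUEUE_TABLES`, `queue_oids` and `queue_rows` are hypothetical.

```python
# Column numbers as listed in the comment; "queues" columns are omitted.
QUEUE_TABLES = {
    "simple": {
        "base": ".1.3.6.1.4.1.14988.1.1.2.1.1",
        "columns": {2: "name", 8: "bytes-in", 9: "bytes-out",
                    10: "packets-in", 11: "packets-out"},
    },
    "tree": {
        "base": ".1.3.6.1.4.1.14988.1.1.2.2.1",
        "columns": {2: "name", 3: "packet-mark", 6: "packets", 7: "bytes"},
    },
}


def queue_oids(kind, index):
    """Build the per-field OIDs for one queue, given its table index."""
    table = QUEUE_TABLES[kind]
    return {field: f"{table['base']}.{column}.{index}"
            for column, field in table["columns"].items()}


def queue_rows(kind, varbinds):
    """Group (oid, value) pairs from a walk into one record per queue index."""
    table = QUEUE_TABLES[kind]
    prefix = table["base"] + "."
    rows = {}
    for oid, value in varbinds:
        if not oid.startswith(prefix):
            continue
        column, _, index = oid[len(prefix):].partition(".")
        if not (column.isdigit() and index.isdigit()):
            continue
        field = table["columns"].get(int(column))
        if field is not None:  # skips columns we do not graph, e.g. "queues"
            rows.setdefault(int(index), {})[field] = value
    return rows


# Values below are made up; only the OIDs follow the comment.
sample_walk = [
    (".1.3.6.1.4.1.14988.1.1.2.2.1.2.16777225", "example-queue"),
    (".1.3.6.1.4.1.14988.1.1.2.2.1.7.16777225", 123456),
    (".1.3.6.1.4.1.14988.1.1.2.2.1.8.16777225", 0),
]
tree_rows = queue_rows("tree", sample_walk)
simple_oids = queue_oids("simple", 12)
```

Keeping the two tables as separate entries matches the request to graph Simple Queues and Queue Trees separately rather than aggregating them.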


#6

Comment from GitHub:

"Hi everyone.
I’m using several types of Mikrotik boxes (RB800, RB2011UiAS-2HnD, RB2011UAS-2HnD) as APs.
When I’m polling them, the wifi poller module checks only ClientCount:
http://pastebin.com/t3C5QWPz
Every time it says “No Such Instance currently exists at this OID”, and that is true, but when I walk mtxrWlApClientCount.X, where X is the wireless interface number, it returns a value every time on every one of my boxes. X is random (I think) for every box.

It would be nice if there was support for the whole wireless section of the Mikrotik MIB, similar to the wireless section polled from Ubiquiti boxes (Graphs->Wireless). I’ve read a lot about this and people are looking for that support.
In the mikrotik-mib wireless tree there are two sections:
WlRtab - supports mostly client connection parameters
WlApTab - supports wireless interfaces parameters
I’m interested in the second one. Snmpwalk: walk.txt There is data for every wireless interface: TxRate, RxRate, SSID, BSSID, ClientCount, Freq, Band, NoiseFloor, OverallTxCCQ, AuthClientCount.

Is there a chance to develop that support?"


#7

Hmm, I didn’t know there was a single band unifi device which responded to the UBNT-UniFi-MIB. However the new UAP-AC-HD has 3 radios (2.4 / 5 / Management) so at some point dynamic checking of the number of devices should probably happen. Sadly that’s beyond what I’m capable of coding; I just wanted to add the note here.

Unifi Radio Table will list a separate device for every radio.

.1.3.6.1.4.1.41112.1.6.1.1.1.2.0 = STRING: “wifi0”
.1.3.6.1.4.1.41112.1.6.1.1.1.2.1 = STRING: “wifi1”
.1.3.6.1.4.1.41112.1.6.1.1.1.3.0 = STRING: “ng”
.1.3.6.1.4.1.41112.1.6.1.1.1.3.1 = STRING: “na”

You may also want to look at the wireless client count code. It goes through and creates an arbitrary number of radios (so I believe that will already support 3 radios). However, its code only checks whether a device is “na” or not, so it may be putting management radio clients into Radio2. That’s kind of a minor thing for most people I gather.
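
To illustrate the “na” or not issue, here is a rough Python sketch (not the actual client count code) that labels each radio by its type string instead of treating everything that is not “na” as the same radio. It assumes “ng” is the 2.4GHz radio and “na” the 5GHz radio, as in the table output above; `RADIO_BANDS` and `label_radios` are made-up names, and any other type string is simply reported as unknown rather than guessed.

```python
# Assumed mapping of UniFi radio type strings to bands.
RADIO_BANDS = {"ng": "2.4GHz", "na": "5GHz"}


def label_radios(radio_types):
    """Map radio index -> label, keeping unknown types separate."""
    return {
        index: RADIO_BANDS.get(radio_type, f"other ({radio_type})")
        for index, radio_type in sorted(radio_types.items())
    }


# Indexes and types as walked from the radio table; "xx" is a placeholder
# for whatever a third (management) radio might report.
labels = label_radios({0: "ng", 1: "na", 2: "xx"})
```

This way a third radio gets its own series instead of being merged into Radio2.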


#8

Interface speed would be awesome to see. Was just looking at the ports polling stuff as it is something I would like to see graphed for when it comes to clients connecting via wireless.

The big issue I see currently is that there is no standard way to get signal info via net-snmp for Linux and FreeBSD systems without relying on an extend.


#9

To follow up on this, recently started work on this…

This is also perfectly cross platform for any unix system that uses wpa_supplicant as it just uses wpa_cli.
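
As an illustration of the wpa_cli approach, the sketch below parses the KEY=value lines that `wpa_cli signal_poll` prints. It is a Python guess at the general idea, not the extend script mentioned above: the exact command and fields that script uses are not stated here, `parse_signal_poll` is a hypothetical name, and the sample values are made up.

```python
def parse_signal_poll(output):
    """Parse KEY=value lines (as printed by `wpa_cli signal_poll`) into a dict."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip lines without "=", e.g. an interface banner
        try:
            stats[key.lower()] = int(value)
        except ValueError:
            stats[key.lower()] = value  # keep non-numeric values as text
    return stats


# Made-up sample in the KEY=value form.
sample_output = """RSSI=-52
LINKSPEED=144
FREQUENCY=5180
"""
stats = parse_signal_poll(sample_output)
```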


#10

Making some good progress here:

Feel free to test. I’m only going to add support for a limited amount of sensors and OS at the start, but once the main PR gets merged, we can add more.


#11

Here is some information about the Cisco Wireless LAN Controller (WLC):

  • Total number of clients:
    CISCO-LWAPP-SYS-MIB - clsMaxClientsCount - 1.3.6.1.4.1.9.9.618.1.8.12.0

  • Total number of APs:
    CISCO-LWAPP-SYS-MIB - clsSysApConnectCount - 1.3.6.1.4.1.9.9.618.1.8.4.0

  • It would be very nice to be able to get clients per SSID. This is not possible without some coding, but we can get a list of all SSID associations from the OID below, count the clients per SSID and graph that value:
    AIRESPACE-WIRELESS-MIB - bsnMobileStationSsid - 1.3.6.1.4.1.14179.2.1.4.1.7
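
The per-SSID counting mentioned above could look roughly like this. It is a Python sketch, not LibreNMS code: it assumes a walk of bsnMobileStationSsid has already been turned into a mapping from table index (one entry per client) to SSID string, and `clients_per_ssid` plus the sample data are made up for illustration.

```python
from collections import Counter


def clients_per_ssid(ssid_by_client):
    """Count clients per SSID from a {client index: SSID} mapping."""
    return Counter(ssid_by_client.values())


# Hypothetical walk result: one entry per associated client.
walk_result = {
    "client-1": "corp",
    "client-2": "guest",
    "client-3": "guest",
}
counts = clients_per_ssid(walk_result)
```

Each SSID in `counts` would then become one series on a clients-per-SSID graph.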

Now the WLC has a tab for Access Points. This tab includes information for each AP; what will happen with it?


#12

I plan to revamp the Access Points tab to allow it to take advantage of the new code.

I’m not removing anything until it is fully replaced.


#13

I need some feedback please:

I have power and signal right now, but signal is just receive power. Should I leave them separate or combine them?

What terminology should I use? In fact if someone wanted to review all the current sensor types in the PR right now that would be great. Easier to change now :slight_smile:

There is a good description of each sensor type in LibreNMS/Interfaces/Discovery/Sensors/

Default thresholds (per device os) are something I can’t guess and need feedback on. These are used for alerts.

Also some better icons would be good too. See the list in LibreNMS/Device/WirelessSensor.php

@barryodonovan and @Dave_Bell


#14

The first integration for the Cisco WLC was tested and it’s working fine. The AP code hasn’t changed so I won’t comment on that part, but this is where the most work can be done :stuck_out_tongue: The new graphs for multiple SSIDs are awesome, I’m really happy to see this. Many thanks!

I think that the icons for Wireless and AP should be different; that’s probably planned already.


#15

Just one thing: isn’t this a good time to rename ciscowlc to aireos? It would then use the true name, as in Oxidized for example.


#16

@FTBZ nope, but you can open an issue for that.


#17

Hi, Looks really interesting, unfortunately not much time (none!) for developing at the moment.

One thing I would raise on this is the difference between ‘wifi’ and ‘wireless’ for want of better terminology.

I work at a Wireless ISP, we do fixed last mile wireless, changes are slow and gradual, usually planned and my monitoring interest is mostly in trends. Our gear, some wimax, some ubiquiti, some mikrotik, mostly cambium these days are single radio, single SSID and my clients only associate to one AP and usually remain associated for days or weeks so if a signal varies it usually indicates a problem.

I also do a small amount of ‘wifi’, mostly unifi but some cisco meraki previously (I think they may have rebranded) and am looking at some of the new cambium cnPilot gear. These tend to be in hotels, bars, on campus, etc. They have multiple radios per AP and multiple SSIDs per radio. Clients roam freely and associate / disassociate regularly. On these I am interested in peak client numbers, average client numbers, monitoring varying signals and live performance data. A signal varying here probably just means someone has walked out of the room…

I am unsure if these two different usage environments can be easily merged from either a reporting or an alerting viewpoint.

An analogy might be the difference in requirements between monitoring a cubicle farm and a hotdesking suite, or between your server room and your amazon cloud dev ops infrastructure.


#18

Easy, in your alert rule, add device.os to restrict alerts to ones you care about. Also, alert thresholds can be set on a per device basis.


#19

I’m very excited to try out all the new stuff with my upcoming UniFi installation. Thanks for taking the time to overhaul the whole wireless code!

Sadly, this PR completely removed the old “wireless clients” graph from my home router (Apple AirPort 7.6.8). Here is an snmpwalk: https://p.libren.ms/view/1c1ff053

I know those consumer home devices weren’t the focus of this PR and NMS in its entirety, but I really liked the fact that I could see how many devices were online at each point in time at home, so it would be awesome if this feature could somehow be brought back to life.


#20

@florianbeer I had to re-run discovery after the update to get the new wireless sensors to start to poll, etc. I thought Apple Airport stuff was supported so perhaps a discovery will resurrect the sensors?