HPE IRF Monitoring

Hi everyone,
I have several stacks of HPE (Comware) switches (78 stacks, IRF stacking).
Unfortunately, in an IRF configuration, the number of members in the stack is never indicated, which means that when a member leaves the stack (power outage/down/whatever), it’s perceived as a topology change, not a problem.
So I monitor my switches on LNMS, but I’m unable to see when a stack member is missing; that’s a bummer!
There is an OID “1.3.6.1.4.1.25506.2.91.1.2.0” which indicates the number of members currently in the stack:

librenms:/opt/librenms# snmpget -v2c -c mycom x.x.x.x 1.3.6.1.4.1.25506.2.91.1.2.0
SNMPv2-SMI::enterprises.25506.2.91.1.2.0 = INTEGER: 2

I already tried to create an alert rule for a 2-member stack: sensors.sensor_current != "2" && sensors.sensor_oid = "1.3.6.1.4.1.25506.2.91.1.2.0"
But unfortunately, when I simulate the disappearance of a switch, no alert is raised.
I also tried playing with a “Custom OID”, but I admit that once I’ve added it, I don’t know what to do with it :sweat_smile:
I know I’ll have to inventory my entire network to indicate a custom value for each stack, but hey, for lack of anything better…
If anyone has a suggestion, I’d appreciate it.
Thanks!

FYI:
lnms snmp:translate comware 1.3.6.1.4.1.25506.2.91.1.2.0
HH3C-STACK-MIB::hh3cStackMemberNum.0 = .1.3.6.1.4.1.25506.2.91.1.2.0

Try adding a count sensor; see the “Health Information” page in the LibreNMS docs.

Some people try to add things like this as pseudo state sensors, but I find that odd unless the maximum number of members is relatively low.

Hey,

I’ll try, but it’ll take a little time, especially since I’m using Docker, so it’s a bit of a pain.
Is there a simpler temporary solution in the meantime?
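One possible stopgap, offered only as a rough sketch: a small script outside LibreNMS that reads the hh3cStackMemberNum OID quoted above with snmpget and compares it to an expected member count per stack. The inventory dictionary, the function names, and running it from cron are all assumptions for illustration and not something LibreNMS provides; the script also assumes the net-snmp `snmpget` binary is installed on the host that runs it.

```python
import re
import subprocess

# OID quoted in the thread: HH3C-STACK-MIB::hh3cStackMemberNum.0
STACK_MEMBER_NUM_OID = "1.3.6.1.4.1.25506.2.91.1.2.0"


def parse_member_count(snmpget_line: str) -> int:
    """Extract the integer from a line such as
    'SNMPv2-SMI::enterprises.25506.2.91.1.2.0 = INTEGER: 2'."""
    match = re.search(r"=\s*INTEGER:\s*(\d+)\s*$", snmpget_line.strip())
    if match is None:
        raise ValueError(f"unexpected snmpget output: {snmpget_line!r}")
    return int(match.group(1))


def missing_members(expected: int, current: int) -> int:
    """Return how many members are missing (0 if the stack is complete)."""
    return max(expected - current, 0)


def query_member_count(host: str, community: str) -> int:
    """Run snmpget (SNMP v2c) against one stack and parse its output."""
    result = subprocess.run(
        ["snmpget", "-v2c", "-c", community, host, STACK_MEMBER_NUM_OID],
        capture_output=True,
        text=True,
        check=True,
        timeout=10,
    )
    return parse_member_count(result.stdout)


def check_stacks(expected_members, community, query=query_member_count):
    """Map each host to its number of missing members.

    expected_members is a hypothetical hand-maintained inventory:
    {host: expected IRF member count}. The query argument can be
    replaced with a stub so the logic can be tried without a switch.
    """
    return {
        host: missing_members(expected, query(host, community))
        for host, expected in expected_members.items()
    }
```

Calling `check_stacks({"x.x.x.x": 2}, "mycom")` would return `{"x.x.x.x": 0}` for a complete stack and a positive number when a member has dropped out. Something like this could run from cron and send a mail until the count sensor is in place.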

Hey @murrant,

Thanks for your reply. It’s better, but I still have a problem.

Here’s my /opt/librenms/resources/definitions/os_discovery/comware.yaml file:

mib: HH3C-LswDEVM-MIB:HH3C-ENTITY-EXT-MIB:HH3C-STACK-MIB
modules:
    os:
        sysDescr_regex: '/Version (?<version>[0-9.]+).*(Release|ESS) (?<features>[R0-9P]+).*[\n ](HPE |HPE FF |HP |H3C )(?<hardware>.*)[\r ][\n ]/'
    sensors:
        pre-cache:
            data:
                -
                    oid:
                        - entPhysicalName
        state:
            data:
                -
                    oid: hh3cdevMFanStatusTable
                    value: hh3cDevMFanStatus
                    num_oid: '.1.3.6.1.4.1.25506.8.35.9.1.1.1.2.{{ $index }}'
                    descr: 'Fan {{ $index }}'
                    index: '{{ $index }}'
                    states:
                        - { value: 1, descr: active, graph: 1, generic: 0 }
                        - { value: 2, descr: deactive, graph: 1, generic: 2 }
                        - { value: 3, descr: not-install, graph: 1, generic: 3 }
                        - { value: 4, descr: unsupport, graph: 1, generic: 1 }
                -
                    oid: hh3cdevMPowerStatusTable
                    value: hh3cDevMPowerStatus
                    num_oid: '.1.3.6.1.4.1.25506.8.35.9.1.2.1.2.{{ $index }}'
                    descr: 'Power Supply {{ $index }}'
                    index: '{{ $index }}'
                    states:
                        - { value: 1, descr: active, graph: 1, generic: 0 }
                        - { value: 2, descr: deactive, graph: 1, generic: 2 }
                        - { value: 3, descr: not-install, graph: 1, generic: 3 }
                        - { value: 4, descr: unsupport, graph: 1, generic: 1 }
        power:
            data:
                -
                    oid: hh3cEntityExtPowerEntry
                    value: hh3cEntityExtCurrentPower
                    num_oid: '.1.3.6.1.4.1.25506.2.6.1.3.1.1.3.{{ $index }}'
                    descr: '{{ $entPhysicalName }} Power Usage'
                    index: 'hh3cEntityExtCurrentPower.{{ $index }}'
                    divisor: 1000
                    skip_values: 0
        count:
            data:
                -
                    oid: hh3cStackMemberNum
                    num_oid: '.1.3.6.1.4.1.25506.2.91.1.2.0'
                    descr: 'Members'
                    index: '0'

Test with the command ./discovery.php -h 4086 -m sensors:

LibreNMS Discovery
swader1e1.u-pem.fr 4086 comware 
#### Load disco module core ####
OS: HPE Comware (comware)
  

>> Runtime for discovery module 'core': 0.0870 seconds with 79960 bytes
>> SNMP: [1/0.07s] MySQL: [0/0.00s] RRD: [0/0.00s]  
#### Unload disco module core ####


#### Load disco module sensors ####
Caching data: os sensors 
 ENTITY-SENSOR: Caching OIDs: entPhysicalDescr entPhysicalName entPhySensorType entPhySensorScale entPhySensorPrecision entPhySensorValue entPhySensorOperStatus
Airflow: 
Ber: 
Bitrate: 
Charge: 
Chromatic_dispersion: 
Cooling: 
Count: Members: Cur 3, Low: , Low Warn: , Warn: , High:   
+
Current: Comware 
Dbm: Comware 
Delay: 
Eer: 
Fanspeed: 
Frequency: 
Humidity: 
Load: 
Loss: 
Percent: 
Power: 
Power_consumed: 
Power_factor: 
Pressure: 
Quality_factor: 
Runtime: 
Signal: 
Snr: 
State: .........
Temperature: Comware 
Tv_signal: 
Voltage: Comware 
Waterflow: 

>> Runtime for discovery module 'sensors': 11.5400 seconds with 990368 bytes
>> SNMP: [15/11.43s] MySQL: [16/0.18s] RRD: [0/0.00s]  
#### Unload disco module sensors ####

Discovered in 19.745 seconds


SNMP [17/11.56s]: Snmpget[2/0.13s] Snmpwalk[15/11.43s]  
SQL [19/0.21s]: Select[16/0.17s] Insert[2/0.03s] Update[1/0.01s]  
RRD [0/0.00s]:

Discovery seems good, but the graph is not OK.

Also, do you have any idea how to rename “Count” to some other text?

Thanks!

I think I simply forgot to modify the file on the “dispatcher” container.
I’ll see about changing the name.