Juniper loss not polling correctly

We’ve started using Juniper’s rpm-probe feature to monitor customer links on SRX devices. This works great for producing ICMP graphs under the new SLA tab.

However, under Health → Loss, all graphs show 0, even in cases where we have seen loss on the probe.

Digging further, it seems that the ASCII-to-decimal conversion of the OID index is not quite right when polling over SNMP.

On an SRX I have the following configuration:

services {
    rpm {
        probe voip-failover {
            test icmp-test {
                target address xxx.xxx.xxx.xx;
                probe-count 3;
                probe-interval 15;
                test-interval 1;
                thresholds {
                    successive-loss 3;
                    total-loss 3;
                }
                destination-interface ge-0/0/3.0;
            }
        }
    }
}

Here is the SNMP walk for the specific MIB object on the SRX, first in the default decimal form and then in ASCII:

> show snmp mib walk jnxRpmResSumPercentLost
jnxRpmResSumPercentLost.13.118.111.105.112.45.102.97.105.108.111.118.101.114.9.105.99.109.112.45.116.101.115.116.1 = 0
jnxRpmResSumPercentLost.13.118.111.105.112.45.102.97.105.108.111.118.101.114.9.105.99.109.112.45.116.101.115.116.2 = 0
jnxRpmResSumPercentLost.13.118.111.105.112.45.102.97.105.108.111.118.101.114.9.105.99.109.112.45.116.101.115.116.4 = 185147


> show snmp mib walk jnxRpmResSumPercentLost ascii
jnxRpmResSumPercentLost."voip-failover"."icmp-test".1 = 0
jnxRpmResSumPercentLost."voip-failover"."icmp-test".2 = 0
jnxRpmResSumPercentLost."voip-failover"."icmp-test".4 = 185128


The issue shows up in the poller output from LibreNMS:

SQL[SELECT * FROM `sensors` WHERE `sensor_class` = ? AND `device_id` = ? ["loss",78] 0.55ms] 
  

SNMP['/usr/bin/snmpget' '-v2c' '-c' 'COMMUNITY' '-OUQnte' '-M' '/srv/librenms/mibs:/srv/librenms/mibs/junos' 'udp:HOSTNAME:161' '.1.3.6.1.4.1.2636.3.50.1.2.1.35.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.99.117.114.114.101.110.116.84.101.115.116' '.1.3.6.1.4.1.2636.3.50.1.2.1.41.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.108.97.115.116.67.111.109.112.108.101.116.101.100.84.101.115.116' '.1.3.6.1.4.1.2636.3.50.1.2.1.32.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.97.108.108.84.101.115.116.115']

.*.4.1.26*.*.*.*.*.*.*.*.*.*.115.116 = No Such Object available on this agent at this OID
.*.4.1.26*.*.*.*.*.*.*.*.*.*.*.* = No Such Object available on this agent at this OID
.*.4.1.26*.*.*.*.*.*.*.*.*.115.116.115 = No Such Object available on this agent at this OID

  
Checking (snmp) loss voip-failover.icmp-test.currentTest Probe Loss... 
Checking (snmp) loss voip-failover.icmp-test.lastCompletedTest Probe Loss... 
Checking (snmp) loss voip-failover.icmp-test.allTests Probe Loss... 


0 

0 
0 

The ASCII-to-decimal conversion appears to be what breaks this.

LibreNMS requests this OID:

.1.3.6.1.4.1.2636.3.50.1.2.1.35.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.99.117.114.114.101.110.116.84.101.115.116

According to the SRX, it should be:

.1.3.6.1.4.1.2636.3.50.1.2.1.4.13.118.111.105.112.45.102.97.105.108.111.118.101.114.9.105.99.109.112.45.116.101.115.116.1
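Comparing the two OIDs, the SRX form looks like column 4, then each name prefixed by its length (13 for voip-failover, 9 for icmp-test), then a trailing integer. The LibreNMS form looks like a single length-prefixed string (35 characters) with no column number. Here is a rough Python sketch that reproduces both. The function names are mine, and the mapping of currentTest/lastCompletedTest/allTests to 1/2/4 is an assumption based on the walk output above, not something I have confirmed in the MIB.

```python
# Table entry prefix and column number, both taken from the OIDs in this post.
SUMMARY_ENTRY = ".1.3.6.1.4.1.2636.3.50.1.2.1"
PERCENT_LOST_COLUMN = 4

# Assumed mapping, inferred from the walk (.1, .2, .4) and the poller labels.
COLLECTION = {"currentTest": 1, "lastCompletedTest": 2, "allTests": 4}


def encode_string_index(text):
    """Encode a string as its length followed by one number per character."""
    return [len(text)] + [ord(ch) for ch in text]


def expected_oid(owner, test, collection):
    """Build the OID in the form the SRX walk returns."""
    parts = [PERCENT_LOST_COLUMN]
    parts += encode_string_index(owner)
    parts += encode_string_index(test)
    parts.append(COLLECTION[collection])
    return SUMMARY_ENTRY + "." + ".".join(str(p) for p in parts)


def requested_oid(owner, test, collection):
    """Reproduce the form seen in the poller's snmpget: one flattened string."""
    flat = f"{owner}.{test}.{collection}"
    parts = encode_string_index(flat)
    return SUMMARY_ENTRY + "." + ".".join(str(p) for p in parts)


if __name__ == "__main__":
    print(expected_oid("voip-failover", "icmp-test", "currentTest"))
    print(requested_oid("voip-failover", "icmp-test", "currentTest"))
```

If this reading is right, the three snmpget OIDs starting with .35, .41 and .32 are just the lengths of the three flattened strings, which would explain why the agent has no object at those OIDs.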

From running decimal-to-ASCII conversions, I can see that LibreNMS sends:
1181111051124510297105108111118101114461059910911245116101115116469911711411410111011684101115116
which is voip-failover.icmp-test.currentTest in ASCII.
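For anyone who wants to repeat the decimal-to-ASCII check, here is a small Python sketch. It takes the dotted numbers from the snmpget OID, leaving out the table prefix and the leading 35, and turns each one into its character. The function name is just for illustration.

```python
def decode_subids(dotted):
    """Turn dotted decimal numbers into the characters they represent."""
    return "".join(chr(int(part)) for part in dotted.strip(".").split("."))


# Index portion of the first OID in the snmpget above, after the leading 35.
suffix = (
    "118.111.105.112.45.102.97.105.108.111.118.101.114."
    "46.105.99.109.112.45.116.101.115.116."
    "46.99.117.114.114.101.110.116.84.101.115.116"
)

if __name__ == "__main__":
    print(decode_subids(suffix))  # voip-failover.icmp-test.currentTest
```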

When polled with that OID, the SRX rejects it outright, possibly because currentTest is included in the OID string.

snmpwalk -v2c -On -c <community> -M /opt/librenms/mibs:/opt/librenms/mibs/junos xxx.xxx.xxx.xx .1.3.6.1.4.1.2636.3.50.1.2.1.35.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.99.117.114.114.101.110.116.84.101.115.116
.1.3.6.1.4.1.2636.3.50.1.2.1.35.118.111.105.112.45.102.97.105.108.111.118.101.114.46.105.99.109.112.45.116.101.115.116.46.99.117.114.114.101.110.116.84.101.115.116 = No Such Object available on this agent at this OID

That’s as far as I have got. I’ve also tested this against an MX router, with the same results.


./validate.php
====================================
Component | Version
--------- | -------
LibreNMS  | 21.7.0-63-ga7f9c97ae
DB Schema | 2021_25_01_0127_create_isis_adjacencies_table (213)
PHP       | 7.4.3
Python    | 3.8.10
MySQL     | 10.3.30-MariaDB-0ubuntu0.20.04.1
RRDTool   | 1.7.2
SNMP      | NET-SNMP 5.8
====================================

[OK]    Composer Version: 2.1.5
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct

Juniper SRX:

> show version
Hostname: xxxx
Model: srx300
Junos: 19.2R2.7
JUNOS Software Release [19.2R2.7]

Just wondering/hoping someone else has run into this issue, or can suggest where to go next. It would be nice to have the loss polling working in LibreNMS.

Cheers