Juniper EX4600 Switch High CPU Spike Reading

Running LibreNMS version 1.66-27-gfddf2eb

I have noticed that LibreNMS is polling an incorrect CPU processor load percentage for a Juniper EX4600 switch. It fluctuates from normal to an extremely high CPU percentage, upward of 80 to 90%.

I have a Juniper JTAC case open, and they are comparing the LibreNMS graphs to the output of their Juniper CPU/CPU threads script, which runs every minute. JTAC says the highest spike we had was 68%, not 90% as Libre shows. They also said that the snmp process on the Juniper was using 2% CPU and mib2d was at 8% CPU, so SNMP polling contributed 10% to the output, according to the CPU script being run on the Juniper.

Can you verify the snmp output of the oids that LibreNMS uses?

Hmm, I’ve seen something similar with an EX series where I get a high CPU alert, but when I check on the device there is no load showing in the process list at all. Load-wise, the port polling tends to be quite slow on the Junipers for some reason.

No, not on the CLI; the SNMP output.

What do I need to do? Just to make sure I am understanding you.

I think he wants you to run the poller manually, in debug mode, and check the processors module output to see which OIDs LibreNMS uses when polling. Then one can confirm whether those OIDs are indeed the correct ones.
./poller.php -h DEVICE_ID -d

Here is the output with confidential information removed -

Load poller module processors

Attempting to initialize OS: junos
OS initialized as Generic
SQL[SELECT * FROM processors WHERE device_id=? [36] 0.31ms]

SNMP['/usr/bin/snmpget' '-v2c' '-c' 'COMMUNITY' '-OUQn' '-M' '/opt/librenms/mibs:/opt/librenms/mibs/junos' 'udp:HOSTNAME:161' '.1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0' '.1.3.6.1.4.1.2636.3.1.13.1.8.7.2.0.0' '.1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0' '.1.3.6.1.4.1.2636.3.1.13.1.8.9.2.0.0']
.1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0 = 67
.1.3.6.1.4.1.2636.3.1.13.1.8.7.2.0.0 = 35
.1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0 = 43
.1.3.6.1.4.1.2636.3.1.13.1.8.9.2.0.0 = 35

array (
'.1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0' => '67',
'.1.3.6.1.4.1.2636.3.1.13.1.8.7.2.0.0' => '35',
'.1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0' => '43',
'.1.3.6.1.4.1.2636.3.1.13.1.8.9.2.0.0' => '35',
)
67%
RRD[update /opt/librenms/rrd//processor-junos-7.1.0.0.rrd N:67]
SQL[UPDATE processors set processor_usage=? WHERE processor_id = ? [67,15] 0.24ms]

35%
RRD[update /opt/librenms/rrd//processor-junos-7.2.0.0.rrd N:35]
SQL[UPDATE processors set processor_usage=? WHERE processor_id = ? [35,16] 0.15ms]

43%
RRD[update /opt/librenms/rrd//processor-junos-9.1.0.0.rrd N:43]
SQL[UPDATE processors set processor_usage=? WHERE processor_id = ? [43,17] 0.13ms]

35%
RRD[update /opt/librenms/rrd//processor-junos-9.2.0.0.rrd N:35]
SQL[UPDATE processors set processor_usage=? WHERE processor_id = ? [35,18] 0.16ms]

Runtime for poller module 'processors': 0.0221 seconds with 59984 bytes
SNMP: [1/0.02s] MySQL: [5/0.00s] RRD: [5/0.00s]

Unload poller module processors

RRD[update /opt/librenms/rrd//poller-perf-processors.rrd N:0.022095918655396]
Modules status: Global+
OS
Device
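
For anyone reading the debug output above: all four polled OIDs share the prefix .1.3.6.1.4.1.2636.3.1.13.1.8, which corresponds to jnxOperatingCPU in Juniper's jnxOperatingTable, and the trailing four numbers are the table index. A minimal Python sketch that splits the values this way, using the numbers from the output above (the helper name split_cpu_oid is made up for illustration):

```python
# Prefix shared by every OID in the poller output (the jnxOperatingCPU column).
JNX_OPERATING_CPU = ".1.3.6.1.4.1.2636.3.1.13.1.8"

# Values copied from the poller debug output in this thread.
polled = {
    ".1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0": 67,
    ".1.3.6.1.4.1.2636.3.1.13.1.8.7.2.0.0": 35,
    ".1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0": 43,
    ".1.3.6.1.4.1.2636.3.1.13.1.8.9.2.0.0": 35,
}


def split_cpu_oid(oid):
    """Return the table index (e.g. '7.1.0.0') of a jnxOperatingCPU OID."""
    prefix = JNX_OPERATING_CPU + "."
    if not oid.startswith(prefix):
        raise ValueError(f"not a jnxOperatingCPU OID: {oid}")
    return oid[len(prefix):]


# Re-key the polled values by table index only.
cpu_by_index = {split_cpu_oid(oid): value for oid, value in polled.items()}

for index, value in cpu_by_index.items():
    print(f"{index}: {value}%")
```

The indexes match the RRD file names in the output (processor-junos-7.1.0.0.rrd and so on), so each graph can be traced back to one table row.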

Have a look here.
What do you get if you change the command shown there, which is run from the CLI of the EX system itself:

show snmp mib walk jnxOperatingDescr

to the OID that LibreNMS is polling, for example:

show snmp mib walk .1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0

and see what the output is? I reckon that should indeed be the same as the value displayed in LibreNMS.

Output -
show snmp mib walk jnxOperatingDescr
jnxOperatingDescr.1.1.0.0
jnxOperatingDescr.2.1.1.0 = Power Supply 0 @ 0/0/*
jnxOperatingDescr.2.1.2.0 = Power Supply 1 @ 0/1/*
jnxOperatingDescr.2.2.1.0 = Power Supply 0 @ 1/0/*
jnxOperatingDescr.2.2.2.0 = Power Supply 1 @ 1/1/*
jnxOperatingDescr.4.1.1.0 = Fan Tray 0 @ 0/0/*
jnxOperatingDescr.4.1.2.0 = Fan Tray 1 @ 0/1/*
jnxOperatingDescr.4.1.3.0 = Fan Tray 2 @ 0/2/*
jnxOperatingDescr.4.1.4.0 = Fan Tray 3 @ 0/3/*
jnxOperatingDescr.4.1.5.0 = Fan Tray 4 @ 0/4/*
jnxOperatingDescr.4.2.1.0 = Fan Tray 0 @ 1/0/*
jnxOperatingDescr.4.2.2.0 = Fan Tray 1 @ 1/1/*
jnxOperatingDescr.4.2.3.0 = Fan Tray 2 @ 1/2/*
jnxOperatingDescr.4.2.4.0 = Fan Tray 3 @ 1/3/*
jnxOperatingDescr.4.2.5.0 = Fan Tray 4 @ 1/4/*
jnxOperatingDescr.7.1.0.0 = FPC: EX4600-40F @ 0/*/*
jnxOperatingDescr.7.2.0.0 = FPC: EX4600-40F @ 1/*/*
jnxOperatingDescr.8.1.1.0 = PIC: 24x10G-4x40G @ 0/0/*
jnxOperatingDescr.8.2.1.0 = PIC: 24x10G-4x40G @ 1/0/*
jnxOperatingDescr.9.1.0.0 = Routing Engine 0
jnxOperatingDescr.9.2.0.0 = Routing Engine 1

{master:0}
show snmp mib walk .1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0
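
The jnxOperatingDescr walk uses the same table indexes as the CPU OIDs that LibreNMS polls, so the two outputs can be joined to see which component each reading belongs to. A rough Python sketch, assuming the four relevant walk lines and the four values from the earlier poller output (the function name parse_descr_walk is hypothetical):

```python
# Excerpt of the jnxOperatingDescr walk from this thread (only the rows
# whose indexes also appear in the poller output).
WALK_EXCERPT = """\
jnxOperatingDescr.7.1.0.0 = FPC: EX4600-40F @ 0/*/*
jnxOperatingDescr.7.2.0.0 = FPC: EX4600-40F @ 1/*/*
jnxOperatingDescr.9.1.0.0 = Routing Engine 0
jnxOperatingDescr.9.2.0.0 = Routing Engine 1
"""

# CPU readings from the poller debug output, keyed by table index.
CPU_BY_INDEX = {"7.1.0.0": 67, "7.2.0.0": 35, "9.1.0.0": 43, "9.2.0.0": 35}


def parse_descr_walk(text):
    """Map table index -> description from 'show snmp mib walk' style lines."""
    marker = "jnxOperatingDescr."
    descr = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep or not name.startswith(marker):
            continue  # skip rows without a value or unrelated lines
        descr[name[len(marker):]] = value.strip()
    return descr


descr_by_index = parse_descr_walk(WALK_EXCERPT)

# One (index, description, cpu) row per polled processor.
labelled = [
    (index, descr_by_index.get(index, "unknown"), cpu)
    for index, cpu in CPU_BY_INDEX.items()
]

for index, descr, cpu in labelled:
    print(f"{index:8} {descr:28} {cpu}%")
```

On these numbers, the 67% reading belongs to index 7.1.0.0 (FPC 0), while Routing Engine 0 (9.1.0.0) reads 43%, so it matters which index a CLI check is compared against.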

Can you tell me if LibreNMS is pulling aggregate information?

@murrant LibreNMS is definitely not showing the correct CPU Usage.

I connected a test EX4600 switch to my lab and added it to LibreNMS. LibreNMS shows that CPU usage nearly reached 60%, and right now it shows 29%.

However, on the Juniper command line, I am running the following command to get real-time CPU usage, refreshing it every 2 seconds: "show snmp mib get 1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0 | refresh 2"

And the command line on the Juniper shows that the maximum it reaches in real time is 17%.
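
One way to compare like with like might be to sample all four OIDs that LibreNMS polls, from the LibreNMS host itself, at a short interval, and then set the peaks next to the CLI readings. The Python below is only a sketch under assumptions: it expects the Net-SNMP snmpget binary on the path, reuses the -v2c, -c and -OUQn options seen in the poller debug output, and the function names, sample count and interval are all made up for illustration.

```python
import subprocess
import time

# The four OIDs from the poller debug output in this thread.
CPU_OIDS = [
    ".1.3.6.1.4.1.2636.3.1.13.1.8.7.1.0.0",
    ".1.3.6.1.4.1.2636.3.1.13.1.8.7.2.0.0",
    ".1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0",
    ".1.3.6.1.4.1.2636.3.1.13.1.8.9.2.0.0",
]


def build_snmpget_cmd(host, community, oids):
    """Build an snmpget command with the output flags seen in the poller debug."""
    return ["snmpget", "-v2c", "-c", community, "-OUQn", f"udp:{host}:161", *oids]


def parse_snmpget_output(text):
    """Parse '<oid> = <integer>' lines into a dict of OID -> int."""
    values = {}
    for line in text.splitlines():
        oid, sep, value = line.partition(" = ")
        if sep and value.strip().isdigit():
            values[oid.strip()] = int(value)
    return values


def sample_peaks(host, community, count=30, interval_s=2.0):
    """Poll `count` times, `interval_s` apart, and return the peak value per OID."""
    cmd = build_snmpget_cmd(host, community, CPU_OIDS)
    peaks = {}
    for i in range(count):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        for oid, value in parse_snmpget_output(result.stdout).items():
            peaks[oid] = max(value, peaks.get(oid, 0))
        if i + 1 < count:
            time.sleep(interval_s)
    return peaks
```

Calling sample_peaks("HOSTNAME", "COMMUNITY") would return the highest value seen per OID over about a minute. Note that the CLI command above reads only index 9.1.0.0, while LibreNMS records four separate indexes.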