All of our single-member (non-virtual-chassis) QFX5100/5110s have been alerting for routing engine down. Each one has a “Routing Engine 0” sensor and a “Routing Engine” sensor, and the “Routing Engine” sensor is reported as unknown for some reason. All of the virtual chassis QFXs appear to have a “Routing Engine 0” and a “Routing Engine 1”, so I think that’s why they’re not affected. I attached a screenshot showing that.
It does look like “Routing Engine” is a real thing at least on the QFX because I can see it when doing show snmp mib walk commands:
It looks like this was previously ignored in the old code. We’ve just gone ahead and turned off the check for that under edit, but I’m wondering if there is a more permanent fix.
I agree that, based on the MIB, the implementation is correct:
jnxFruState OBJECT-TYPE
    SYNTAX INTEGER {
        unknown(1),
        empty(2),
        present(3),
        ready(4),
        announceOnline(5),
        online(6),
        anounceOffline(7),
        offline(8),
        diagnostic(9),
        standby(10)
    }
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION
        "The current state for this subject."
    ::= { jnxFruEntry 8 }
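As a quick illustration of the enumeration above, here is a minimal Python sketch that decodes the raw integer returned for jnxFruState. The class and function names are made up for the example and are not taken from any existing code; only the numeric values come from the MIB excerpt.

```python
from enum import IntEnum


class JnxFruState(IntEnum):
    """Values of jnxFruState, copied from the MIB excerpt above."""
    UNKNOWN = 1
    EMPTY = 2
    PRESENT = 3
    READY = 4
    ANNOUNCE_ONLINE = 5
    ONLINE = 6
    ANNOUNCE_OFFLINE = 7  # spelled "anounceOffline" in the excerpt
    OFFLINE = 8
    DIAGNOSTIC = 9
    STANDBY = 10


def describe_fru_state(raw_value: int) -> str:
    """Return a lowercase label for a raw jnxFruState integer (hypothetical helper)."""
    try:
        return JnxFruState(raw_value).name.lower()
    except ValueError:
        # Value not listed in the excerpt; report it instead of guessing.
        return f"unrecognized({raw_value})"
```

With this mapping, the plain “Routing Engine” entry decodes to `unknown`, which matches what the sensor shows.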
So it seems the problem is on the Juniper side, where this has once again been implemented poorly. Even so, it would be good not to get a false positive alert because of it.
I’ve opened a case with Juniper to try to understand the reason behind this plain “Routing Engine” that doesn’t even show up in the inventory, as well as get their official reasoning behind the fact that it’s reported as “unknown”…
Hopefully I’ll get something meaningful back that can guide the next steps.
Thanks
Update:
For anyone curious: the case is still pending after escalation, and a PR (problem report) has been created. Hopefully I’ll have some news soon.
jnxFruSlot should be queried to get the valid slot numbers in jnxFruTable. Juniper has this documented for the MX and EX9600, and I’ve found it holds for the QFXs and EXs I have (the slot number is negative for invalid REs).
The old code worked because it filtered out known bad entries - a missing RE will always be called “Routing Engine”, but a present one will be called “Routing Engine X”.
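To make the two approaches concrete, here is a rough Python sketch of both filters. The row layout, the function names, and the sample values are assumptions for illustration (in particular, that “Routing Engine 0” reports slot 0 and that zero counts as valid); this is not the actual discovery code.

```python
from typing import Dict, List, Union

# Hypothetical shape of one jnxFruTable row after polling.
FruRow = Dict[str, Union[str, int]]

RE_PREFIX = "Routing Engine"


def has_valid_slot(row: FruRow) -> bool:
    """Slot-based check: a negative jnxFruSlot marks an invalid entry."""
    return int(row["jnxFruSlot"]) >= 0


def passes_name_filter(row: FruRow) -> bool:
    """Name-based heuristic: drop an RE whose name carries no number."""
    name = str(row["jnxFruName"]).strip()
    if not name.startswith(RE_PREFIX):
        return True  # not a routing engine, so this filter does not apply
    suffix = name[len(RE_PREFIX):].strip()
    return suffix.isdigit()


# Illustrative rows only, not captured from a real device.
rows: List[FruRow] = [
    {"jnxFruName": "Routing Engine 0", "jnxFruSlot": 0, "jnxFruState": 6},
    {"jnxFruName": "Routing Engine", "jnxFruSlot": -1, "jnxFruState": 1},
]

kept_by_slot = [r["jnxFruName"] for r in rows if has_valid_slot(r)]
kept_by_name = [r["jnxFruName"] for r in rows if passes_name_filter(r)]
```

On these sample rows both filters keep only “Routing Engine 0”. The slot check seems the sturdier of the two, since it does not depend on how the FRU happens to be named.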
Thanks for your reply and sorry for not seeing it earlier. But indeed, what you have explained is exactly what Juniper has just finally confirmed to me:
So indeed, basically, if it’s VC capable but isn’t set up in a VC, it’ll show an “unnumbered” and “unknown” RE.
As you also mentioned, we could fix this by validating jnxFruSlot during the discovery process, since the unnumbered RE does not return a valid slot number, as per the example below, taken from a QFX5100:
At least there’s a chance that they will clarify this on their end!
When I asked the following:
Could it be considered “safe” and consistent to validate the “unnumbered” RE by looking at its jnxFruSlot value, and to discard it if that value is “-1”?
This is what they answered:
-1 is defined as null which can be equivalent to unknown for snmp walks. In the given scenario that there is no physical RE installed (or linecard/backup), it’s safe to assume that -1 also means unknown as it does for the QFX.
And they added this in regards to the overall information about the situation:
I’m not finding a KB on this so I will write one up – this will take a month or two for it to be approved but there will eventually be a KB on this. I’ve also requested that this is added to the pathfinder mib walk finder.