After daily.sh ran today and pulled commit a64bd45dbc733280dfc8d28e39a1df0a03adbfaf, Brocade BGP is broken.
The reason seems to be the matching of the BGP4V2 MIB (which on Brocade supports VRFs) and the BGP4 MIB (which, as commented in the commit, does not support VRFs).
Because of this, all Brocade devices raised BGP down alerts.
My simple fix is to change bgp-peers.inc.php line 101 to bgp4V2PeerRemoteAs.1, which only walks the default VRF. I guess the best solution for Brocade devices would be to use only the BGP4V2 MIB, as this will include all BGP sessions, not only those in the default VRF.
If more info is needed, I'm glad to try and help.
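To illustrate the idea behind walking bgp4V2PeerRemoteAs.1 only: if the first index component identifies the routing instance, restricting the walk to .1 keeps only rows from that instance. The following is a minimal Python sketch, not LibreNMS code. The function name filter_instance, the sample index layout and the AS numbers are all hypothetical, and the assumption that instance 1 is the default VRF comes from the post above.

```python
def filter_instance(walk, instance="1"):
    """Keep only rows whose first index component equals `instance`."""
    return {
        index: value
        for index, value in walk.items()
        if index.split(".", 1)[0] == instance
    }


# Hypothetical walk result: "<instance>.<rest of index>" -> remote AS.
# The index layout is made up for illustration only.
sample_walk = {
    "1.1.4.192.0.2.1": 64500,
    "1.1.4.192.0.2.2": 64501,
    "2.1.4.198.51.100.1": 64502,
}

default_vrf_only = filter_instance(sample_walk)
print(default_vrf_only)
```

With this sample data, the row under instance 2 is dropped, which mirrors the trade-off mentioned above: sessions outside the default VRF are no longer seen.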
Unfortunately, I also experienced this issue, with several routers alerting about down BGP sessions. In the alerts, all the peer states were empty, so the state is either not being parsed correctly or not being fetched correctly.
Theoretically, both MIBs could still be used, but the results for the global routing table (default VRF) should match. The BGP4V2 MIB should then only contribute the sessions from the non-default VRFs. Together, this would result in the complete BGP table.
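The combination described above could look roughly like the following Python sketch. It is only an illustration of the idea, not the LibreNMS implementation: the names merge_peers, find_mismatches and DEFAULT_VRF, as well as the input shapes, are assumptions made for this example.

```python
DEFAULT_VRF = "default"  # hypothetical label for the global routing table


def merge_peers(bgp4_peers, bgp4v2_peers):
    """Use BGP4-MIB for the default VRF and BGP4V2-MIB for all other VRFs.

    bgp4_peers:   {peer_ip: state}
    bgp4v2_peers: {(vrf, peer_ip): state}
    """
    merged = {(DEFAULT_VRF, ip): state for ip, state in bgp4_peers.items()}
    for (vrf, ip), state in bgp4v2_peers.items():
        if vrf == DEFAULT_VRF:
            continue  # default VRF is already covered by BGP4-MIB
        merged[(vrf, ip)] = state
    return merged


def find_mismatches(bgp4_peers, bgp4v2_peers):
    """Return default-VRF peers whose state differs between the two MIBs."""
    return sorted(
        ip
        for (vrf, ip), state in bgp4v2_peers.items()
        if vrf == DEFAULT_VRF and bgp4_peers.get(ip) != state
    )


# Hypothetical sample data using documentation addresses.
bgp4 = {"192.0.2.1": "established", "192.0.2.2": "idle"}
bgp4v2 = {
    ("default", "192.0.2.1"): "established",
    ("default", "192.0.2.2"): "established",
    ("customer-a", "198.51.100.1"): "established",
}

print(merge_peers(bgp4, bgp4v2))
print(find_mismatches(bgp4, bgp4v2))
```

The mismatch check reflects the expectation that both MIBs agree on the default VRF; any peer it reports would need a closer look before trusting the merged table.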
@martijn-schmidt: Could you join this topic and give us some feedback? It seems that the PR (https://github.com/librenms/librenms/pull/10941) has some impact in its current form. Thanks.
After deploying the BGP-peers code from PR #10941 on our production LibreNMS instance, it turned out that there are severe scaling issues with the BGP4V2-MIB on the NetIron platform when there is a large number of sessions (in our case >1000 BGP peers): this results in increased MP-CPU load when polling, think a 25% overall increase.
Moreover, it appears that the router does a show ip bgp summary on the CLI to retrieve the data for BGP4V2-MIB and then only displays the output from the first "page". This means the output from all subsequent "pages" is missed.
Unfortunately, these two issues weren't uncovered during my tests in the lab. My apologies for that. I have opened PR #11096 to revert the changes to Brocade BGP peer discovery/polling.
The BGP code from this PR is now reverted. @Elias and @robje, please do a ./daily.sh AND a ./discovery.php -h "xx" and confirm that the behaviour is back to what it was previously.
Thanks, I ran the daily and will see how it behaves after the next alert check.
It still looks funky:
so for all peers the state is empty or couldn’t be parsed.
It seems to occur only on IPv6 peers; the IPv4 peers on the same router are properly detected and their state is in the database as normal.
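One way to confirm such an observation is to count the peers with an empty state per address family. The following Python sketch assumes the peer list has already been exported as a mapping of address to state; the function name empty_state_by_family and the sample data are hypothetical and not taken from LibreNMS.

```python
import ipaddress
from collections import Counter


def empty_state_by_family(peers):
    """Count peers with an empty or missing state, grouped by IP version."""
    counts = Counter()
    for address, state in peers.items():
        if not state:  # treats "" and None as "state not available"
            version = ipaddress.ip_address(address).version
            counts[f"ipv{version}"] += 1
    return dict(counts)


# Hypothetical peers using documentation address ranges.
sample_peers = {
    "192.0.2.1": "established",
    "192.0.2.2": "established",
    "2001:db8::1": "",
    "2001:db8::2": None,
}

print(empty_state_by_family(sample_peers))  # {'ipv6': 2}
```

If only the ipv6 key shows up in the result, the empty states are limited to IPv6 sessions, as described above.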
@Elias The old code didn't support IPv6 at all; that was what the original PR tried to address. Now that the BGP-peers part of the PR has been reverted, the poller code for IPv6 sessions no longer exists, which is why they're showing as empty.
Once you run a rediscovery for that device with the reverted code, the IPv6 sessions should disappear and everything will be back to the way it was. Please tag me in this thread if that is not the case and I'll help to investigate.
Correct, @Elias: you need a rediscovery for all Brocade devices after running ./daily.sh.
OK, good to know. Thanks for the info!
Looks good here too. Thanks, everyone.