Auto discovery not working on Lenovo switches

Hi, I set up auto discovery via LLDP. It works on my HP switches but not on my Lenovo switches. In the log I just see this:

On HP I also get the ‘Check name lookup’ error, but the device is added afterwards.

I guess this is something specific to the Lenovo switches. Any ideas how to troubleshoot this?

====================================

Component | Version
--------- | -------
LibreNMS  | 1.67-45-g78fa53962
DB Schema | 2020_08_28_212054_drop_uptime_column_outages (173)
PHP       | 7.3.19-1~deb10u1
Python    | 3.7.3
MySQL     | 10.3.23-MariaDB-0+deb10u1
RRDTool   | 1.7.1
SNMP      | NET-SNMP 5.7.3
====================================

[OK] Composer Version: 1.10.13
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[WARN] Your install is over 24 hours out of date, last update: Fri, 18 Sep 2020 14:49:20 +0000
[FIX]:
Make sure your daily.sh cron is running and run ./daily.sh by hand to see if there are any errors.

Thanks
Martin

This is linked to missing DNS resolution. If the switch name cannot be resolved via DNS, the device can’t be added.
You can skip the DNS requirement and allow devices to be added by IP address with the “Discovery by IP” config item in config.php:
$config['discovery_by_ip'] = true;

Doc: https://docs.librenms.org/Extensions/Auto-Discovery/

No, sorry. I already have this line in my config.php. In addition, the switches that are auto discovered via my HP switches are not resolvable by DNS either.

Maybe the Lenovo switches do not provide the IP address correctly via LLDP, which could explain why it fails.

  • You can look at what the LLDP packets look like from both the HP switches and the Lenovo switches and compare them. You can do that with Wireshark and a PC plugged directly into the switch.
  • You can look at the LLDP command/webpage on the switches to see how they see themselves.
  • You can run the LibreNMS discovery in verbose and debug mode as well (./discovery.php -h <deviceID> -m discovery-protocols -v -d).

This should give you some hints on what’s going on.

Thanks for the troubleshooting tips.

This is what I see on HP:
SQL[SELECT * FROM ports WHERE port_id = ? [1939] 0.3ms]

array (
1 =>
array (
'lldpRemChassisIdSubtype' => '4',
'lldpRemChassisId' => 'F0 92 1C 70 07 00',
'lldpRemPortIdSubtype' => '7',
'lldpRemPortId' => '10',
'lldpRemPortDesc' => '10',
'lldpRemSysName' => 'SW.37',
'lldpRemSysDesc' => 'HP J9774A 2530-8G-PoEP Switch, revision YA.16.10.0002, ROM YA.15.20 (/ws/swbuildm/rel_ajanta_qaoff/code/build/lakes(swbuildm_rel_ajanta_qaoff_rel_ajanta))',
'lldpRemSysCapSupported' => '20',
'lldpRemSysCapEnabled' => '20',
'lldpRemManAddr' => '10.255.255.37',

While this is what I see on Lenovo:
SQL[SELECT * FROM ports WHERE port_id = ? [185] 0.16ms]

array (
9 =>
array (
'lldpRemChassisIdSubtype' => '4',
'lldpRemChassisId' => 'D0 67 26 9E B2 00',
'lldpRemPortIdSubtype' => '7',
'lldpRemPortId' => '27',
'lldpRemPortDesc' => '27',
'lldpRemSysName' => 'SW.63',
'lldpRemSysDesc' => 'HP J9776A 2530-24G Switch, revision YA.16.10.0002, ROM YA.15.20 (/ws/swbuildm/rel_ajanta_qaoff/code/build/lakes(swbuildm_rel_ajanta_qaoff_rel_ajanta))',
'lldpRemSysCapSupported' => '20',
'lldpRemSysCapEnabled' => '20',
'lldpRemManAddr' => 'ffff:3f',

It looks like the value in ‘lldpRemManAddr’ is not correct on the Lenovo. Does this mean the Lenovo reports back the wrong value?
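
A quick way to sanity-check the two dumps above is to pull every lldpRemManAddr value out of the pasted debug output and test whether it parses as an IP address. The sketch below is only an illustration: the function names are made up for this example and it is not part of LibreNMS. It accepts both straight and typographic quotes, since forum software tends to rewrite them.

```python
import ipaddress
import re

# Matches lines such as: 'lldpRemManAddr' => '10.255.255.37',
# Straight and typographic quotes are both accepted.
MAN_ADDR_RE = re.compile(r"lldpRemManAddr['’]\s*=>\s*['‘]([^'’]*)['’]")


def extract_man_addrs(debug_output):
    """Return every lldpRemManAddr value found in pasted debug output."""
    return MAN_ADDR_RE.findall(debug_output)


def is_usable_ip(value):
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # The two values reported in this thread.
    pasted = """
    'lldpRemManAddr' => '10.255.255.37',
    'lldpRemManAddr' => 'ffff:3f',
    """
    for addr in extract_man_addrs(pasted):
        print(addr, "valid" if is_usable_ip(addr) else "NOT a valid IP address")
```

With the values from this thread, the HP value parses as an IPv4 address while the value seen via the Lenovo does not parse as either IPv4 or IPv6.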

Difficult to say until you use the tips provided to find out.

When I run ‘sh lldp internal info neighbors’ on the Lenovo I can see the correct IP:

remote_max_frame_size :0
remote_sys_name :SW.63
remote_sys_descr :HP J9776A 2530-24G Switch, revision YA.16.10.0002, ROM YA.15.20 (/ws/swbuildm/rel_ajanta_qaoff/code/build/lakes(swbuildm_rel_ajanta_qaoff_rel_ajanta))
remote_sys_cap :0x4
remote_sys_cap_enabled :4
remote_mgmt_addr_list :0xaeb220
remote_mgmt_addr :10.255.255.63
remote_mgmt_addr_sub_type :1

Unfortunately I don’t have access to the switches at the moment to take Wireshark traces.

Does this help anyway to narrow down the problem?

Try the 3rd tip, the one with the debug and verbose discovery.

Be careful not to disclose any private or sensitive information (the SNMP community string will appear in cleartext, for instance).

I think your 3rd tip is what I already posted today: the HP and Lenovo output above, where ‘lldpRemManAddr’ is ‘10.255.255.37’ on the HP but ‘ffff:3f’ on the Lenovo.

Or do you want me to post the whole output?

I don’t need the SQL output but the CLI output from ./discovery.php. This will show the SNMP request, the SNMP reply and how it is interpreted by LibreNMS. Most probably, the reply from the Lenovo is not following the rules, so LibreNMS would need to do something special to understand it.

I think this is what you are looking for:

SNMP['/usr/bin/snmpbulkwalk' '-v3' '-l' 'authPriv' '-n' "" '-a' 'SHA' '-A' 'myAuth' '-u' 'myUser' '-x' 'AES' '-X' 'myPriv' '-OQun' '-m' 'LLDP-MIB' '-M' '/opt/librenms/mibs:/opt/librenms/mibs/lenovo' 'udp:10.255.255.166:161' '.1.0.8802.1.1.2.1.4.2.1.3']
.1.0.8802.1.1.2.1.4.2.1.3.44.2.1.6.44.118.138.55.162.0 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.48.410065.3.1.10.255.255.167 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.48.410069.2.1.10.255.255.167 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.176.410020.9.1.10.255.255.63 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.176.410047.6.1.172.20.2.237 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.176.410048.7.1.172.20.2.237 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.176.410057.4.1.10.255.255.164 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.176.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.177.410018.10.1.10.255.255.62 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.177.410057.4.1.10.255.255.164 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.177.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.178.410023.12.1.10.255.255.51 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.178.410024.11.1.10.255.255.52 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.179.410025.13.1.10.255.255.53 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.180.410057.4.1.10.255.255.164 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.180.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.186.410040.14.1.10.255.255.59 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.189.410039.15.1.10.255.255.65 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.212.410044.18.1.10.255.255.66 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.997967.410042.19.1.10.255.255.69 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.1170401.410043.22.1.10.255.255.67 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.1267187.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.1267192.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.1267199.410061.5.1.10.255.255.165 = ifIndex
.1.0.8802.1.1.2.1.4.2.1.3.1267207.410061.5.1.10.255.255.165 = ifIndex

Or should I post the whole SNMP stuff?

If you attach the walk of “.1.0.8802”, that should be enough to see what’s going on.

Please use this command (adapting your auth details):

/usr/bin/snmpbulkwalk -v3 -l authPriv -n "" -a SHA -A myAuth -u myUser -x AES -X myPriv -OQun -m LLDP-MIB -M /opt/librenms/mibs:/opt/librenms/mibs/lenovo udp:10.255.255.166:161 .1.0.8802 | ./pbin.sh

(First check that the snmpbulkwalk command works OK on its own, then pipe the result to the pbin script, and post the link to the data here.)

And if it is not enough, yes, all the discovery will be needed (using the pbin.sh script as well, so you don’t overload this community forum)

Please see here:
https://p.libren.ms/view/03eb01c6
https://p.libren.ms/view/ab35c08b

Thanks

The issue seems to be in the walk of .1.0.8802.1.1.2.1.4.2.1.3 that you posted above.

It does not recognize the OIDs there, which is not normal imho. This block includes the management IP addresses to reach the remote devices, and that would allow the discovery to jump to the next hops.
Most probably the block is malformed and does not follow the MIB definition.
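
To make the suspected malformation concrete, here is a minimal sketch that decodes the index of the rows in the .1.0.8802.1.1.2.1.4.2.1.3 walk. It assumes the LLDP-MIB index order for lldpRemManAddrTable (lldpRemTimeMark, lldpRemLocalPortNum, lldpRemIndex, lldpRemManAddrSubtype, lldpRemManAddr) and the usual SMI rule that a variable-length OCTET STRING index is preceded by a length sub-identifier. The function names are made up for this example, and whether LibreNMS misreads the index in exactly this way is an assumption, not something confirmed in this thread.

```python
# Column OID of lldpRemManAddrIfSubtype, as walked in this thread.
BASE = ".1.0.8802.1.1.2.1.4.2.1.3."


def split_index(oid):
    """Return the index sub-identifiers that follow the column OID."""
    if not oid.startswith(BASE):
        raise ValueError("not an lldpRemManAddrIfSubtype row: " + oid)
    return [int(part) for part in oid[len(BASE):].split(".")]


def parse_strict(index):
    """Parse assuming a length sub-identifier precedes the address octets.

    Returns (subtype, octets), or None if the length does not match.
    """
    _time_mark, _local_port, _rem_index, subtype, length, *octets = index
    if length != len(octets):
        return None
    return subtype, octets


def parse_without_length(index):
    """Parse assuming the agent left out the length sub-identifier."""
    _time_mark, _local_port, _rem_index, subtype, *octets = index
    return subtype, octets


def misread_hex(index):
    """Hex digits left if the first address octet is consumed as a length."""
    return "".join(f"{octet:02x}" for octet in index[5:])


if __name__ == "__main__":
    # Row taken from the Lenovo walk in this thread.
    lenovo = split_index(BASE + "176.410020.9.1.10.255.255.63")
    print(parse_strict(lenovo))          # None: "10" is not a valid length here
    print(parse_without_length(lenovo))  # (1, [10, 255, 255, 63])
    print(misread_hex(lenovo))           # ffff3f

    # Hypothetical row with the length sub-identifier (4) present.
    with_length = split_index(BASE + "176.410020.9.1.4.10.255.255.63")
    print(parse_strict(with_length))     # (1, [10, 255, 255, 63])
```

If a parser treats the first address octet (10) as the length, the remaining octets 255.255.63 come out as the hex digits ffff3f, which are the same digits as the ‘ffff:3f’ value shown for lldpRemManAddr earlier in the thread.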

So you think the switch itself (or better said its firmware) is the issue?

Probably it does not completely follow the MIB. There is probably a way to hack LibreNMS to handle this incorrect firmware behaviour.
I would start by upgrading the switch to the latest firmware and checking whether the issue is still there.

Sorry for the delay, but I was not able to upgrade the firmware until now. However, the upgrade does not seem to have changed anything. I posted another discovery: https://p.libren.ms/view/81f1c6db

Is it possible to work around this and get auto discovery working for those Lenovos?

Thanks
Martin

Hi,

You could try to hack the code, but that would break future LibreNMS upgrades, and you would waste time whenever you want to pick up an enhancement to LibreNMS.
The only good solution is to open a ticket with Lenovo support and ask them to correct their code and follow the standards. That way, LibreNMS and any other software following the same standards will just work.

Bye