Some applications empty on one host

dvl · 28 December 2019 21:38

I am trying to find out why one host is not displaying data for SMART, UPS apcups, and ZFS. I am using these settings in snmpd.conf:

extend smart            /usr/local/etc/snmp/smart
extend ups-apcups       /usr/local/etc/snmp/ups-apcups
extend zfs              /usr/local/etc/snmp/zfs-freebsd

To help debug this issue, how can I see what this host is returning for those applications?

I have already compared this host against working hosts with respect to configuration settings, scripts, and script output. Now I want to see what LibreNMS is seeing and compare this host to working hosts.

I have tried:

/usr/local/bin/snmpbulkwalk -v3 -l authPriv -n "" -a SHA -A '[redacted]' -u roUser -x AES -X '[redacted]' -OUneb -t 10 udp:x8dtu.vpn.unixathome.org:161

This output contains nothing related to SMART, UPS apcups, and ZFS that I can find, so I think this is not the appropriate command for what I want to see.
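One likely reason nothing shows up: with no OID argument, the net-snmp walk tools start at the mib-2 subtree, while `extend` output lives under the net-snmp enterprise subtree (NET-SNMP-EXTEND-MIB). A minimal sketch, assuming the same credentials and host as the command above, that appends the numeric OID of nsExtendOutputFull as the walk root. The helper name `walk_extend_args` is made up for illustration.

```python
import shlex

# Numeric OID of NET-SNMP-EXTEND-MIB::nsExtendOutputFull (one row per extend name).
NS_EXTEND_OUTPUT_FULL = ".1.3.6.1.4.1.8072.1.3.2.3.1.2"


def walk_extend_args(base_args):
    """Return a copy of base_args with the extend-output subtree appended as the walk root."""
    return list(base_args) + [NS_EXTEND_OUTPUT_FULL]


# Same options as the original command; the passphrases stay redacted placeholders.
base = [
    "/usr/local/bin/snmpbulkwalk", "-v3", "-l", "authPriv", "-n", "",
    "-a", "SHA", "-A", "[redacted]", "-u", "roUser",
    "-x", "AES", "-X", "[redacted]", "-OUneb", "-t", "10",
    "udp:x8dtu.vpn.unixathome.org:161",
]

# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(walk_extend_args(base)))
```

Running the printed command against a working host and against the empty one should show whether the agent returns anything for the smart, ups-apcups, and zfs extends.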

Suggestions please.

SinisterCrayon · 3 February 2020 19:30

I don’t know if this was your situation… but is this host across a router on another subnet?

I ask because I had this exact problem with ZFS. I have three machines running ZFS and one of them had completely blank ZFS graphs despite successfully discovering the application (yes, I had installed the zfs-linux SNMP script and added the extend line properly). Drove me nuts until I finally used

snmpget -v2c -c public -Oqv -m NET-SNMP-EXTEND-MIB udp:hostname:161 nsExtendOutputFull.3.122.102.115

I got the response “Timeout: No Response from udp:hostname:161”. As a test I enabled tcp and switched LibreNMS to use TCP for SNMP on that host and voila; data started coming in. I know you’d need a different command, but this is just for reference.
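For reference, the numeric suffix in that command is just the extend name encoded as an OID index: the length of the name followed by the ASCII code of each character, so `zfs` becomes `3.122.102.115`. A small sketch that builds the equivalent snmpget arguments for other extend names; the helper names `extend_index` and `snmpget_args` are hypothetical, and the v2c/community flags are assumptions that would need replacing with the v3 options on an SNMPv3 host.

```python
def extend_index(name: str) -> str:
    """Encode an extend name as an OID index: its length, then each character's ASCII code."""
    data = name.encode("ascii")
    return ".".join(str(n) for n in [len(data), *data])


def snmpget_args(host, name, transport="udp", community="public"):
    """Build snmpget arguments that fetch the full output of one extend (assumes SNMP v2c)."""
    return [
        "snmpget", "-v2c", "-c", community, "-Oqv",
        "-m", "NET-SNMP-EXTEND-MIB",
        f"{transport}:{host}:161",
        f"nsExtendOutputFull.{extend_index(name)}",
    ]


# The three extend names from the snmpd.conf at the top of the thread.
for name in ("smart", "ups-apcups", "zfs"):
    print(name, extend_index(name))
```

Passing `transport="tcp"` gives the same query over TCP, which makes it easy to compare the two transports against the same host.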

While everything else seemed to be working across UDP (including my NFS statistics), it was specifically the ZFS stats that weren’t coming back across UDP. I don’t really know why offhand, but this was an easy solution that hasn’t resulted in any notable issues. I suspect it is something in the zfs-linux script or the system that it just doesn’t like for some reason.

dvl · 3 February 2020 20:06

I did a bunch of testing and experiments over the end of December. I recorded it all on Twitter (it wasn’t something for a blog post).

Start here and scroll to the top, then read down: https://twitter.com/DLangille/status/1212159969463865346

Looking at the LibreNMS SNMP tab for each of my hosts in question: only one uses TCP.

I just raised myself a ticket to play with this another time. Your timeout may be it. I think you’re onto something.

SinisterCrayon · 3 February 2020 20:47

Thanks… interesting stuff. I will say that the affected host takes longer to respond than I’d expect when I run that same command from a host on the same subnet: about 3 seconds before I get data back. I do think a timeout across my router is involved, but I’m not 100% sure where the problem is coming from.
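To put a number on that delay, one option is to time the same query from hosts on each side of the router. A rough sketch, assuming Python 3.7+ is available on the machine running the test; `time_command` is a hypothetical helper, and the demo call times a trivial Python command so the block runs anywhere. Substitute the snmpget argument list to measure the real query.

```python
import subprocess
import sys
import time


def time_command(args, timeout=30.0):
    """Run a command and return (elapsed_seconds, exit_code); exit_code is None if it timed out."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.perf_counter() - start, code


# Harmless stand-in command; replace with e.g.
# ["snmpget", "-v2c", "-c", "public", "-Oqv", "-m", "NET-SNMP-EXTEND-MIB",
#  "udp:hostname:161", "nsExtendOutputFull.3.122.102.115"]
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"exit={code} elapsed={elapsed:.2f}s")
```

If the elapsed time on the same subnet is already close to the poller's SNMP timeout, the extra hop alone could be enough to push the query over it.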

My network setup is pretty simple; my inter-VLAN router is an L3 switch (Dell N4064)… the LibreNMS host is hung off a simple L2 switch in a different location with a simple gig connection between the two.

It does seem specifically to be related to applications / snmp extensions. All the basic SNMP stuff seems to be coming back without issue.