Monitoring proxmox, reboot proxmox nodes, fans and voltages wonky till rediscovery

When asking for help and support, please provide as much information as possible. This should include:

  • Steps to reproduce an issue.
  • The output of ./validate.php

I’ll cover the steps below the validate output:

==========================================================

Component Version
LibreNMS 2ba7093d9caaf3627a721df00a31b667525f6804
DB Schema 199
PHP 5.5.9-1ubuntu4.21
MySQL 5.5.55-0ubuntu0.14.04.1
RRDTool 1.4.7
SNMP NET-SNMP 5.7.2

==========================================================

[WARN] Your install is out of date, last update: Tue, 04 Jul 2017 21:41:22 +0000
[OK] Database connection successful
[OK] Database schema correct

First, it hasn’t updated in the last day because it’s been offline while I work on underlying infrastructure upgrades that are unrelated. It has been successfully upgrading for months now.

Second, the issue…

I have LibreNMS monitoring two Proxmox nodes that are in a cluster. Sometimes I need to reboot a node for updates or other reasons. When I reboot one of these nodes, LibreNMS starts getting wonky info about fan RPM and voltage readings for that node. So far this has happened every time I reboot a node, and rediscovering the device makes the values line up correctly again.

For example, I get the +3.3v reading in the +5v field, and other sensors are misaligned in the same way.

After I rediscover the node I also have to set the low/high thresholds again each time; they seem to get reset for some reason.

This issue has been going on for many months, and has spanned multiple proxmox upgrades including 4.3, 4.4, and now 5.0.

I do not see any indication that this is a proxmox-side issue.

I am unsure when this issue actually started.

The nodes are polled via SNMP v3.

As per this morning, I still need the output of:

SELECT * FROM sensors WHERE device_id=X\G

If you are saying this happens when you do a rediscovery, then run one and provide a second output of that command.

Please pastebin all results.

Ack! I remember you telling me this now. But I’d like a bit of clarification if you could.

  1. Do you want me to reboot the node and put it in the “bad” state again, then run this query, pastebin that, then run rediscover, re-align the lows/highs, re-run the query, pastebin that in another pastebin, and link that too?
  2. Where it says “device_id=X\G”, do I need to adjust the “X\G” to something specific to my device, or is that explicitly the query you want me to use?
  3. This looks like a SQL query, is that correct? And you want me to run it on the LibreNMS host for this info?

Sorry about losing track of this part, I know you asked me last night about it, but I forgot. :frowning:

I do need this clarification though, to make sure I’m giving you accurate data.

Thanks for your patience! :slight_smile:

For those reading, laf clarified on IRC:

  1. I need to run the query in the bad and good state, pastebin each.
  2. I need to replace “X” with the device ID, but keep the “\G” too.
  3. It is a SQL query.
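
For anyone else unsure about point 2, here is a minimal sketch of the substitution in Python. The helper name `sensors_query` and the device ID 42 are made up for illustration; only the query text itself comes from this thread. The trailing "\G" is the mysql client's terminator for vertical (one column per line) output, so it stays exactly as written.

```python
def sensors_query(device_id):
    """Build the query from this thread for one device.

    `device_id` replaces the "X" placeholder; the trailing "\\G" is kept
    so the mysql client prints each row vertically.
    """
    # int() rejects anything that is not a plain number, so the ID cannot
    # smuggle extra SQL into the statement.
    return f"SELECT * FROM sensors WHERE device_id={int(device_id)}\\G"


# Hypothetical device ID, for illustration only.
print(sensors_query(42))
```

Paste the printed line into the mysql client on the LibreNMS host, once in the bad state and once after rediscovery.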

I’ll handle this shortly! :slight_smile:

Here they are:

  1. Before : https://pastebin.com/EfsJV8eK
  2. After : https://pastebin.com/YgVfH6mW

I want to clarify on a few things.

  1. After I rediscovered the device, the sensors had low/high thresholds that were indicative of the “wrong” values (for example, 3.3v had low/highs for 5v). So I manually adjusted them. The After pastebin shows the state AFTER I MANUALLY ADJUSTED THE LOW/HIGH THRESHOLDS TO BE CORRECT. This includes the fan RPM thresholds.
  2. Thus far the issue happens every time.
  3. Before is when the settings are incorrect after the reboot, BEFORE I do the rediscovery.
  4. Generally After is what I would expect to be the correct values.

Awesome, the index changes, so even if a discovery fixes the label, the historical data will be all jumbled.

Yup :frowning:

@BloodyIron I don’t think we can fix this. lmsensors / net-snmp is changing the indexes associated with each sensor, so we are going to need a rediscovery after every restart.
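
To make the index shift visible from the two query outputs, here is a rough Python sketch that compares a before and an after snapshot by description. It assumes each row has been reduced to its `sensor_descr` and `sensor_index` columns; the function name `moved_sensors` and the sample values are invented for illustration and are not taken from the pastebins.

```python
def moved_sensors(before, after):
    """Return {description: (old_index, new_index)} for sensors whose index changed."""
    # Map each description to the index it had in the first snapshot.
    old = {row["sensor_descr"]: row["sensor_index"] for row in before}
    moved = {}
    for row in after:
        descr = row["sensor_descr"]
        # Only report sensors present in both snapshots with a different index.
        if descr in old and old[descr] != row["sensor_index"]:
            moved[descr] = (old[descr], row["sensor_index"])
    return moved


# Invented sample rows: after the reboot every sensor sits one index higher.
before = [
    {"sensor_descr": "+3.3V", "sensor_index": "7"},
    {"sensor_descr": "+5V", "sensor_index": "8"},
]
after = [
    {"sensor_descr": "+3.3V", "sensor_index": "8"},
    {"sensor_descr": "+5V", "sensor_index": "9"},
]

for descr, (old_idx, new_idx) in moved_sensors(before, after).items():
    print(f"{descr}: index {old_idx} -> {new_idx}")
```

In this made-up case index 8 used to be +5V and is now +3.3V, which is the kind of shift that would put the +3.3v reading in the +5v field until the device is rediscovered.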

Perhaps I should link this thread with the Proxmox forums so I can get them to chime in: https://forum.proxmox.com/threads/snmp-indexing-shifts-on-reboot-librenms.35527/