Wrong Billing Record [in_delta]

Hi There,

I am having a few difficulties with LibreNMS and I am desperate for help.

  1. Several times now I have found a wrong record in the billing DB (photos attached). I noticed a spike in usage that does not look plausible. Digging into the traffic graphs, records and DB, I found that the record comes from one specific polling cycle (5 min). The strange part is that we still have a legacy Observium instance polling the same device, and it does not record that value. I checked the code and it seems the in_delta variable passes through three different conditionals. I assume the relevant one is the branch that follows an actual snmp_get() to the device and then subtracts the previous polled value to calculate the delta → $port_data['in_delta'] = ($port_data['in_measurement'] - $port_data['last_in_measurement']);. That leads me to think the bad value is what was actually polled from the device, but I have not been able to prove it. I got lost at that stage and have not found the root cause or a solution. Do you have any suggestions? How can I trace that polling cycle? Which log would be relevant to check?

  2. The other issue I noticed concerns the billing cycle. Even though the timestamps are fine, the GUI graphs do not refresh/roll over on the first of the month; they do so on the second. How can I change that? For some reason this is breaking a custom script in production that polls LibreNMS billing data, and it only happens on the first of each month.

[Attached screenshot: 2019-05-31 15_51_18-LibreNMS_librenms_bill_data-]



I hope you can shed some light on this.

Thanks heaps!

Can you tell whether the billing code accounts for rolled SNMP counters? (Google that if you don’t know what it is.)

Hi Tony,

I am not sure what I am looking for. I tried searching for ‘billing code accounting for rolled SNMP counters’ but I can’t find any specific information about how the billing code handles them.

The device supports 64-bit counters and I am assuming those are the ones being used. I am not sure how I could check that.

Could also be the fault of the device returning incorrect data.

I think there is a setting to set what your billing cycle is.

The device is a pair of FGTs in HA, and the polling points to the loopback address of the HA. It has multiple vdoms configured, and this specific service is the only one with the issue so far. I have not found a pattern yet; it has happened maybe three times in the last two months. I checked the syslogs from the FGTs and there was nothing unusual, and no events related to a fail-over. We thought the counters might get messed up by a fail-over because the polling points to the HA interface, but other services get their data from the same interface without this issue, and there was no fail-over when the data was collected, or even on that whole day.

Hi,

We have completed a debug of ‘poll-billing.php’. We found that incorrect data is received through SNMP polling, and also that the code does not handle this case, which leads to wrong data being written to the DB.

The device being polled is a Fortigate 1000C with multiple links, and normally 2 of them (virtual links) show this problem. There is also an Observium box polling the same device; it does handle this scenario, and no wrong values are found there. Also, there is no easy way to check this on the FGT. Could it be easier and quicker to handle this case in the code?

We did find, though, a case that ‘poll-billing.php’ does not consider and where it perhaps assumes a counter wrap. Is there somebody with expertise in this billing module who could help us review this scenario?

For reference, this is a record of the variable states during 6 polling cycles. The wrong value is received in polling cycle 3 but is only reflected in the DB on the next polling cycle.

1. Wrong data is received in $port_data['in_measurement'] and $port_data['out_measurement'].
2. This case matches the third conditional (Line 76 - source code). The delta is not calculated in this case; the previous value is reused instead: $port_data['in_delta'] = $port_data['last_in_delta'];
3. The DB is updated → the counters in bill_port_counters are updated with the wrong in_counter.
4. On the next cycle a legitimate value is received and matches the second conditional (Line 74 - source code).
5. The previous (wrong) counter is used to calculate the delta → this is where the crazy value is written to the DB.
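
The five steps above can be reproduced with a small stand-alone sketch. This is a simplified model, not the actual poll-billing.php source: the function name modelDelta(), the sample counter values, and the exact form of the three conditionals are assumptions based only on the behaviour described in this thread.

```php
<?php
// Simplified model of the delta logic described above.
// NOT the real poll-billing.php code; names and conditions are assumptions.
function modelDelta(int $measurement, int $lastMeasurement, int $lastDelta): int
{
    if ($lastMeasurement <= 0) {
        // First conditional (assumed): no usable previous counter.
        return 0;
    }
    if ($measurement >= $lastMeasurement) {
        // Second conditional: counter moved forward, compute the delta.
        return $measurement - $lastMeasurement;
    }
    // Third conditional: counter went backwards, reuse the previous delta.
    return $lastDelta;
}

// Hypothetical samples: polling cycle 3 returns a bogus low value.
$samples = [5000000000, 5000100000, 1234, 5000300000, 5000400000];

$lastCounter = 4999900000; // counter stored before the first cycle
$lastDelta   = 100000;
$deltas      = [];

foreach ($samples as $sample) {
    $delta    = modelDelta($sample, $lastCounter, $lastDelta);
    $deltas[] = $delta;
    // The stored counter is overwritten even when the sample was bogus.
    $lastCounter = $sample;
    $lastDelta   = $delta;
}

print_r($deltas);
```

In this model the bogus sample in cycle 3 produces a normal-looking delta (the previous one is reused), and the spike of roughly 5 billion only appears in cycle 4, when the legitimate counter is compared against the bogus stored value.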

I believe this case could be handled in the code so that the DB is not updated with the wrong value.

Would it be possible to review this case? Your assistance is much appreciated.

Hi guys, I would like to hear your thoughts on this outcome if possible.

After digging into the source code, we found the following scenarios:

Scenarios encountered (No1 resolved, 2 and 3 to discuss)

  1. If both SNMP polling queries fail (64/32), the value returned is not numeric --> it is treated as 0 --> this is handled in the first calculation, but the value is stored as a legitimate counter. The problem arises in the next calculation, where the delta is calculated against the 0 value… and a very high delta is the result.
    Solution:
    If both queries fail (64b and 32b), the value returned is FALSE. A conditional needs to be added in poll-billing.php to check for this and skip the DB update on the ‘bill_port_counters’ table.

  2. The SNMP functions used, getValue() and snmp_get(), follow RFC-2665: they query the 64-bit counter first and fall back to the 32-bit counter if the 64-bit query fails --> Case No1 --> this is a problem when the current counter is over 2^32 (beyond the 32-bit counter range); in this case the 32-bit counter returned is not comparable… and the calculation gives a big delta.
    Comment:
    Dependent on 3 to add a conditional and determine which query to use. --> In our case we use 64b counters only, so we have commented out the 32b query for the time being (in function getValue()). --> We added debug output in this scenario to query the 64b counter again and try to capture the error returned by the SNMP query --> No success.

  3. poll-billing.php --> The delta calculation does not distinguish whether the value came from the 32-bit or the 64-bit counter --> this is a problem when the previous counter value (retained in the DB) is over 2^32; in this case the 32-bit counter returned is not comparable… and the calculation gives a big delta.
    Comment:
    Modify the getValue() function to receive the port's counter type (64 or 32) and poll only that one. However, there does not currently seem to be a value in the tables that flags this. Maybe it could be recorded when the adding process is completed?
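
To make the three scenarios concrete, here is a minimal sketch of the kind of check being proposed. Everything in it is hypothetical: safeDelta(), the 'bits' field (a stand-in for the counter-width flag that does not exist in the tables today), and the sample values are illustrations, not LibreNMS code.

```php
<?php
const MAX_32BIT = 4294967295; // 2^32 - 1

/**
 * Hypothetical helper, not part of LibreNMS.
 * $sample and $previous are arrays: ['value' => mixed, 'bits' => 32|64].
 * Returns the delta, or null when the sample should be discarded and the
 * stored counter left untouched.
 */
function safeDelta(array $sample, array $previous): ?int
{
    // Scenario 1: both queries failed, so the value is false/non-numeric.
    if (!is_numeric($sample['value'])) {
        return null;
    }
    // Scenarios 2 and 3: a 32-bit fallback value must not be compared
    // against a stored 64-bit counter (or the other way round).
    if ($sample['bits'] !== $previous['bits']) {
        return null;
    }
    $current = (int) $sample['value'];
    $last    = (int) $previous['value'];
    if ($current >= $last) {
        return $current - $last;
    }
    if ($sample['bits'] === 32) {
        // A 32-bit counter that went backwards is assumed to have wrapped.
        return (MAX_32BIT - $last) + $current + 1;
    }
    // A 64-bit counter going backwards is treated as a reset or bad sample.
    return null;
}

$previous = ['value' => '5000100000', 'bits' => 64];

var_dump(safeDelta(['value' => '5000300000', 'bits' => 64], $previous));
var_dump(safeDelta(['value' => false, 'bits' => 64], $previous));
var_dump(safeDelta(['value' => '705332704', 'bits' => 32], $previous));
```

When null is returned, the caller would skip the update of bill_port_counters, so the next delta is computed against the last good counter. That delta then spans two polling intervals instead of one.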