Alerts not Triggering correctly

We had some issues with the upgrade to 1.44 and the new Transport setup. We got the upgrade working, but since then we have not been receiving emails from the new Transport system.

This was previously working fine. The last time we received an email from the system was in Aug.

The emails stopped at the following point:

The Auto update failed on Sat 18/08/2018 17:42

The following email was sent from the server

We just attempted to update your install but failed. The information below should help you fix this.
warning: unable to access ‘/root/.config/git/ignore’: Permission denied
error: Your local changes to the following files would be overwritten by
merge:
html/includes/common/top-devices.inc.php
html/includes/common/top-interfaces.inc.php
html/includes/common/worldmap.inc.php
Please, commit your changes or stash them before you can merge.
Aborting

We just attempted to update your install but failed. The information below should help you fix this.
warning: unable to access ‘/root/.config/git/attributes’: Permission denied
warning: unable to access ‘/root/.config/git/ignore’: Permission denied
error: Your local changes to the following files would be overwritten by
merge:
includes/definitions.inc.php
Please, commit your changes or stash them before you can merge.
Aborting
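
Both emails say the update aborted because tracked files in the install directory had local modifications. A minimal sketch for listing such files before deciding whether to keep or discard the changes, assuming the install lives in /opt/librenms; `list_local_changes` is a hypothetical helper name, not something LibreNMS ships:

```shell
#!/bin/sh
# list_local_changes is a hypothetical helper, not part of LibreNMS.
# It prints tracked files that have uncommitted modifications in a git checkout.
list_local_changes() {
    repo_dir="${1:-/opt/librenms}"   # assumed install path
    # --porcelain gives stable, script-friendly output;
    # --untracked-files=no hides files git is not tracking.
    git -C "$repo_dir" status --porcelain --untracked-files=no
}
```

Calling `list_local_changes /opt/librenms` as the librenms user should print the files named in the email if they are still modified. From there, `git -C /opt/librenms diff` shows what changed, and `git stash` sets the changes aside so the merge can proceed.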

I have checked through syslog, mail.log, dmesg and mysql/error.log and can’t see any errors in them.

Testing the transport sends an email. If I test the transport as described here:

https://docs.librenms.org/Alerting/Testing/#transports

This also works.

However, live alerts are not sent through the transport.
I have changed one of my rules so it triggers very easily.


[Screenshot: image.png]

However, if I check Recent Events for each of those hosts, I don’t see “Issued critical alert for rule”.
The bottom host shows the following in recent events.


[Screenshot: image.png]

root@netmon00:/opt/librenms# ./validate.php

Component Version
LibreNMS 1.46-4-g2061d74
DB Schema 273
PHP 7.0.32-0ubuntu0.16.04.1
MySQL 10.0.36-MariaDB-0ubuntu0.16.04.1
RRDTool 1.5.5
SNMP NET-SNMP 5.7.3

====================================

[OK] Composer Version: 1.8.0
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[FAIL] Some devices have not completed their polling run in 5 minutes, this will create gaps in data.

It looks like you have several permissions errors and modified files.

You need to fix both. It also looks like you only pasted part of the validate output, which doesn’t help you get things fixed.

Hi Murrant,

Here is the full output. I had to change the permissions on the files:

$ ./validate.php

Component Version
LibreNMS 1.46-48-g43e967f
DB Schema 273
PHP 7.0.32-0ubuntu0.16.04.1
MySQL 10.0.36-MariaDB-0ubuntu0.16.04.1
RRDTool 1.5.5
SNMP NET-SNMP 5.7.3

====================================

[OK] Composer Version: 1.8.0
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[FAIL] Some devices have not completed their polling run in 5 minutes, this will create gaps in data.
[FIX]:
Check your poll log and see: http://docs.librenms.org/Support/Performance/
Devices:
asr-hq-ex3400-01
[FAIL] We have found some files that are owned by a different user than librenms, this will stop you updating automatically and / or rrd files being updated causing graphs to fail.
[FIX]:
sudo chown -R librenms:librenms /opt/librenms
sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
sudo chmod -R ug=rwX /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
Files:
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus6.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus2.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus4.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus0.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus3.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus1.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus5.rrd
/opt/librenms/rrd/lyncsbc.ipperf.local/sensor-state-acAnalogFxsFxoHookState-hookStatus7.rrd
/opt/librenms/rrd/netmon-agent00.brs.ipperf.local/ucd_diskio-sr0.rrd
you have mail
$ ^C
$ sudo chown -R librenms:librenms /opt/librenms
[sudo] password for librenms:
librenms is not in the sudoers file. This incident will be reported.

$ ./validate.php

Component Version
LibreNMS 1.46-48-g43e967f
DB Schema 273
PHP 7.0.32-0ubuntu0.16.04.1
MySQL 10.0.36-MariaDB-0ubuntu0.16.04.1
RRDTool 1.5.5
SNMP NET-SNMP 5.7.3

====================================

[OK] Composer Version: 1.8.0
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[FAIL] Some devices have not completed their polling run in 5 minutes, this will create gaps in data.
[FIX]:
Check your poll log and see: http://docs.librenms.org/Support/Performance/
Devices:
asr-hq-ex3400-01
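
The ownership check that validate.php reports above can be approximated by hand with `find`. This is only a sketch, assuming the install path /opt/librenms and the librenms user from the output above; `list_foreign_files` is a hypothetical helper name:

```shell
#!/bin/sh
# list_foreign_files is a hypothetical helper, not part of LibreNMS.
# It prints every path under a directory that is NOT owned by the given user.
list_foreign_files() {
    base_dir="${1:-/opt/librenms}"   # assumed install path
    owner="${2:-librenms}"           # user name or numeric uid
    find "$base_dir" ! -user "$owner" -print
}
```

An empty result from `list_foreign_files /opt/librenms librenms` means nothing is left to chown. Because the librenms account here is not in the sudoers file, the `chown` fix suggested by validate.php would have to be run from a root shell instead.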

Regards,

Duncan

The files mentioned in the auto-update email were fixed previously:

$ ls -ail html/includes/common/top-devices.inc.ph
ls: cannot access ‘html/includes/common/top-devices.inc.ph’: No such file or directory
$ ls -ail html/includes/common/top-devices.inc.php
786715 -rw-rw-r-- 1 librenms librenms 16897 Sep 12 00:15 html/includes/common/top-devices.inc.php
$ ls -ail html/includes/common/top-interfaces.inc.php
786773 -rw-rw-r-- 1 librenms librenms 6148 Sep 12 00:15 html/includes/common/top-interfaces.inc.php
$ ls -ail html/includes/common/worldmap.inc.php
788330 -rw-rw-r-- 1 librenms librenms 10120 Dec 1 00:15 html/includes/common/worldmap.inc.php
$ ls -ail includes/definitions.inc.php
787467 -rw-r--r-- 1 librenms librenms 41130 Oct 24 09:37 includes/definitions.inc.php
$

Do I need to do something else to these files?
The server is now auto-updating again. I mention the auto-update problems because they started at the same time as the emails stopped sending.

Regards,

Duncan

The other files mentioned in the warnings don’t exist:

root@netmon00:~# ls -ail /root/.config/git/ignore
ls: cannot access ‘/root/.config/git/ignore’: No such file or directory
root@netmon00:~# ls -ail /root/.config/git/attributes
ls: cannot access ‘/root/.config/git/attributes’: No such file or directory
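
These warnings do not necessarily mean anything is missing. Git looks for optional per-user files under `$HOME/.config/git`, and a common cause of "Permission denied" here is git running as a non-root user (such as librenms) while `HOME` still points at /root. A small sketch to check for that, run as the user that performs the update; `check_git_home` is a hypothetical helper name:

```shell
#!/bin/sh
# check_git_home is a hypothetical helper, not part of git or LibreNMS.
# It reports whether the current user can read and enter the given home
# directory, which git needs in order to look for $HOME/.config/git/*.
check_git_home() {
    home_dir="${1:-$HOME}"
    if [ -d "$home_dir" ] && [ -r "$home_dir" ] && [ -x "$home_dir" ]; then
        echo "ok: $home_dir is accessible to $(id -un)"
    else
        echo "problem: $home_dir is not accessible to $(id -un)"
    fi
}
```

If `check_git_home` reports a problem for the update user, the environment that launches the update (cron, su, sudo) is the place to look, not the files themselves.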

Hi All,

This turned out to be a legacy template issue.

I have now looked at the templates. Three of them were legacy: two could be converted and then updated, but one threw an error when it was converted and updated.

We have now deleted all the templates and they have re-created themselves.

We are now receiving alerts via email and Teams again.

Thanks for your input.

Regards,

Duncan