Dispatcher Alerts Not Working

Probably missing something simple, but I’m stumped at the moment.
Alerts show as triggered, but emails are not sent when using the dispatcher service. They do work with Legacy cron.
Setup consists of 2 servers in distributed poller mode. The first server handles Web/Poller/MySQL/RRDCached/Oxidized/Weathermap/Nagios/Redis/Smokeping; the second server is configured as poller/nagios.
All servers were migrated from Debian 10 to Ubuntu 24.04.1 LTS. MySQL DB and RRD files were copied over following the LibreNMS docs.
Migrated config.php options to config:set/web gui.

Scheduled tasks set to Dispatcher Service in Web Gui.

Added following to config.php file on both servers as indicated via Web Gui>Settings>Scheduled Task>Help Balloon (not sure if needed on distributed poller but added in case):
$config['service_alerting_enabled'] = true;
$config['service_billing_enabled'] = true;
$config['service_discovery_enabled'] = true;
$config['service_ping_enabled'] = true;
$config['service_services_enabled'] = true;
$config['service_watchdog_enabled'] = true;

Cron lines have been commented out.

Went back through Dispatcher docs and double checked everything.
Dispatcher Service (RC) - LibreNMS Docs

Did notice I receive an error when running the command:
pip3 install -r requirements.txt

image

Could this be the cause?

Validate output of Web server. Disregard smokeping errors.

Component Version
LibreNMS 24.11.0-39-g92a822d27 (2024-12-03T04:55:49-06:00)
DB Schema 2024_11_22_135845_alert_log_refactor_indexes (308)
PHP 8.3.14
Python 3.12.3
Database MariaDB 10.11.8-MariaDB-0ubuntu0.24.04.1
RRDTool 1.7.2
SNMP 5.9.4.pre2
===========================================

[OK] Composer Version: 2.8.3
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database connection successful
[OK] Database Schema is current
[OK] SQL Server meets minimum requirements
[OK] lower_case_table_names is enabled
[OK] MySQL engine is optimal
[OK] Database and column collations are correct
[OK] Database schema correct
[OK] MySQL and PHP time match
[OK] Distributed Polling setting is enabled globally
[OK] Connected to rrdcached
[OK] Active pollers found
[OK] Dispatcher Service is enabled
[OK] Locks are functional
[OK] Python wrapper cron entry is not present
[OK] Redis is functional
[OK] rrdtool version ok
[OK] Connected to rrdcached
[FAIL] We have found some files that are owned by a different user than ‘librenms’, this will stop you updating automatically and / or rrd files being updated causing graphs to fail.
[FIX]:
sudo chown -R librenms:librenms /opt/librenms
sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
sudo chmod -R ug=rwX /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
Files:
/opt/librenms/rrd/smokeping/__cgi
/opt/librenms/rrd/smokeping/power
/opt/librenms/rrd/smokeping/power/Sikeston-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Matthews-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Miner-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Morehouse-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Advance-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Delta-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Anniston-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Bloomfield-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Tilsit-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Vanduser-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Ardeola-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Idalia-Rectifier.rrd
/opt/librenms/rrd/smokeping/power/Lilbourne-Rectifier.rrd
and 230 more…

Validate output Distributed Poller:

Component Version
LibreNMS 24.11.0-39-g92a822d27 (2024-12-03T04:55:49-06:00)
DB Schema 2024_11_22_135845_alert_log_refactor_indexes (308)
PHP 8.3.6
Python 3.12.3
Database MariaDB 10.11.8-MariaDB-0ubuntu0.24.04.1
RRDTool 1.7.2
SNMP 5.9.4.pre2
===========================================

[OK] Composer Version: 2.8.3
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database connection successful
[OK] Database Schema is current
[OK] SQL Server meets minimum requirements
[OK] lower_case_table_names is enabled
[OK] MySQL engine is optimal
[OK] Database and column collations are correct
[OK] Database schema correct
[OK] MySQL and PHP time match
[OK] Distributed Polling setting is enabled globally
[OK] Connected to rrdcached
[OK] Active pollers found
[OK] Dispatcher Service is enabled
[OK] Locks are functional
[OK] Python wrapper cron entry is not present
[OK] Redis is functional
[OK] rrdtool version ok
[OK] Connected to rrdcached

I misunderstood the Help Balloons.
Ended up swapping to Legacy in the Web Gui while leaving the config.php options enabled:

$config['service_alerting_enabled'] = true;
$config['service_billing_enabled'] = true;
$config['service_discovery_enabled'] = true;
$config['service_ping_enabled'] = true;
$config['service_services_enabled'] = true;
$config['service_watchdog_enabled'] = true;

Everything is working now.
Kind of interesting that selecting Dispatcher doesn’t trigger email alerts.

Check the librenms.log to see if anything is being recorded there.

@laf not seeing anything, but it looks to be info level only. Currently trying to find out how to raise logging to debug level.

Edit: Found debugging. Waiting for next poll to go through to see what logs say.

Believe I might have found the issue. Assuming this means the global setting is set to Disabled?

Dec 9 10:57:05 nms librenms-service.py[804711]: Poller_0-23(DEBUG):Disable alerting is set, Clearing active alerts and skipping alert rules check

Which is interesting, because the Web UI is set to enabled.

Took a step backwards.
Commented out all of the config.php lines for all of the services on both servers.

#$config['service_<INSERT SERVICE NAME>_enabled'] = true;

Double checked all the cron.d/librenms jobs are commented out on both servers. 
#33   */6  * * *   librenms    /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
#*/5  *    * * *   librenms    /opt/librenms/discovery.php -h new >> /dev/null 2>&1

#*/5  *    * * *   librenms    /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
#*    *    * * *   librenms    /opt/librenms/alerts.php >> /dev/null 2>&1

#*/5  *    * * *   librenms    /opt/librenms/poll-billing.php >> /dev/null 2>&1
#01   *    * * *   librenms    /opt/librenms/billing-calculate.php >> /dev/null 2>&1
#*/5  *    * * *   librenms    /opt/librenms/check-services.php >> /dev/null 2>&1

# Daily maintenance script. DO NOT DISABLE!
# If you want to modify updates:
#  Switch to monthly stable release: https://docs.librenms.org/General/Releases/
#  Disable updates: https://docs.librenms.org/General/Updating/
#19   0    * * *   librenms    /opt/librenms/daily.sh >> /dev/null 2>&1


# Nagios Plugins
#*/5 * * * * librenms /opt/librenms/services-wrapper.py 1


# Weathermap Plugin
#*/5 * * * * librenms /opt/librenms/html/plugins/Weathermap/map-poller.php >> /dev/null 2>&1

When scheduled tasks are set to use Dispatcher service I receive the following error:
image

When changed to Legacy (Unrestricted) scheduled tasks, everything works fine.

librenms@nms:/root$ /opt/librenms/check-services.php -d
DEBUG!
Starting service polling run:

SQL[SELECT D.*,S.*,attrib_value  FROM `devices` AS D INNER JOIN `services` AS S ON S.device_id = D.device_id AND D.disabled = 0  LEFT JOIN `devices_attribs` as A ON D.device_id = A.device_id AND A.attrib_type = "override_icmp_disable" ORDER by D.device_id DESC; [] 1.83ms]

(REMOVED FOR BREVITY)

/opt/librenms/check-services.php 2024-12-09 16:22:18 - 14 services polled in 12.57 secs

That message is for the specific device that is being polled at that point. Check it, as it has probably got alerting disabled.

Ok thanks for the clarification. There’s a few devices that have alerting disabled.
Still unsure what is going on with this whole dispatcher alerting service issue.
Going to leave scheduled tasks set to Legacy and keep the dispatcher config.php settings enabled until I can figure out what is going on.

We are having the same problem. With alerts run via the dispatcher service, email transports are not working, but alerts are shown in the web GUI. If I manually run “./alerts.php -d -f” on the CLI, emails send successfully.

Ok, hoping to catch something, I tried to get debug logs from “alerts.php”, so I modified “queuemanager.py” and added a fixed “-d” parameter to the args:

    def do_work(self, device_id, group):
        logger.info("Checking alerts")
        args = ("-d", "-f") if self.config.debug else ("-d", "-f")

And restarted the librenms service. After the first run of the alert dispatcher service, I got the email notification and then the recovery notification as expected. So what does the “-d” parameter do? I thought it was just for debugging.

After further testing I think I found the culprit. It is not about the “-d” parameter but about passing one argument versus two. In Python, a single-element tuple needs a trailing “,”; without it, ("-f") is just a string in parentheses, not a tuple. Alert transports work with the dispatcher service if I apply this change:

diff --git a/LibreNMS/queuemanager.py b/LibreNMS/queuemanager.py
index 5636f7b6d..73b3dcc95 100644
--- a/LibreNMS/queuemanager.py
+++ b/LibreNMS/queuemanager.py
@@ -506,7 +506,7 @@ class AlertQueueManager(TimedQueueManager):

     def do_work(self, device_id, group):
         logger.info("Checking alerts")
-        args = ("-d", "-f") if self.config.debug else ("-f")
+        args = ("-d", "-f") if self.config.debug else ("-f",)
         exit_code, output = LibreNMS.call_script("alerts.php", args)

         if self.config.log_output:
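
For anyone wondering why the trailing comma matters, here is a minimal sketch. It assumes the caller expands the arguments by iterating over them; `build_command` is a hypothetical stand-in written for illustration, not the actual `LibreNMS.call_script` code, which I have not reproduced here.

```python
def build_command(script, args=()):
    """Hypothetical helper: join a script name and its arguments into one command tuple."""
    # tuple(args) iterates over args. A bare string is iterated character by character.
    return ("php", script) + tuple(args)


debug = False

# Original expression: ("-f") is just the string "-f"; parentheses alone do not make a tuple.
broken_args = ("-d", "-f") if debug else ("-f")

# Fixed expression: the trailing comma creates a one-element tuple.
fixed_args = ("-d", "-f") if debug else ("-f",)

print(type(broken_args).__name__, build_command("alerts.php", broken_args))
print(type(fixed_args).__name__, build_command("alerts.php", fixed_args))
```

Under that assumption, the unpatched version ends up passing "-" and "f" as two separate arguments instead of "-f". It would also explain why forcing ("-d", "-f") in both branches worked: both sides were real tuples.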

Is this a valid fix? Should I send a patch request?

EDIT: Ok, created a pull request: Update queuemanager.py: Single element args tuple breaks alerts.php running by r-duran · Pull Request #16873 · librenms/librenms · GitHub

Thank you so much for figuring this out. I was fighting this same issue for way too long trying to get alerts set up and working.
It was super difficult to find since test alerts and manually triggered alerts worked fine, but automatic ones just… didn’t send. None of the logs I could find had anything that would explain the issue.
And that was after fighting several different issues trying to get a Power Automate to Teams webhook set up. I was about to give up on alerts in LibreNMS.

I am running it in docker with the dispatcher sidecar.

Thanks @Rahman_Duran for digging into this and finding the issue!
Looks like this change has been approved and merged.
