Messages from rsyslog aren’t making it into LibreNMS

I have two LibreNMS systems set up pretty much identically. On the working system, syslogs are pulled into the database and are viewable in LibreNMS. On the non-working system, rsyslog is receiving the syslogs (they are saved both to a local file and sent to LibreNMS, per the configuration), but nothing shows up in LibreNMS.

Each system has a remote probe that does remote polling and syslog collection, and a main server with the database and everything else on it. I use rsyslog to forward the syslogs from the probe to the main server, and I followed the instructions to set up the 30-librenms.conf file in /etc/rsyslog.d/. One system is working, one is not. I can’t find any differences between them except that one monitors almost entirely private (RFC1918) addresses and the other monitors public addresses. Devices are all MikroTik and Ubiquiti in both systems, though slightly different models (I can’t imagine that would make a difference).

On the non-working system, I noticed at first that I was using the %hostname% variable in my conf file. Since these IPs have a reverse lookup zone set up, they were resolving to names that I didn’t want. Most of my devices are added via IP since I don’t have DNS set up for most of them. So I added this statement to disable DNS lookups in rsyslog: global(net.enableDNS="off"). Once I did that, the folders created had the IPs instead of the reverse DNS names. Win? Nope.
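For anyone comparing the two systems, here is a minimal Python sketch for checking what name a reverse lookup returns for a device address, which is roughly the kind of name that replaced the IP before DNS lookups were turned off. `reverse_name` is a name made up for this sketch, and the `resolver` parameter exists only so the function can be tried without real DNS. It does not reproduce rsyslog's own resolution logic.

```python
import socket


def reverse_name(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS (PTR) name for ip, or ip itself if none exists.

    resolver is a hypothetical hook for this sketch; by default the
    standard library's socket.gethostbyaddr performs the lookup.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        return resolver(ip)[0]
    except (socket.herror, socket.gaierror):
        # No PTR record (or lookup failure): fall back to the bare address
        return ip
```

If the name returned here is not the name (or IP) the device was added under in LibreNMS, that mismatch is worth ruling out before looking elsewhere.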

On both the working and non-working systems I get these errors in rsyslog, and even with debugging on in rsyslog I don’t see any obvious errors around these terminate/resume messages.

rsyslogd[2277531]: omprog: program '/opt/librenms/syslog.php' (pid 2277552) terminated; will be restarted [v8.2001.0 try https://www.rsyslog.com/e/2119 ]
rsyslogd[2277531]: action 'action-3-omprog' suspended (module 'omprog'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
rsyslogd[2277531]: action 'action-3-omprog' resumed (module 'omprog') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
rsyslogd[2277531]: child process (pid 2277574) exited with status 255 [v8.2001.0]
rsyslogd[2277531]: omprog: program '/opt/librenms/syslog.php' (pid 2277574) terminated; will be restarted [v8.2001.0 try https://www.rsyslog.com/e/2119 ]
rsyslogd[2277531]: action 'action-3-omprog' suspended (module 'omprog'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
rsyslogd[2277531]: action 'action-3-omprog' resumed (module 'omprog') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
rsyslogd[2277531]: child process (pid 2277575) exited with status 255 [v8.2001.0]
rsyslogd[2277531]: omprog: program '/opt/librenms/syslog.php' (pid 2277575) terminated; will be restarted [v8.2001.0 try https://www.rsyslog.com/e/2119 ]
rsyslogd[2277531]: action 'action-3-omprog' suspended (module 'omprog'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]

Does anyone have any ideas on why this might not be working? It’s still not 100% clear to me how LibreNMS matches logs to devices, whether via IP or sysname or whatever, but I’m not seeing any logs at all, even when looking under Overview > Syslog. I’ve even tried setting the date way back in case something was goofed up there, but unless it’s coming in as pre-1900 that’s not it either. I’m kind of lost and unsure where to look further.

Thanks in advance for any help anyone might be able to offer.

And just to appease the requirements, here’s my validate output.
Non-working server:

====================================
Component | Version
--------- | -------
LibreNMS  | 21.12.1
DB Schema | 2021_11_29_165436_improve_ports_search_index (229)
PHP       | 7.4.3
Python    | 3.8.10
MySQL     | 10.3.32-MariaDB-0ubuntu0.20.04.1-log
RRDTool   | 1.7.2
SNMP      | 5.8
====================================

[OK]    Composer Version: 2.2.4
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct
[INFO]  Detected Python Wrapper
[OK]    Connection to memcached is ok

Non-working probe:

====================================
Component | Version
--------- | -------
LibreNMS  | 21.12.1
DB Schema | 2021_11_29_165436_improve_ports_search_index (229)
PHP       | 7.4.3
Python    | 3.8.10
MySQL     | 10.3.32-MariaDB-0ubuntu0.20.04.1-log
RRDTool   | 1.7.2
SNMP      | 5.8
====================================

[OK]    Composer Version: 2.2.4
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct
[INFO]  Detected Python Wrapper
[OK]    Connection to memcached is ok

Working server:

====================================
Component | Version
--------- | -------
LibreNMS  | 21.12.1
DB Schema | 2021_11_29_165436_improve_ports_search_index (229)
PHP       | 7.4.3
Python    | 3.8.10
MySQL     | 10.3.32-MariaDB-0ubuntu0.20.04.1-log
RRDTool   | 1.7.2
SNMP      | 5.8
====================================

[OK]    Composer Version: 2.2.4
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct
[INFO]  Detected Python Wrapper
[OK]    Connection to memcached is ok

Working probe:

====================================
Component | Version
--------- | -------
LibreNMS  | 21.12.1
DB Schema | 2021_11_29_165436_improve_ports_search_index (229)
PHP       | 7.4.3
Python    | 3.8.10
MySQL     | 10.3.32-MariaDB-0ubuntu0.20.04.1-log
RRDTool   | 1.7.2
SNMP      | 5.8
====================================

[OK]    Composer Version: 2.2.4
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct
[INFO]  Detected Python Wrapper
[OK]    Connection to memcached is ok

Bump anyone?

Uncomment the logfile() line in syslog.php and see what gets logged first.

So I did this. Should it be putting logs into /opt/librenms/logs/librenms.log or some other place? I don’t see a syslog.log in /opt/librenms/logs or anything.

Should be in /opt/librenms/logs/librenms.log

If it’s not appearing in there then the request isn’t making it to our processing script - you’ll need to debug from the rsyslog end.
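One way to take rsyslog out of the picture is to hand the script a single line yourself. The sketch below is Python and rests on assumptions: that syslog.php reads one "||"-separated line per message from standard input (omprog delivers messages to its program over stdin), and that the fields follow the order of the default LibreNMS rsyslog template quoted later in this thread. `build_test_line` and `feed` are names invented for the sketch, and the facility, priority, tag, and program values are made up.

```python
import os
import subprocess
from datetime import datetime

# Path taken from the rsyslog messages earlier in this thread
SYSLOG_PHP = "/opt/librenms/syslog.php"


def build_test_line(host, msg="manual test message"):
    """Build one line in the field order of the librenms rsyslog template."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # host, facility, priority, severity, tag, timestamp, msg, program
    fields = [host, "1", "5", "5", "manualtest:", now, msg, "manualtest"]
    return "||".join(fields) + "\n"


def feed(line, script=SYSLOG_PHP):
    """Pipe one line to the script's stdin.

    Returns (exit_code, stdout, stderr), or None if the script is not
    present and executable on this host.
    """
    if not (os.path.isfile(script) and os.access(script, os.X_OK)):
        return None
    proc = subprocess.run(
        [script], input=line, capture_output=True, text=True, timeout=30
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; use the name or IP the device was added under
    result = feed(build_test_line("192.0.2.10"))
    print(result if result is not None else f"{SYSLOG_PHP} not usable on this host")
```

Run it as the same user rsyslog uses to start the script. A non-zero exit code or anything on stderr here would be a concrete lead on the "exited with status 255" messages.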

This may sound silly or obvious, but have you tried restarting the rsyslog service (not just sending it a refresh signal)?

I have had a couple of instances where after adding or removing devices in LibreNMS I do not get any syslogging for those devices showing up in LibreNMS but it does show up in the rsyslog log file.

For me, restarting the rsyslog systemd service fixed the issue. I think the underlying cause is that rsyslog spawns a persistently running copy of /opt/librenms/syslog.php to pipe data to. For whatever reason, this syslog.php script is not always aware of changes to devices in the LibreNMS database, which can result in some devices not being logged until syslog.php is restarted by restarting rsyslogd.

As for how LibreNMS matches up syslog entries to LibreNMS devices - I think it just goes by the IP address.

I’m just using the default rsyslog config:

# Feed syslog messages to librenms
module(load="omprog")

template(name="librenms"
         type="string"
         string="%fromhost%||%syslogfacility%||%syslogpriority%||%syslogseverity%||%syslogtag%||%$year%-%$month%-%$day% %timegenerated:8:25%||%msg%||%programname%\n")

action(type="omprog"
       binary="/opt/librenms/syslog.php"
       template="librenms")

& stop
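To make it easier to see what the template above actually hands to the script, here is a small Python sketch that splits one such line back into its eight fields. The field labels are chosen for this sketch from the rsyslog property names and are not claimed to be the names LibreNMS uses internally. `parse_librenms_line` is likewise a made-up helper.

```python
# Labels follow the order of the properties in the rsyslog template
FIELDS = [
    "host", "facility", "priority", "severity",
    "tag", "timestamp", "msg", "program",
]


def parse_librenms_line(line):
    """Split one '||'-separated template line into a dict of fields.

    The first six fields are split from the left and the program name
    from the right, so a '||' inside the message text stays in 'msg'.
    """
    head = line.rstrip("\n").split("||", 6)
    if len(head) != 7:
        raise ValueError("expected 8 '||'-separated fields")
    tail = head[6].rsplit("||", 1)
    if len(tail) != 2:
        raise ValueError("expected 8 '||'-separated fields")
    return dict(zip(FIELDS, head[:6] + tail))
```

Pointing this at lines written with the same template to a local file shows exactly which value lands in the host field, which is the piece that changes when %fromhost% is swapped for something else.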

Yeah I think the only thing I tweaked from the standard one was using %fromip% instead of %fromhost%. I’ve restarted the service several times and the whole box has been rebooted as well. I have 2 other completely unrelated installs where I have the exact same setup and it’s working there, which adds to the mystery.

Does it work if you change it back to the default like I have above?

If LibreNMS can’t match the incoming syslog messages to a device in its database, the messages will be discarded, so the change you’ve made could potentially have an effect.

Also, are your syslog messages coming directly from the devices in question, or are they being forwarded via another syslog distribution server first?

Pretty sure I had tried that but I might have to go back and try again. It’s been too long. I’ve just been having to grep through stuff manually.