rrdcached install error with CentOS 8

I’ve completed a fresh install of LibreNMS on CentOS 8 using nginx. The base installation is working fine, as shown below:

[[email protected] librenms]# ./validate.php
====================================
Component | Version
--------- | -------
LibreNMS  | 1.65-58-g35488d89b
DB Schema | 2020_06_23_00522_alter_availability_perc_column (171)
PHP       | 7.2.24
Python    | 3.6.8
MySQL     | 10.3.17-MariaDB
RRDTool   | 1.7.0
SNMP      | NET-SNMP 5.8
====================================

[OK]    Composer Version: 1.10.9
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database schema correct

As there aren’t any specific installation instructions for rrdcached on CentOS 8, I followed those available for CentOS 7. I created /etc/systemd/system/rrdcached.service as specified, but when I try to start the service I get the following error:

[[email protected] librenms]# systemctl enable --now rrdcached.service
Job for rrdcached.service failed because the control process exited with error code.
See "systemctl status rrdcached.service" and "journalctl -xe" for details.

systemctl status gives the following detail:

[[email protected] librenms]# systemctl status rrdcached.service
● rrdcached.service - Data caching daemon for rrdtool
   Loaded: loaded (/etc/systemd/system/rrdcached.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2020-07-28 11:29:03 BST; 3s ago
  Process: 2357 ExecStart=/usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l unix:/run/rrdcached.sock -t 4 -F -b /opt/librenms/rrd/ (code=exited, status=3)

Jul 28 11:29:03 localhost.localdomain systemd[1]: Starting Data caching daemon for rrdtool...
Jul 28 11:29:03 localhost.localdomain rrdcached[2357]: Failed to create base directory '/opt/librenms/rrd/': Permission denied
Jul 28 11:29:03 localhost.localdomain systemd[1]: rrdcached.service: Control process exited, code=exited status=3
Jul 28 11:29:03 localhost.localdomain systemd[1]: rrdcached.service: Failed with result 'exit-code'.
Jul 28 11:29:03 localhost.localdomain systemd[1]: Failed to start Data caching daemon for rrdtool.

This error doesn’t make a lot of sense to me, as rrdcached is set to run as the librenms user in the librenms group and the /opt/librenms/rrd directory is owned by librenms/librenms:

[[email protected] librenms]# ls -lh /opt/librenms | grep rrd
drwxrwxr-x+   3 librenms librenms   47 Jul 28 11:10 rrd

I’ve not done a lot with CentOS 8, so I may well be missing something obvious, but does anyone have any pointers on how I can resolve this issue?

Thanks!

This is very strange. I literally installed CentOS 8 today and configured rrdcached on it, and mine is working. The only difference seems to be that I configured mine to listen on an IP address (for a distributed setup) rather than a unix socket. Other than that, they are identical.

If you run journalctl -xe do you perhaps get a bit more info? Maybe a different error message?

Content of my /etc/systemd/system/rrdcached.service

[Unit]
Description=Data caching daemon for rrdtool
After=network.service

[Service]
Type=forking
PIDFile=/run/rrdcached.pid
ExecStart=/usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l IP-ADDRESS:42217 -t 4 -F -b /opt/librenms/rrd/

[Install]
WantedBy=default.target

@matthewb I have exactly the same issue building a CentOS 8 VM but haven’t had time to check what is happening, so please, if you find a fix, don’t forget to post it here!

Hi Hans,

Thanks very much for the response. It gives me a useful data point to work on.

journalctl -xe doesn’t really add much detail unfortunately:

Jul 29 09:11:40 localhost.localdomain systemd[1]: Starting Data caching daemon for rrdtool...
-- Subject: Unit rrdcached.service has begun start-up
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit rrdcached.service has begun starting up.
Jul 29 09:11:40 localhost.localdomain rrdcached[2572]: Failed to create base directory '/opt/librenms/rrd/': Permission denied
Jul 29 09:11:40 localhost.localdomain systemd[1]: rrdcached.service: Control process exited, code=exited status=3
Jul 29 09:11:40 localhost.localdomain systemd[1]: rrdcached.service: Failed with result 'exit-code'.
Jul 29 09:11:40 localhost.localdomain systemd[1]: Failed to start Data caching daemon for rrdtool.
-- Subject: Unit rrdcached.service has failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit rrdcached.service has failed.
--
-- The result is failed.

On your suggestion I tried running it with it listening to an IP address rather than a unix socket but got the same error. I also tried deleting /opt/librenms/rrd and then running it but, again, same error.

After some more poking around, though, I think I might be getting closer to the issue. I think it might be tripping over SELinux:

[[email protected] rrd]# ausearch -m AVC,USER_AVC,SELINUX_ERR,USER_SELINUX_ERR -ts recent
----
time->Wed Jul 29 09:33:28 2020
type=AVC msg=audit(1596011608.796:876): avc:  denied  { dac_override } for  pid=3902 comm="rrdcached" capability=1  scontext=system_u:system_r:rrdcached_t:s0 tcontext=system_u:system_r:rrdcached_t:s0 tclass=capability permissive=0
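For anyone reading the denial above, a minimal sketch of pulling out the three fields that matter most: the denied permission, the source domain, and the object class. The variable names are illustrative only. Here dac_override is the Linux capability that lets a root process bypass ordinary file permission checks, and tclass=capability shows the denial is about a capability of the rrdcached_t process rather than the label on a file.

```shell
# AVC record copied verbatim from the ausearch output above.
avc='type=AVC msg=audit(1596011608.796:876): avc:  denied  { dac_override } for  pid=3902 comm="rrdcached" capability=1  scontext=system_u:system_r:rrdcached_t:s0 tcontext=system_u:system_r:rrdcached_t:s0 tclass=capability permissive=0'

# Denied permission: the text between "{ " and " }".
perm=$(printf '%s\n' "$avc" | sed -n 's/.*{ \([^}]*\) }.*/\1/p')

# Source domain: the third colon-separated field of scontext.
domain=$(printf '%s\n' "$avc" | sed -n 's/.* scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')

# Object class: the value of tclass.
class=$(printf '%s\n' "$avc" | sed -n 's/.* tclass=\([^ ]*\).*/\1/p')

echo "domain=$domain class=$class denied=$perm"
```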

I’m going to work on this a bit more and I’ll post whatever I find, but a question if I may: when you did your CentOS 8 install, did you configure SELinux as described in the LibreNMS documentation?

Thanks again!
Matthew.

What does ls -lh /opt | grep librenms show? This could be where your permissions issue is.

Sorry, I forgot to mention SELinux is disabled on my side (I will keep it in mind in future). So I think you might be correct on the SELinux route.

These are the permissions for /opt/librenms. It looks ok to me but was there something specific you had in mind?

[[email protected] admin]# ls -lh /opt | grep librenms
drwxrwx--x. 27 librenms librenms 4.0K Jul 29 15:21 librenms

It does indeed look like it’s SELinux that’s triggering the issue. I found this page about rrdcached and SELinux, so I tried the obvious:

semanage fcontext -a -t rrdcached_exec_t '/opt/librenms/rrd/(/.*)?'

but that didn’t help. Setting rrdcached to permissive, though, did stop the error from happening and rrdcached can now start:

[[email protected] admin]# semanage permissive -a rrdcached_t
[[email protected] admin]# systemctl enable --now rrdcached.service
[[email protected] admin]# systemctl status rrdcached.service
● rrdcached.service - Data caching daemon for rrdtool
   Loaded: loaded (/etc/systemd/system/rrdcached.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-07-29 15:26:40 BST; 5min ago
  Process: 2199 ExecStart=/usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l unix:/run/rrdcached.sock -t 4 -F -b /opt/lib>
 Main PID: 2200 (rrdcached)
    Tasks: 7 (limit: 11058)
   Memory: 1.5M
   CGroup: /system.slice/rrdcached.service
           └─2200 /usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l unix:/run/rrdcached.sock -t 4 -F -b /opt/librenms/rr>

Jul 29 15:26:40 localhost.localdomain systemd[1]: Starting Data caching daemon for rrdtool...
Jul 29 15:26:40 localhost.localdomain systemd[1]: Started Data caching daemon for rrdtool.

Obviously, setting it to permissive is a kludge rather than a proper fix, but I’m not confident enough with SELinux to know the right way to tackle this properly.

Sometimes it is missing x (the execute bit).
Why is it trying to create the directory when it already exists? Are these both on the same system?

Must be a new security context in CentOS 8.

Try changing the label on the rrd directory.
https://www.mankier.com/8/rrdcached_selinux

Hi,

Thanks. That’s the page I linked to earlier. I tried a couple of things that seemed obvious but they didn’t work. I haven’t done much with SELinux, so I think I’m missing something important. Setting rrdcached_t to permissive allowed it to start. I know that’s not a proper long-term fix but it’s good enough for now.

What you might want to do is install setroubleshoot-server and then run sudo sealert -a /var/log/audit/audit.log. That helps you find the denials and suggests what you need to do to allow them.
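sealert typically suggests building a small local policy module with audit2allow. A minimal sketch of that route follows, assuming the policycoreutils Python utilities (which provide audit2allow and semanage) are installed and the commands are run as root. The module name local_rrdcached and both function names are made up for illustration, and this is not a confirmed fix from this thread. Because rrdcached_t is currently permissive, every denial is logged rather than only the first one, so it helps to start the service and let it run for a while before building the module.

```shell
# Hypothetical module name (an assumption, not from the thread).
MODULE=local_rrdcached

# Build a local policy module from the rrdcached denials in the audit log.
# Writes $MODULE.te (human-readable rules) and $MODULE.pp (compiled module)
# into the current directory.
build_rrdcached_module() {
    ausearch -m AVC -c rrdcached --raw | audit2allow -M "$MODULE"
}

# Load the compiled module, then take rrdcached_t out of permissive mode
# so the domain is enforcing again.
load_rrdcached_module() {
    semodule -i "$MODULE.pp" &&
    semanage permissive -d rrdcached_t
}
```

Run build_rrdcached_module first and read local_rrdcached.te before loading anything, since audit2allow simply allows whatever was denied. If the rules look reasonable, run load_rrdcached_module and restart rrdcached.service to confirm it still starts with the domain enforcing.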

Thanks for the tip. I’ll give that a go and post the results in the next couple of days.

Hi @matthewb

Did you find the solution?