Multiple errors in ./validate.php

Hi @kalamchi75

I continue to see these errors when I check the status.

The other thing is that I am not able to get rrdcached to start listening on port 42217. I have tried to configure it as (hostname/ip):42217 in the config, but that doesn’t help. Moreover, it breaks my graphs with an error message saying that it is not able to connect to rrdcached. :frowning:

I don’t know how to go further from here. I double-checked the config output and now I see that /opt/librenms/rrd is mentioned. As for /var/tmp… I still don’t know why that is coming up, and I have not changed anything to make it come up that way. I tried an install on my raspberrypi device and it comes up the same.

Thanks,
Santosh Kotla

@kalamchi75

The rrdcached file config for your reference:

root@dc5up-vlibrenms01:~# cat /etc/default/rrdcached
DAEMON=/usr/bin/rrdcached
DAEMON_USER=librenms
DAEMON_GROUP=librenms
WRITE_THREADS=4
WRITE_TIMEOUT=1800
WRITE_JITTER=1800
BASE_PATH=/opt/librenms/rrd/
JOURNAL_PATH=/var/lib/rrdcached/journal/
PIDFILE=/run/rrdcached.pid
SOCKFILE=/run/rrdcached.sock
SOCKGROUP=librenms
BASE_OPTIONS="-B -F -R"

BASE_OPTIONS="-l 0:42217"
BASE_OPTIONS="$BASE_OPTIONS -R -j /var/lib/rrdcached/journal/ -F"
BASE_OPTIONS="$BASE_OPTIONS -b /opt/librenms/rrd -B"
BASE_OPTIONS="$BASE_OPTIONS -w 1800 -z 900"

Thanks,
Santosh Kotla

OK.
Are we sure your rrd files are in /opt/librenms/rrd?
Give full R/W permission to /var/tmp, just in case.

On your RP, did you use Ubuntu or CentOS?
From the poller, do you still see port 42217 closed?

Hi @kalamchi75

Please find my answers inline:
We are sure your rrd files are in /opt/librenms/rrd? - Yes Sir.
give full R/W permission to /var/tmp just in case - Should I change the ownership on them to librenms:librenms?

On your RP, did you use Ubuntu or CentOS ? - CentOS-8
from the poller, do you still see port 42217 closed ? - Yes, when I run nmap specifying the central server, I see both the RP and the central server itself with 42217 closed.

Thanks,
Santosh Kotla

Hi Santosh,

Have you tried using remote port instead of sock?
This is what I have.
Remote poller:

$config['rrdtool'] = '/database/rrdtool-1.7.2/bin/rrdtool';
$config['rrdtool_version'] = '1.7.2';
$config['rrd_dir'] = '/database/rrdtool-1.7.2/libredata';
$config['rrdcached'] = 'rrdserver:42217';

RRDserver:

cat /etc/systemd/system/rrdcached.service
[Unit]
Description=Data caching daemon for rrdtool
After=network.service

[Service]
Type=forking
PIDFile=/database/rrdtool-1.7.2/var/run/rrdcached.pid
ExecStart=/database/rrdtool-1.7.2/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l rrdserver:42217 -t 4 -F -b /database/rrdtool-1.7.2/libredata

[Install]
WantedBy=default.target

Hi @mzacchi

Where can I find this database folder containing the rrdtool folder? I tried to look into my database folder and I don’t see it in there.

root@dc5up-vlibrenms01:~# ls /opt/librenms/database/
factories migrations schema seeders seeds

Thanks,
Santosh Kotla

Hi @kalamchi75

I tried the option of changing the ownership of /var/tmp to librenms but that seems to have induced another error:

root@dc5up-vlibrenms01:~# systemctl status rrdcached
● rrdcached.service - Data caching daemon for rrdtool
Loaded: loaded (/etc/systemd/system/rrdcached.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2021-09-20 13:28:59 UTC; 51s ago
Process: 22040 ExecStart=/usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j
Main PID: 22047 (rrdcached)
Tasks: 337 (limit: 4915)
CGroup: /system.slice/rrdcached.service
└─22047 /usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp

Sep 20 13:28:59 dc5up-vlibrenms01 systemd[1]: Starting Data caching daemon for rrdtool…
Sep 20 13:28:59 dc5up-vlibrenms01 systemd[1]: rrdcached.service: Failed to parse PID from file /run/rrdcached.pid:
Sep 20 13:28:59 dc5up-vlibrenms01 systemd[1]: Started Data caching daemon for rrdtool.
Sep 20 13:29:40 dc5up-vlibrenms01 rrdcached[22047]: handle_request_update: Could not read RRD file.
Sep 20 13:29:40 dc5up-vlibrenms01 rrdcached[22047]: handle_request_update: Could not read RRD file.

So I rolled back changes and hit a service restart.

Thanks,
Santosh Kotla

Hi Santosh,

In my case, the rrd files are all under the /opt/librenms/rrd directory.
Do you have the same directory structure?
Please check where your rrd files are located.

Hi @kalamchi75

Yes. I have the same directory structure:

Thanks,
Santosh Kotla

Hi Santosh,

This is a custom directory, it is not the standard one.
In $config['rrdtool'] you should define the location of your rrdtool binary, and in $config['rrd_dir'] the location of your RRD files.
In your case, that is /opt/librenms/rrd.
But the really big difference is the parameter in the service file on the central host and in the config.php of the remote poller, where you assign a port for the RRDCached connection.

On the poller:

$config['rrdcached'] = 'rrdserver:42217';

On the central server (adjust the paths to fit your deployment):

[Service]
Type=forking
PIDFile=/database/rrdtool-1.7.2/var/run/rrdcached.pid
ExecStart=/database/rrdtool-1.7.2/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l rrdserver:42217 -t 4 -F -b /database/rrdtool-1.7.2/libredata
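If it helps to double-check that the two sides agree, here is a rough Python sketch (not something from the LibreNMS docs) that pulls the -l and -b values out of an ExecStart line and compares them with what the poller's config.php says. The function names and the list of value-taking flags are my own assumptions, based only on the flags used in this thread, and it only remembers the last -l if several are given.

```python
import shlex

# Flags from the ExecStart line in this thread that are followed by a value.
# Assumption: any other token (-B, -R, -F) is treated as a plain switch.
VALUE_FLAGS = {"-w", "-z", "-f", "-s", "-U", "-G", "-j", "-l", "-t", "-b"}


def parse_rrdcached_args(exec_start):
    """Return a {flag: value} dict for an rrdcached command line.

    Switches without a value map to True. If a flag is repeated,
    only the last occurrence is kept.
    """
    tokens = shlex.split(exec_start)[1:]  # drop the path to the binary
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in VALUE_FLAGS and i + 1 < len(tokens):
            options[token] = tokens[i + 1]
            i += 2
        else:
            options[token] = True
            i += 1
    return options


def check_against_config(options, rrdcached_setting, rrd_dir):
    """Compare daemon options with the poller's config.php values.

    rrdcached_setting is the 'host:port' string from $config['rrdcached'],
    rrd_dir is the path from $config['rrd_dir'].
    Returns a list of human-readable problems (empty if nothing looks off).
    """
    problems = []

    listen = options.get("-l")
    if listen is None:
        problems.append("no -l option: the daemon is not told to listen on a TCP port")
    elif listen.rsplit(":", 1)[-1] != rrdcached_setting.rsplit(":", 1)[-1]:
        problems.append(f"port mismatch: daemon has {listen}, config has {rrdcached_setting}")

    base = options.get("-b")
    if base is None:
        problems.append("no -b option: the daemon has no base directory set")
    elif base.rstrip("/") != rrd_dir.rstrip("/"):
        problems.append(f"base dir mismatch: daemon has {base}, config has {rrd_dir}")

    return problems
```

You would paste the ExecStart value from your own rrdcached.service into parse_rrdcached_args() and pass the result to check_against_config() together with the 'rrdserver:42217' style string and the rrd_dir from config.php. It only compares the port part of the address, because the hostname on the server side and the name the poller uses can legitimately differ.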

Hi @mzacchi

Thanks for your suggestion Sir. I have tried to implement as you suggested. I went ahead and changed the statement in the rrdcached.service file and I am able to see that the port 42217 is now open on the server side.

Starting Nmap 7.60 ( https://nmap.org ) at 2021-09-21 07:13 UTC
Nmap scan report for DC5UP-vLibreNMS01.marvell.com (10.69.176.104)
Host is up (0.000057s latency).

PORT STATE SERVICE
42217/tcp open unknown

Central Server RRDCACHED.service:
root@dc5up-vlibrenms01:~# systemctl status rrdcached
● rrdcached.service - Data caching daemon for rrdtool
Loaded: loaded (/etc/systemd/system/rrdcached.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-09-21 07:12:13 UTC; 33s ago
Process: 3867 ExecStart=/usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l rrdserverIP:42217 -t 4 -F -b /opt/librenms/rrd/ (code=exited, status=0/SUCCESS)
Main PID: 3871 (rrdcached)
Tasks: 45 (limit: 4915)
CGroup: /system.slice/rrdcached.service
└─3871 /usr/bin/rrdcached -w 1800 -z 1800 -f 3600 -s librenms -U librenms -G librenms -B -R -j /var/tmp -l rrdserverIP:42217 -t 4 -F -b /opt/librenms/rrd/

This is a snippet of my config file:
Server Side:

# Enable the in-built services support (Nagios plugins)

$config['show_services'] = 1;
$config["update_channel"] = "release";
$config['rrdcached'] = "unix:/run/rrdcached.sock";
#$config['rrdcached'] = "localhost:42217";
#$config['rrdcached'] = "centralserverIP:42217";
#$config['rrdtool'] = '/usr/bin/rrdtool';
#$config['rrd_dir'] = '/opt/librenms/rrd';
#$config['rrdtool_version'] = '1.7.0';

#Interface description Customization
$config['custom_descr'][] = "MPLS";
$config['custom_descr'][] = "MAN";
$config['custom_descr'][] = "DIA";

#Syslog requirements
$config['enable_syslog'] = 1;

// Distributed Poller-Settings
$config['distributed_poller'] = true;
// optional: defaults to hostname
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '0';
$config['distributed_poller_memcached_host'] = 'hostname.domain.com';
$config['distributed_poller_memcached_port'] = 11211;

Poller:
#$config['rrdcached'] = "unix:/var/run/rrdcached.sock";
$config['rrdcached'] = "centralserverIP:42217";
$config['rrdtool'] = '/usr/bin/rrdtool';
$config['rrd_dir'] = '/opt/librenms/rrd';
$config['rrdtool_version'] = '1.7.0';

// Distributed Poller-Settings
$config['distributed_poller'] = true;
// optional: defaults to hostname
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '1';
$config['distributed_poller_memcached_host'] = 'centralserverhostname.domain.com';
$config['distributed_poller_memcached_port'] = 11211;

Please let me know if this is in alignment with what you have suggested.

@kalamchi75
Is the nmap output supposed to show what service it is open for?

Thanks,
Santosh Kotla

Hi Santosh,

You have to do some comment/uncomment changes so it looks like this:

On the Remote Poller:
#$config['rrdcached'] = "unix:/run/rrdcached.sock";
#$config['rrdcached'] = "localhost:42217";
$config['rrdcached'] = "centralserverIP:42217";
$config['rrdtool'] = '/usr/bin/rrdtool';
#$config['rrd_dir'] = '/opt/librenms/rrd';
$config['rrdtool_version'] = '1.7.0';

On the Central Server:
#$config['rrdcached'] = "unix:/run/rrdcached.sock";
#$config['rrdcached'] = "localhost:42217";
$config['rrdcached'] = "centralserverIP:42217";
$config['rrdtool'] = '/usr/bin/rrdtool';
$config['rrd_dir'] = '/opt/librenms/rrd';
$config['rrdtool_version'] = '1.7.0';

@mzacchi

Done Sir. My graphs are back after I uncommented the IP:42217 config line… Looks like the central server is good with the IP now.

How about the rrdcached.service file for the poller? Shall I edit it so that it points to centralserverIP:42217 as well?

Thanks,
Santosh Kotla

Great, glad to hear it.
You don’t need the rrdcached in place on the remote poller, only on the RRD server (central server).
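If you want to confirm from the poller that it really is rrdcached answering on 42217 (nmap will only say "unknown" for that port), something like the Python sketch below might do. It relies on my understanding of the rrdcached text protocol: you send STATS, the first reply line is a number plus a message, a negative number means an error, and otherwise the number says how many "Name: value" lines follow. The function names are mine, and the host you pass in is whatever you put in $config['rrdcached'].

```python
import socket


def parse_status_line(line):
    """Split an rrdcached status line like '9 Statistics follow' into (9, 'Statistics follow')."""
    code_text, _sep, message = line.strip().partition(" ")
    return int(code_text), message


def parse_stat_lines(lines):
    """Turn 'Name: value' lines into a dict of strings."""
    stats = {}
    for line in lines:
        name, _sep, value = line.partition(":")
        stats[name.strip()] = value.strip()
    return stats


def rrdcached_stats(host, port=42217, timeout=5.0):
    """Connect to rrdcached over TCP, send STATS and return the reply as a dict.

    Raises ConnectionError if the daemon closes the connection without
    answering, and RuntimeError if it answers with a negative status code.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"STATS\n")
        with sock.makefile("r", encoding="ascii", errors="replace") as reader:
            first = reader.readline()
            if not first:
                raise ConnectionError("connection closed before any reply")
            code, message = parse_status_line(first)
            if code < 0:
                raise RuntimeError(f"rrdcached returned an error: {message}")
            # A non-negative code is the number of lines that follow.
            lines = [reader.readline() for _ in range(code)]
            return parse_stat_lines(lines)
```

Running rrdcached_stats("centralserverIP") on the poller should either return a dict of counters or raise an exception that tells you whether the connection itself or the daemon's answer is the problem.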

@mzacchi @kalamchi75

Thanks so much for throwing in your suggestions. :smiley: This has been a very long struggle for me, and you have no idea how much help you both have given me so far.

I will work on setting up my poller and will let you know how that goes.

Thanks again,
Santosh Kotla


You are most welcome Santosh.
Good luck and let us know how it goes.

Hi @kalamchi75

So far I managed to knock out the time difference issue between the poller and the central server. It so happened that NTP needed to be configured on the central server and I had to hard update the Hardware Clock to sync with the system time. With that taken care of I am left with two other things:

  1. it tells me that php-memcached is not installed even when it is.
  2. it tells me that the dispatcher service is available on the cluster but not on the node.

I have followed the documents for this and got some help from friends on the Server-Engg team, but it seems like some packages are deprecated on CentOS 8 and that’s why things are not as smooth as expected. I don’t know if this will let me get through or if I will need to step the VM instance down to CentOS 7 or Ubuntu 20.04.

Thanks,
Santosh Kotla

Rule #1: Don’t give up (at least not yet) :wink:
Try searching for php-memcached and CentOS 8 issues.

If you ask me, I would take Ubuntu over CentOS any day, but that’s just personal preference and experience.

As for the dispatcher, I’m not really sure, and I don’t use a cluster. My setup is simple, with two remote pollers reporting to the master.
At this point, you might not even need to worry about memcached. Are your poller(s) connecting to the master?

Hi @kalamchi75

I am so relieved that you’re speaking my mind. I thought it would be stupid to ask if I could overlook these issues.

To answer your question: yes, they are reporting back to the central server. For now I have just one device on the poller that is doing the compute and sending results back to the central server, and that seems to be fine. I am planning to run a MySQL update statement on a batch of devices, maybe 15-20 of them, and see how they report back. I would expect to see a spike in the CPU utilization of my poller.

What are your thoughts on that sir?

Thanks,
Santosh Kotla

Hi Santosh,

Why would you change MySQL?