Multiple errors in ./validate.php

Sure Thing @kalamchi75

The GUI seems to be working just fine now. No issues or sluggishness seen so far. The CPU is still above 85%, though.

I will send you the recent graphs tomorrow morning. Thanks for all your time and help today Sir. :slight_smile:

Thanks,
Santosh Kotla

You are most welcome, sir,
and I apologize for my limited knowledge.

The more we troubleshoot the more we learn :smiley:

Hi @kalamchi75

So far, the graphs are looking smooth. However, the memory seems to be rising, and I am not sure whether the lighter part of the graph is actually going to affect system performance.

CPU:

Memory:

HTOP:

So, from your observation, if you can confirm that the parameters are below threshold (except for the CPU, which I think will come down after I offload some of the devices to the poller), how do you suggest we start troubleshooting rrdcached?

Thanks,
Santosh Kotla

Hi @kalamchi75

For a start, I have ensured that the rrdcached files are owned by librenms, using the command from the install guide, and then enabled the rrdcached setting in the config that points to the /var/run/rrdcached directory. I ran validate and see that there are no issues.

This is what the status of rrdcached looks like:

Will observe the system for a few minutes and keep an eye on the graphs and the htop as well.

Thanks,
Santosh Kotla

Hi @kalamchi75

Bad luck sir :frowning:

Looks like rrdcached is still not happy:

Any suggestions from your side?

Thanks,
Santosh Kotla

Hi Santosh,

I think the memory is fine.
The CPU still looks loaded, though, but perhaps, as you said, offloading some machines to the remote pollers might bring it down.

Now, rrdcached: I am not sure why it is complaining that it can’t read RRD files.
Can you confirm which user owns the directory /opt/librenms/rrd and all its files and subdirectories?

Hi @kalamchi75

I have run the command from the install documentation again to make sure that the files and subfolders are owned by librenms. Is there a way I can check their current ownership?

On a second note, I see that rrdcached is not happy but looks like my graphs are still plotting! I don’t know what that means… :roll_eyes:

Thanks,
Santosh Kotla

You can simply do

    cd /opt/librenms
    ls -lah

This will show you the directories/files in the librenms directory and their ownership.
You can then

    cd rrd
    ls -lah

This will show you the files inside the rrd directory and their ownership.
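
If the `ls -lah` listing is too long to read through, a sketch like the one below prints only the entries whose owner or group is not the expected one, so empty output means everything is owned correctly. The function name `list_wrong_owner` is made up for illustration; the path and the librenms user and group are the ones mentioned in this thread.

```shell
# List anything under a directory that is NOT owned by the expected
# user or group. Empty output means all ownership matches.
list_wrong_owner() {
    local dir="$1" user="$2" group="$3"
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}

# Example for this thread (run on the central server):
#   list_wrong_owner /opt/librenms/rrd librenms librenms | head -n 20
```

Piping through `head` keeps the output short enough to paste into a reply if anything does turn up.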

Then send me the output of this again:

    systemctl status rrdcached

Hi @kalamchi75

I double-checked, and ALL the files are owned by librenms. The output is quite big, so I don’t know how to send it to you. If you can point me to a site where I can create a temporary file, I will share the link with you.

This is the status of rrdcached:

Thanks,
Santosh Kotla

That’s fine. If you can see that the files there are owned by librenms:librenms, that’s enough; you don’t have to send it.
Please resend the status showing the full lines (after -B -R …)

and send me your rrdcached configuration. I will compare it against mine. You can mask any passwords or IP addresses.

Hi @kalamchi75

This is the rrdcached config file:
DAEMON=/usr/bin/rrdcached
DAEMON_USER=librenms
DAEMON_GROUP=librenms
WRITE_THREADS=4
WRITE_TIMEOUT=1800
WRITE_JITTER=1800
BASE_PATH=/opt/librenms/rrd/
JOURNAL_PATH=/var/lib/rrdcached/journal/
PIDFILE=/run/rrdcached.pid
SOCKFILE=/run/rrdcached.sock
SOCKGROUP=librenms
BASE_OPTIONS="-B -F -R"

BASE_OPTIONS="-l 0:42217"
BASE_OPTIONS="$BASE_OPTIONS -R -j /var/lib/rrdcached/journal/ -F"
BASE_OPTIONS="$BASE_OPTIONS -b /opt/librenms/rrd -B"
BASE_OPTIONS="$BASE_OPTIONS -w 1800 -z 900"
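
One detail in the file above: BASE_OPTIONS is assigned more than once. Assuming the file is sourced by a shell script, as files of this kind usually are (the thread does not say how it is loaded, so this is an assumption), the plain assignment on the `-l 0:42217` line replaces the earlier `-B -F -R` value, and the later lines append to it. The sketch below only replays those assignments to show the string that results.

```shell
# Replay the BASE_OPTIONS assignments from the posted file, in order.
BASE_OPTIONS="-B -F -R"        # first value, overwritten by the next line
BASE_OPTIONS="-l 0:42217"      # plain assignment: discards the value above
BASE_OPTIONS="$BASE_OPTIONS -R -j /var/lib/rrdcached/journal/ -F"
BASE_OPTIONS="$BASE_OPTIONS -b /opt/librenms/rrd -B"
BASE_OPTIONS="$BASE_OPTIONS -w 1800 -z 900"

echo "$BASE_OPTIONS"
# prints: -l 0:42217 -R -j /var/lib/rrdcached/journal/ -F -b /opt/librenms/rrd -B -w 1800 -z 900
```

Under that assumption, -B, -F and -R all survive because the later lines add them back, and the TCP listener on port 42217 is the one the poller config would need to reach.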

Output of the requested rrdcached status lines:
Sep 15 10:45:53 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:45:53 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:03 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:03 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:03 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:03 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:04 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:04 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:04 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.
Sep 15 10:46:04 dc5up-vlibrenms01 rrdcached[12342]: handle_request_update: Could not read RRD file.

Thanks,
Santosh Kotla

Hi @kalamchi75

Going by the graphs, which are still plotting even after rrdcached was put into effect, I went ahead and did the basic configuration on the poller. It comes up as a poller, but I don’t see it as part of the cluster yet.

I have installed rrdcached and memcached on the remote poller. It will be polling for group 1 and the central server is polling for group 0.

Thanks,
Santosh Kotla

I don’t think you need rrdcached or memcached on the remote poller. It should connect to the rrdcached and memcached of the master server.
I’m not sure what you mean by cluster?

Hi @kalamchi75

Alrighty. I have installed them, but if you want me to remove them, that’s possible, or I can just stop the processes.

By cluster I mean that it doesn’t show up in the poller cluster.

The main central server doesn’t show as many issues with the database, but the poller complains about it and asks me to run commands. I also just found out that rrdcached and memcached are not able to talk to the main cluster.

[librenms@dc4up-vlibrenms02 ~]$ ./validate.php

Component Version
LibreNMS 21.8.0-55-gf3fa2ce1e
DB Schema 2021_25_01_0129_isis_adjacencies_nullable (217)
PHP 7.3.20
Python 3.6.8
MySQL 10.5.12-MariaDB-1:10.5.12+maria~bionic
RRDTool 1.7.0
SNMP NET-SNMP 5.8
====================================

[OK] Composer Version: 2.1.7
[OK] Dependencies up-to-date.
[OK] Database connection successful
[FAIL] Time between this server and the mysql database is off
Mysql time 2021-09-15 12:09:23
PHP time 2021-09-15 12:06:37

[FAIL] Database: incorrect column (config/config_value)
[FAIL] Database: incorrect column (isis_adjacencies/port_id)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighSysType)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighSysID)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighPriority)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjLastUpTime)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjAreaAddress)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjIPAddrType)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjIPAddrAddress)
[FAIL] Database: missing column (isis_adjacencies/isisCircAdminState)
[FAIL] Database: extra column (ports/ifHighSpeed)
[FAIL] Database: extra column (ports/ifHighSpeed_prev)
[FAIL] We have detected that your database schema may be wrong, please report the following to us on Discord (https://t.libren.ms/discord) or the community site (https://t.libren.ms/5gscd):
[FIX]:
Run the following SQL statements to fix.
SQL Statements:
ALTER TABLE config CHANGE config_value config_value mediumtext NOT NULL ;
ALTER TABLE isis_adjacencies CHANGE port_id port_id int NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighSysType isisISAdjNeighSysType varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighSysID isisISAdjNeighSysID varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighPriority isisISAdjNeighPriority varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjLastUpTime isisISAdjLastUpTime bigint unsigned NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjAreaAddress isisISAdjAreaAddress varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjIPAddrType isisISAdjIPAddrType varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjIPAddrAddress isisISAdjIPAddrAddress varchar(128) NULL ;
ALTER TABLE isis_adjacencies ADD isisCircAdminState varchar(16) NOT NULL DEFAULT 'off' AFTER isisISAdjIPAddrAddress;
ALTER TABLE ports DROP ifHighSpeed;
ALTER TABLE ports DROP ifHighSpeed_prev;
[INFO] Detected Dispatcher Service
[FAIL] Dispatcher service is enabled on your cluster, but not in use on this node
[FAIL] Missing PHP extension: memcached
[FIX]:
Please install memcached
[FAIL] Cannot connect to rrdcached instance
[FAIL] Cannot connect to rrdcached instance

Thanks,
Santosh Kotla

Run:

    su librenms
    git pull
    ./daily.sh

Once done, run ./validate.php again.

Then we need to fix the following:

    [FAIL] Missing PHP extension: memcached
    [FIX]:
    Please install memcached
    [FAIL] Cannot connect to rrdcached instance
    [FAIL] Cannot connect to rrdcached instance

Install the PHP extension, then, in your LibreNMS config file, point your poller at memcached and rrdcached on the main server.

Please show me how you enable rrdcached and memcached in both configuration files, on the master and on the poller.

Hi @kalamchi75

Sure. I will run all of that and send you the config snippets of both the master and the poller.

./daily.sh seems to be taking a little long at the “Cleaning up DB” step on the poller.

Thanks,
Santosh Kotla

Yep, it takes its sweet time. But that’s normal, give it the time that it needs.

Hi @kalamchi75

This is what I see after running the daily.sh and the validate.

[root@dc4up-vlibrenms02 ~]# su - librenms
Last login: Wed Sep 15 12:06:22 UTC 2021 on pts/0
[librenms@dc4up-vlibrenms02 ~]$ ./daily.sh
Updating to latest codebase OK
Updating Composer packages OK
Updating SQL-Schema OK
Updating submodules OK
Cleaning up DB OK
Fetching notifications OK
Caching PeeringDB data OK
Caching Mac OUI data OK
[librenms@dc4up-vlibrenms02 ~]$ ./validate.php

Component Version
LibreNMS 21.8.0-55-gf3fa2ce1e
DB Schema 2021_25_01_0129_isis_adjacencies_nullable (217)
PHP 7.3.20
Python 3.6.8
MySQL 10.5.12-MariaDB-1:10.5.12+maria~bionic
RRDTool 1.7.0
SNMP NET-SNMP 5.8

====================================

[OK] Composer Version: 2.1.7
[OK] Dependencies up-to-date.
[OK] Database connection successful
[FAIL] Time between this server and the mysql database is off
Mysql time 2021-09-15 15:54:22
PHP time 2021-09-15 15:51:36

[FAIL] Database: incorrect column (config/config_value)
[FAIL] Database: incorrect column (isis_adjacencies/port_id)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighSysType)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighSysID)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjNeighPriority)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjLastUpTime)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjAreaAddress)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjIPAddrType)
[FAIL] Database: incorrect column (isis_adjacencies/isisISAdjIPAddrAddress)
[FAIL] Database: missing column (isis_adjacencies/isisCircAdminState)
[FAIL] Database: extra column (ports/ifHighSpeed)
[FAIL] Database: extra column (ports/ifHighSpeed_prev)
[FAIL] We have detected that your database schema may be wrong, please report the following to us on Discord (https://t.libren.ms/discord) or the community site (https://t.libren.ms/5gscd):
[FIX]:
Run the following SQL statements to fix.
SQL Statements:
ALTER TABLE config CHANGE config_value config_value mediumtext NOT NULL ;
ALTER TABLE isis_adjacencies CHANGE port_id port_id int NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighSysType isisISAdjNeighSysType varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighSysID isisISAdjNeighSysID varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjNeighPriority isisISAdjNeighPriority varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjLastUpTime isisISAdjLastUpTime bigint unsigned NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjAreaAddress isisISAdjAreaAddress varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjIPAddrType isisISAdjIPAddrType varchar(128) NULL ;
ALTER TABLE isis_adjacencies CHANGE isisISAdjIPAddrAddress isisISAdjIPAddrAddress varchar(128) NULL ;
ALTER TABLE isis_adjacencies ADD isisCircAdminState varchar(16) NOT NULL DEFAULT 'off' AFTER isisISAdjIPAddrAddress;
ALTER TABLE ports DROP ifHighSpeed;
ALTER TABLE ports DROP ifHighSpeed_prev;
[INFO] Detected Dispatcher Service
[FAIL] Dispatcher service is enabled on your cluster, but not in use on this node
[FAIL] Missing PHP extension: memcached
[FIX]:
Please install memcached
[FAIL] Cannot connect to rrdcached instance
[FAIL] Cannot connect to rrdcached instance

Following are the snippets from my config file for rrdcached and distributed polling. Please let me know if you want to take a look at the other parts of the config as well.

Poller:

// Distributed Poller-Settings
$config['distributed_poller'] = true;
// optional: defaults to hostname
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '1';
$config['rrdcached'] = "dc5up-vlibrenms01.example.com:42217";
#$config['rrdtool_version'] = '1.7.0';
$config['distributed_poller_memcached_host'] = 'dc5up-vlibrenms01.example.com';
$config['distributed_poller_memcached_port'] = 11211;

Central Server:

// Distributed Poller-Settings
$config['distributed_poller'] = true;
// optional: defaults to hostname
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '0';
#$config['rrdcached'] = "dc5up-vlibrenms01.example.com:42217";
$config['distributed_poller_memcached_host'] = 'dc5up-vlibrenms01.example.com';
$config['distributed_poller_memcached_port'] = 11211;

Thanks,
Santosh Kotla

Hi @kalamchi75

The memcached service is running as per the service status:

[root@dc4up-vlibrenms02 ~]# service memcached status
Redirecting to /bin/systemctl status memcached.service
● memcached.service - memcached daemon
Loaded: loaded (/usr/lib/systemd/system/memcached.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2021-09-15 12:06:17 UTC; 16h ago
Main PID: 1269403 (memcached)
Tasks: 10 (limit: 76498)
Memory: 3.6M
CGroup: /system.slice/memcached.service
└─1269403 /usr/bin/memcached -p 11211 -u memcached -m 64 -c 1024 -l 127.0.0.1,::1

Sep 15 12:06:17 dc4up-vlibrenms02 systemd[1]: Started memcached daemon.

But it still asks me to install memcached.

Any thoughts on this one Sir?

Thanks,
Santosh Kotla

Good morning Santosh,

You can call me Ali by the way :slight_smile:
Here is what I think:

Your remote poller is unable to connect to your master’s rrdcached service. Check for any firewall that might be blocking access. On your remote poller, run:

    nmap -p 42217 <YOUR MASTER SERVER IP>

and see what the status of the port is.
Run the same command to test port 11211 as well.
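
If nmap is not installed on the poller, a rough equivalent can be done with bash alone. This is only a sketch: `port_status` is a hypothetical helper, `/dev/tcp` is a bash feature that some builds disable, and `MASTER_IP` stands in for the master server address.

```shell
# Report whether a TCP port accepts connections, using bash's /dev/tcp
# pseudo-device. Gives up after 3 seconds so a silently dropping
# firewall does not hang the check.
port_status() {
    local host="$1" port="$2"
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open"
    else
        echo "closed or filtered"
    fi
}

# Example (MASTER_IP is a placeholder for the master server address):
#   port_status "$MASTER_IP" 42217   # rrdcached
#   port_status "$MASTER_IP" 11211   # memcached
```

An immediate "closed or filtered" usually points to nothing listening on that address, while one that takes the full three seconds is more typical of a firewall dropping packets.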

As for memcached, I think it is asking you to install the PHP extension, not the memcached service itself. Try

    apt install php-memcached

or php7-memcached (you might need to check the exact name of the extension).
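
To confirm whether the extension is actually loaded for the command-line PHP that validate.php runs under, the module list printed by `php -m` can be checked. `has_php_module` is an illustrative helper, not part of LibreNMS. Also, the `service memcached status` output posted earlier looks like it comes from an RPM-based system, in which case `apt` would not be available and the package would be installed with yum or dnf under a different name; that is a guess from the output, not something confirmed in the thread.

```shell
# Succeeds if the named module appears in a module list read from stdin
# (one name per line, as printed by `php -m`). The match is on the whole
# line and case-insensitive, so "memcache" does not count as "memcached".
has_php_module() {
    grep -qix -- "$1"
}

# Example:
#   php -m | has_php_module memcached && echo "loaded" || echo "missing"
```

Once the package is installed, the validate.php complaint should go away only if this check reports the module as loaded.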

As for the database failures, perhaps you should open a separate thread just for those and ask the LibreNMS team for advice on what to do. I am not sure why the database is issuing those failure warnings.

You definitely also need to find out why your PHP and MySQL times differ.
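
To put a number on the time difference, the two timestamps from the validate.php output can be subtracted. This is a small sketch that assumes GNU `date` (for the `-d` option); `skew_seconds` is a made-up helper name.

```shell
# Difference in seconds between two "YYYY-MM-DD HH:MM:SS" timestamps.
# Both are parsed as UTC so daylight saving changes cannot affect the result.
skew_seconds() {
    local a b
    a=$(date -u -d "$1" +%s)
    b=$(date -u -d "$2" +%s)
    echo $(( a - b ))
}

# Mysql time and PHP time copied from the first validate.php run in this thread.
skew_seconds "2021-09-15 12:09:23" "2021-09-15 12:06:37"   # prints 166
```

The second validate.php run (15:54:22 against 15:51:36) gives the same 166 seconds, which would be consistent with a fixed clock offset between the poller and the database host rather than a time zone mismatch. Checking the time synchronization service (NTP or chrony) on both machines would be a reasonable next step.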