New service logging exceptions

I updated to the new service, and upgraded the redis nodes to 5.0.5 and the redis python package to 3.2.1.
Polling seems to work just fine, but the journal is filling up with exceptions.

user@librenms:~$ sudo su - librenms
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import redis
>>> redis.__version__
'3.2.1'
>>> r = redis.Redis(host='192.168.255.100', port=6378, db=0)
>>> r.info('server')['redis_version']
'5.0.5'

Traceback (most recent call last):
  File "/opt/librenms/LibreNMS/queuemanager.py", line 54, in _service_worker
    device_id = self.get_queue(queue_id).get(True, 10)
  File "/opt/librenms/LibreNMS/__init__.py", line 307, in get
    item = self._redis.bzpopmin(self.key, timeout=timeout)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 2409, in bzpopmin
    return self.execute_command('BZPOPMIN', *keys)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 775, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 789, in parse_response
    response = connection.read_response()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 637, in read_response
    response = self._parser.read_response()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 290, in read_response
    response = self._buffer.readline()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 224, in readline
    self._read_from_socket()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 199, in _read_from_socket
    (e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
Poller_0-2(ERROR):Poller poller exception! Error while reading from socket: ('Connection closed by server.',)
Traceback (most recent call last):
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 185, in _read_from_socket
    raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/librenms/LibreNMS/queuemanager.py", line 54, in _service_worker
    device_id = self.get_queue(queue_id).get(True, 10)
  File "/opt/librenms/LibreNMS/__init__.py", line 307, in get
    item = self._redis.bzpopmin(self.key, timeout=timeout)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 2409, in bzpopmin
    return self.execute_command('BZPOPMIN', *keys)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 775, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/client.py", line 789, in parse_response
    response = connection.read_response()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 637, in read_response
    response = self._parser.read_response()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 290, in read_response
    response = self._buffer.readline()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 224, in readline
    self._read_from_socket()
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 199, in _read_from_socket
    (e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
Discovery_0-4(ERROR):Discovery poller exception! Error while reading from socket: ('Connection closed by server.',)
Traceback (most recent call last):
  File "/opt/librenms/.local/lib/python3.6/site-packages/redis/connection.py", line 185, in _read_from_socket
    raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.

This seems to have been caused by the haproxy we have in front of the redis servers. Pointing REDIS_HOST in .env directly at the redis master gets rid of all the exceptions. Guess I'll have to look into what's causing the issue with haproxy!

edit: the issue was caused by timeouts in the haproxy config that were too short
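
For anyone wondering what this looks like from the client side, here is a rough stdlib-only sketch (it is not LibreNMS or redis-py code). A toy server stands in for a proxy with an idle timeout and drops a connection that has been quiet for too long; a client blocked on a read then sees the socket closed, which is the condition redis-py reports as "Connection closed by server." The names idle_closing_server and blocking_read_result are made up for illustration, and the timeouts are fractions of a second only so the demo runs quickly.

```python
import socket
import threading


def idle_closing_server(listener, idle_timeout_s):
    """Accept one client and drop it after idle_timeout_s without traffic.

    Hypothetical stand-in for a proxy idle timeout, not real haproxy behaviour
    in every detail.
    """
    conn, _ = listener.accept()
    conn.settimeout(idle_timeout_s)
    try:
        while True:
            if not conn.recv(1024):
                break  # client went away first
    except socket.timeout:
        pass  # nothing received within the idle window
    finally:
        conn.close()


def blocking_read_result(idle_timeout_s, client_wait_s):
    """Block on a read for up to client_wait_s and report how the wait ended."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # any free local port
    listener.listen(1)
    server = threading.Thread(
        target=idle_closing_server, args=(listener, idle_timeout_s)
    )
    server.start()

    client = socket.create_connection(listener.getsockname())
    client.settimeout(client_wait_s)
    try:
        data = client.recv(1024)
        # An empty read means the peer closed the connection.
        result = "closed by server" if data == b"" else "data"
    except socket.timeout:
        result = "client timed out first"
    finally:
        client.close()
        server.join()
        listener.close()
    return result


# Idle timeout shorter than the client's blocking wait: the read is cut off.
print(blocking_read_result(idle_timeout_s=0.2, client_wait_s=2.0))
# Idle timeout longer than the client's wait: the client returns on its own.
print(blocking_read_result(idle_timeout_s=2.0, client_wait_s=0.2))
```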

@adammmmm I am also having issues with poller threads exiting with SERVER_CLOSED_CONNECTION_ERROR when trying to connect to Redis. I have HAproxy in front routing requests to the read/write replica of a redis sentinel cluster.

What timeout settings did you change in the HAproxy config?

Setting timeout server and timeout client in haproxy higher than your polling interval should fix this. The ansible role I used for haproxy had default timeouts set to 60s and we have a librenms polling interval of 60s… so I’m guessing if the performance thread in the dispatcher didn’t send updates within that 60s window, the connection would die.

I now have both timeout settings in haproxy set to 120s (and also our redis server tcp-keepalive set to 60s), hopefully that works!
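
If you want to sanity-check your own config against that rule of thumb, something like the sketch below might help. It is only a rough helper I would write for this, not part of LibreNMS or haproxy: the function names and EXAMPLE_CONFIG are made up, it only looks at "timeout client" and "timeout server" lines, and it assumes haproxy's usual time format (a number with an optional us/ms/s/m/h/d suffix, milliseconds when there is no suffix).

```python
import re

# HAProxy time suffixes mapped to seconds; a bare number means milliseconds.
_UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}


def haproxy_time_to_seconds(value):
    """Convert an haproxy time value such as '60s' or '2m' to seconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value.strip())
    if match is None:
        raise ValueError("unrecognised haproxy time value: %r" % value)
    number, unit = match.groups()
    return int(number) * _UNITS[unit or "ms"]


def short_timeouts(config_text, polling_interval_s):
    """Return {name: seconds} for client/server timeouts not above the interval."""
    problems = {}
    for line in config_text.splitlines():
        parts = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(parts) >= 3 and parts[0] == "timeout" and parts[1] in ("client", "server"):
            seconds = haproxy_time_to_seconds(parts[2])
            if seconds <= polling_interval_s:
                problems[parts[1]] = seconds
    return problems


# Hypothetical config resembling the 60s defaults described above.
EXAMPLE_CONFIG = """
defaults
    mode tcp
    timeout connect 5s
    timeout client 60s
    timeout server 60s
"""

# With a 60s polling interval both timeouts get flagged; at 120s neither would.
print(short_timeouts(EXAMPLE_CONFIG, polling_interval_s=60))
```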

edit: typos