My LibreNMS service intermittently errors out, which causes all polling to stop. At times polling stays stopped for 6+ hours, until I (or the maintenance job) restart librenms-service, at which point polling tasks start being queued again. This has happened twice in the past 24 hours and a dozen or so times in the past 3 weeks. Oddly, Billing, Alerting, and Services all seem to continue to run. I also see the Watchdog reporting that the log file was updated 0s ago.
I would appreciate anyone’s expertise to help me out. I have combed through the documentation multiple times to diagnose this issue over the past weeks, but I am running out of luck.
Everything runs on a single Debian VM, with the Dispatcher Service configured to use Redis as its locking mechanism. There is not much documentation on Redis, so I initially thought I had misconfigured it. Below are my validate output and the two syslog errors from the past 24 hours.
====================================
Component | Version
--------- | -------
LibreNMS | 21.1.0-7-ge42a6e36a
DB Schema | 2020_09_19_230114_add_foreign_keys_to_service_templates_device_table (197)
PHP | 7.3.19-1~deb10u1
Python | 3.7.3
MySQL | 10.3.27-MariaDB-0+deb10u1
RRDTool | 1.7.1
SNMP | NET-SNMP 5.7.3
====================================
[OK] Composer Version: 2.0.9
[OK] Dependencies up-to-date.
[WARN] Debug enabled. This is a security risk.
[OK] Database connection successful
[OK] Database schema correct
[INFO] Detected Dispatcher Service
First error:
Feb 3 08:19:58 servername librenms-service.py[553]: ValueError: <bound method Service.reap of <LibreNMS.service.Service object at 0x7f901578f748>> is not a valid Handlers
Feb 3 08:19:58 servername librenms-service.py[553]: During handling of the above exception, another exception occurred:
Feb 3 08:19:58 servername librenms-service.py[553]: Traceback (most recent call last):
Feb 3 08:19:58 servername librenms-service.py[553]: File "/opt/librenms/librenms-service.py", line 48, in <module>
Feb 3 08:19:58 servername librenms-service.py[553]: service.start()
Feb 3 08:19:58 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/service.py", line 404, in start
Feb 3 08:19:59 servername librenms-service.py[553]: sleep(self.config.master_resolution)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/service.py", line 550, in reap
Feb 3 08:19:59 servername librenms-service.py[553]: handler = signal(SIGCHLD, SIG_DFL)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/signal.py", line 48, in signal
Feb 3 08:19:59 servername librenms-service.py[553]: return _int_to_enum(handler, Handlers)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/signal.py", line 30, in _int_to_enum
Feb 3 08:19:59 servername librenms-service.py[553]: return enum_klass(value)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 310, in __call__
Feb 3 08:19:59 servername librenms-service.py[553]: return cls.__new__(cls, value)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 564, in __new__
Feb 3 08:19:59 servername librenms-service.py[553]: raise exc
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 548, in __new__
Feb 3 08:19:59 servername librenms-service.py[553]: result = cls._missing_(value)
Feb 3 08:19:59 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 577, in _missing_
Feb 3 08:19:59 servername librenms-service.py[553]: raise ValueError("%r is not a valid %s" % (value, cls.__name__))
Feb 3 08:19:59 servername librenms-service.py[553]: TypeError: 'int' object is not callable
Second error:
Feb 4 01:36:52 servername librenms-service.py[553]: ValueError: <bound method Service.reap of <LibreNMS.service.Service object at 0x7efd8bcbf6d8>> is not a valid Handlers
Feb 4 01:36:52 servername librenms-service.py[553]: During handling of the above exception, another exception occurred:
Feb 4 01:36:52 servername librenms-service.py[553]: Traceback (most recent call last):
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/librenms-service.py", line 48, in <module>
Feb 4 01:36:52 servername librenms-service.py[553]: service.start()
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/service.py", line 395, in start
Feb 4 01:36:52 servername librenms-service.py[553]: self.dispatch_immediate_polling(device_id, group)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/service.py", line 425, in dispatch_immediate_polling
Feb 4 01:36:52 servername librenms-service.py[553]: self.poller_manager.post_work(device_id, group)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/queuemanager.py", line 83, in post_work
Feb 4 01:36:52 servername librenms-service.py[553]: self.get_queue(queue_id).put(payload)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/__init__.py", line 321, in put
Feb 4 01:36:52 servername librenms-service.py[553]: self._redis.zadd(self.key, {item: time()}, nx=True)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/client.py", line 2323, in zadd
Feb 4 01:36:52 servername librenms-service.py[553]: return self.execute_command('ZADD', name, *pieces, **options)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/client.py", line 775, in execute_command
Feb 4 01:36:52 servername librenms-service.py[553]: return self.parse_response(connection, command_name, **options)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/client.py", line 789, in parse_response
Feb 4 01:36:52 servername librenms-service.py[553]: response = connection.read_response()
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/connection.py", line 637, in read_response
Feb 4 01:36:52 servername librenms-service.py[553]: response = self._parser.read_response()
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/connection.py", line 290, in read_response
Feb 4 01:36:52 servername librenms-service.py[553]: response = self._buffer.readline()
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/connection.py", line 224, in readline
Feb 4 01:36:52 servername librenms-service.py[553]: self._read_from_socket()
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/connection.py", line 182, in _read_from_socket
Feb 4 01:36:52 servername librenms-service.py[553]: data = recv(self._sock, socket_read_size)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3/dist-packages/redis/_compat.py", line 58, in recv
Feb 4 01:36:52 servername librenms-service.py[553]: return sock.recv(*args, **kwargs)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/opt/librenms/LibreNMS/service.py", line 550, in reap
Feb 4 01:36:52 servername librenms-service.py[553]: handler = signal(SIGCHLD, SIG_DFL)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/signal.py", line 48, in signal
Feb 4 01:36:52 servername librenms-service.py[553]: return _int_to_enum(handler, Handlers)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/signal.py", line 30, in _int_to_enum
Feb 4 01:36:52 servername librenms-service.py[553]: return enum_klass(value)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 310, in __call__
Feb 4 01:36:52 servername librenms-service.py[553]: return cls.__new__(cls, value)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 564, in __new__
Feb 4 01:36:52 servername librenms-service.py[553]: raise exc
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 548, in __new__
Feb 4 01:36:52 servername librenms-service.py[553]: result = cls._missing_(value)
Feb 4 01:36:52 servername librenms-service.py[553]: File "/usr/lib/python3.7/enum.py", line 577, in _missing_
Feb 4 01:36:52 servername librenms-service.py[553]: raise ValueError("%r is not a valid %s" % (value, cls.__name__))
Feb 4 01:36:52 servername librenms-service.py[553]: TypeError: 'int' object is not callable
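For anyone trying to reproduce this: both tracebacks end in `reap` at service.py line 550, on the call `signal(SIGCHLD, SIG_DFL)`, once while the main loop was in `sleep` and once while it was reading from the Redis socket. Below is a minimal sketch of that one call in isolation. The `Service` class and the `swap_handler` helper are hypothetical stand-ins I wrote for illustration, not the real LibreNMS code, and the sketch does not reproduce the failure itself.

```python
import signal


class Service:
    """Hypothetical stand-in for LibreNMS.service.Service (not the real class)."""

    def reap(self, signo, frame):
        # Placeholder body; the real handler cleans up finished child processes.
        pass


def swap_handler(service):
    """Install service.reap for SIGCHLD, then reset it the way line 550 does.

    Returns whatever handler was installed before the reset.
    Must be called from the main thread (a requirement of signal.signal).
    """
    signal.signal(signal.SIGCHLD, service.reap)
    # Same call as in the traceback: switch SIGCHLD back to the default
    # action and get the previously installed handler as the return value.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return previous


if __name__ == "__main__":
    svc = Service()
    prev = swap_handler(svc)
    # The previous handler is the bound method, not a signal.Handlers member.
    print("previous handler is svc.reap:", prev == svc.reap)
    print("current handler:", signal.getsignal(signal.SIGCHLD))
```

Run on its own, this completes without an exception for me. As far as I can tell from the Python 3.7 `signal.py` source, the wrapper first tries to convert the returned handler to a `signal.Handlers` enum member and catches the resulting `ValueError` itself when the handler is a plain callable, so the `ValueError` line in the logs looks like that internal lookup, and the `TypeError: 'int' object is not callable` is the exception that actually escapes. I have not been able to work out what triggers it.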