Distributed poller - multiple Redis hosts

I’m switching from an existing cron-based (memcached) deployment to the distributed poller (Redis). I have a 3-master Redis cluster, but the dispatcher service seems to expect a single host:port in the config. None of the variants I tried for passing multiple hosts seem to be supported:

REDIS_HOST=hostA:port,hostB:port,hostC:port 
REDIS_HOST=hostA,hostB,hostC

It looks like service.py expects a single host:port endpoint.
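For illustration only, here is a minimal sketch of what parsing a comma-separated value like the REDIS_HOST variants above would involve. The function name `parse_redis_hosts` and the default port are hypothetical; this is not what service.py does, and IPv6 literals are not handled.

```python
def parse_redis_hosts(value, default_port=6379):
    """Split 'hostA:6379,hostB' into [('hostA', 6379), ('hostB', 6379)].

    Hypothetical helper: entries without a port fall back to default_port.
    """
    endpoints = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and empty entries
        host, sep, port = item.rpartition(":")
        if sep:
            endpoints.append((host, int(port)))
        else:
            endpoints.append((item, default_port))
    return endpoints


print(parse_redis_hosts("hostA:6379,hostB, hostC:7000"))
```

A list of (host, port) pairs like this is the shape that multi-endpoint Redis clients generally take, which is why a single host:port setting cannot describe the cluster.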

Is everyone else using a single Redis instance in their deployments, or how do you benefit from Redis HA?

===========================================
Component | Version
--------- | -------
LibreNMS  | 23.1.0-19-gaa033ec3c (2023-02-02T13:07:59+00:00)
DB Schema | 2022_08_15_084507_add_rrd_type_to_wireless_sensors_table (248)
PHP       | 8.1.14
Python    | 3.9.13
Database  | MariaDB 10.3.36-MariaDB-log
RRDTool   | 1.7.0
SNMP      | 5.8
===========================================

Looking at the container poller deployment, perhaps the expectation is that every poller runs Redis locally?

I think it is a limitation of the Laravel predis package, which only allows a single entry for either a Redis server or a Sentinel server. I was never able to get it to work with Redis sharding across multiple masters.

I ended up using HAProxy with a VIP and keepalived, in front of one Redis master and two replicas. The HAProxy VIP always points you to the current Redis master.

https://blogs.oracle.com/cloud-infrastructure/post/deploying-highly-available-redis-replication-with-haproxy-on-oracle-cloud-infrastructure
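Proxy setups of this kind typically pick the master by sending `INFO replication` to each backend and looking for `role:master` in the reply. That this particular deployment does so is an assumption. A minimal sketch of that check in Python, with hypothetical function names and a hand-written sample reply:

```python
def parse_info_replication(raw):
    """Turn the text reply of `INFO replication` into a dict of key -> value."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# Replication' section header
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def is_master(raw):
    """True if the reply reports role:master (replicas report role:slave)."""
    return parse_info_replication(raw).get("role") == "master"


# Hand-written sample reply, not captured from a real server.
sample = "# Replication\r\nrole:master\r\nconnected_slaves:2\r\n"
print(is_master(sample))
```

The check only tells the proxy where to send new connections. A client that keeps a dead connection open still has to notice and reconnect on its own.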

The problem I see now is that when a replica is promoted to master, all of the distributed pollers stop updating and I have to restart librenms.service on all of the systems.

After testing failovers you end up with multiple librenms cluster masters until you restart every librenms service. I tried using the librenms watchdog, but that didn’t fix anything; I still end up with zero devices polled and zero devices pending.

I’m seeing similar behaviour with a 3-master Redis setup: Sentinel replies with ‘-MOVED’, but the Redis client does not follow the redirect or reconnect.

Edit:
I was using Python 3.6, which pulls in redis module v4.3.4. I switched to Python 3.8, which can use redis==4.5.1, to no avail.
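For context on the ‘-MOVED’ reply mentioned above: in the Redis Cluster protocol a redirect has the form `MOVED <slot> <host>:<port>`, and a client that is not cluster-aware reports it as an error instead of following it. A minimal sketch of what following the redirect starts with, using a hypothetical helper name and a made-up address:

```python
def parse_moved(error_text):
    """Parse 'MOVED <slot> <host>:<port>' into (slot, host, port).

    Returns None if the text is not a MOVED redirect.
    """
    parts = error_text.lstrip("-").split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


# Made-up example reply; the slot and address are illustrative only.
print(parse_moved("-MOVED 3999 127.0.0.1:6381"))
```

A cluster-aware client would then reconnect to the returned host and port and retry the command there.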

I tried the redis-py 4.5.1 Python package as well, along with compiling Redis 7.0.7 server and Sentinel. It doesn’t make a difference.

Once the librenms service disconnects from the master during an outage (i.e., a replica promotion, or Sentinel telling it to move), it stops and never tries to connect to the cache on the new master.
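The missing behaviour described above is essentially "on a connection error, ask again where the master is, then retry". A minimal sketch of that loop with hypothetical names and fake stand-ins for discovery and the Redis call. This is not how the librenms service is written.

```python
import time


def call_with_rediscovery(discover, operation, attempts=3, delay=0.0):
    """Run operation(conn); on ConnectionError, rediscover the master and retry.

    `discover` returns a fresh connection (or address) for the current master.
    With redis-py one would catch redis.exceptions.ConnectionError instead,
    since it does not derive from the built-in ConnectionError.
    """
    last_error = None
    for _ in range(attempts):
        try:
            conn = discover()  # ask again where the master is
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay)
    raise last_error


# Fake stand-ins: the first lookup returns the old master, the second the new one.
masters = iter(["old-master", "new-master"])


def fake_discover():
    return next(masters)


def fake_operation(conn):
    if conn == "old-master":
        raise ConnectionError("connection lost")
    return f"PONG from {conn}"


result = call_with_rediscovery(fake_discover, fake_operation)
print(result)
```

The point of re-running `discover` on every attempt is that a cached connection to the demoted master is never reused after a failover.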

I thought the watchdog was supposed to restart the librenms service, but that is not happening either.

I think that, as written, it uses connection.py and client.py from the redis-py distribution. The newer redis-py 4.5.1 has cluster.py, and I think that would need to be involved to actually handle Sentinel moves or replica promotions.
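For reference, redis-py 4.x ships both a Sentinel-aware client (`redis.sentinel.Sentinel`) and a cluster-aware one (`redis.cluster.RedisCluster`). Whether the librenms service can be pointed at either is not established in this thread. A minimal sketch of how they are constructed, where the wrapper names, the addresses, and the service name `mymaster` are placeholders:

```python
def sentinel_master_client(sentinels, service_name):
    """Return a client that asks Sentinel for the current master when it connects.

    `sentinels` is a list of (host, port) pairs for the Sentinel processes.
    """
    from redis.sentinel import Sentinel  # imported lazily; requires redis-py

    sentinel = Sentinel(sentinels, socket_timeout=0.5)
    return sentinel.master_for(service_name, socket_timeout=0.5)


def cluster_client(host, port):
    """Return a cluster-aware client that follows MOVED redirects itself.

    Connects to the given node at construction time to learn the slot layout.
    """
    from redis.cluster import RedisCluster  # imported lazily; requires redis-py

    return RedisCluster(host=host, port=port)


# Placeholder usage (needs reachable servers, so left commented out):
# client = sentinel_master_client([("hostA", 26379), ("hostB", 26379)], "mymaster")
# client.set("key", "value")
```

The Sentinel variant suits a one-master, two-replica layout, while the cluster variant matches a multi-master sharded setup.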

