Zombie/defunct process issue after OS upgrade of distributed pollers from CentOS 7.9 to RHEL 8.9

I’m seeing similar issues with a poller that I added to an existing setup.

There was a version mismatch at some point, since I didn’t check out a specific tag when adding the new poller, but only the new poller has the problem. The mismatch was between monthly releases in the 24 train, so I’m not sure whether the database part makes a difference here.

So it looks like, for me, it was caused by the Graphite integration. The new poller didn’t have firewall permission to connect to Graphite, so something was hanging and leaving the defunct processes behind. After I allowed the traffic, I don’t see the defunct processes anymore!
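For anyone who wants to rule out the same cause, here is a minimal sketch of a reachability check to run from the affected poller. It only tests whether a TCP connection can be opened within a timeout. The function name can_connect is made up for illustration, and the host and port in the usage comment are placeholders (2003 is Carbon's usual plaintext port, but your setup may differ).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    A firewall that silently drops packets usually shows up as a timeout,
    while a closed port is refused immediately. Both cases return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


# Example usage (placeholder host and port, replace with your own):
#   print(can_connect("graphite.example.com", 2003))
```

If this returns False on the new poller but True on the existing ones, the firewall is worth a look before anything else.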

pm.max_children is a php-fpm service setting. I haven’t increased it, but I don’t think this could be the issue. Have you tried this though?

Still no joy here for me: 8 defunct PHP processes overnight after a restart of the LibreNMS dispatcher. I’m just restarting the service every week or so. Not ideal.

Is maintenance running? It should be restarting the process every night (I don’t think it shows in the unit run time). Or maybe the way maintenance restarts the process doesn’t clear zombies.
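As background on why a restart may or may not clear them: a zombie stays in the process table until its parent calls wait() on it, or until the parent itself exits and init reaps it. The sketch below is a generic Linux illustration of that, not LibreNMS code. The helper names read_state and zombie_demo are made up for this example.

```python
import os
import time


def read_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as fh:
            text = fh.read()
    except OSError:
        return None
    # The state is the first field after the closing parenthesis of the command name.
    return text[text.rindex(")") + 1:].split()[0]


def zombie_demo():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running any cleanup

    # Parent: the child is now dead but not yet reaped, so it should show as "Z".
    deadline = time.monotonic() + 2.0
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = read_state(pid)
    before = state

    os.waitpid(pid, 0)        # reap the child
    after = read_state(pid)   # the /proc entry disappears once it is reaped
    return before, after
```

So if the nightly restart replaces the parent process completely, the old zombies should vanish with it. If they survive, the parent that owns them is probably still the same process.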

I am also still facing the issue and have to restart the librenms service every fifth day. Maintenance is definitely running, as the defunct process count increases every night. On day 1 after a service restart, there are around 70 defunct processes in my case, since the pollers poll many network devices; on day 2 it doubles, and so on.
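To track the growth and see which parent the zombies belong to, something like the following rough sketch could be run from cron and logged. It reads /proc directly on Linux and makes no assumptions about LibreNMS itself. The function names parse_stat, find_zombies and zombies_per_parent are made up for this example.

```python
import os
from collections import Counter


def parse_stat(text):
    """Parse the start of a /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the first "(" and the last ")".
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    fields = text[right + 1:].split()
    return pid, comm, fields[0], int(fields[1])


def find_zombies(proc_root="/proc"):
    """Return a list of (pid, comm, ppid) for every process in state Z."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state, ppid = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process vanished or the line was unreadable; skip it
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


def zombies_per_parent(zombies):
    """Count zombies per parent PID, so the process that is not reaping stands out."""
    return Counter(ppid for _pid, _comm, ppid in zombies)


# Example usage:
#   for ppid, count in zombies_per_parent(find_zombies()).most_common():
#       print(ppid, count)
```

The parent PID with the highest count is the process to look at, since that is the one not reaping its children.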

Mine is less pronounced, and I’m not running distributed pollers. It increases by around 10 per day; see graph below.

Happy to pull logs etc., but not sure what to look for at this point.