I have written some custom services for tracking player numbers on servers that support SourceQuery.
I have noticed that under certain conditions my check doesn’t return a response and just stays open forever. This is obviously bad design on my part and I need to fix it.
However, it has revealed something that should probably be fixed in LibreNMS: service checks currently have no timeout. The problem this has caused for me is that I end up with an ever-increasing number of Python service wrappers running, and eventually they fill up the RAM and swap and effectively take the server offline. In essence, the lack of a timeout acts like a sort of memory leak, because hung checks are never cleaned up.
Whilst service checks should be written properly and not stay open forever, I think that LibreNMS should have a failsafe that kills service checks that are taking too long.
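To illustrate the kind of failsafe I mean, here is a minimal Python sketch of a wrapper that kills a check once it exceeds a time limit. This is not how LibreNMS runs checks; `run_check` and `timeout_s` are names I made up for the example, and returning 3 for a timed-out check simply follows the Nagios plugin convention for UNKNOWN.

```python
import subprocess
import sys

# Nagios plugin convention: exit code 3 means UNKNOWN.
UNKNOWN = 3


def run_check(cmd, timeout_s=10.0):
    """Run a check command and kill it if it runs longer than timeout_s.

    Hypothetical helper for illustration only. Returns a tuple of
    (exit_code, output_text).
    """
    try:
        # subprocess.run kills the child and waits for it when the
        # timeout expires, then raises TimeoutExpired.
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return UNKNOWN, f"check timed out after {timeout_s}s"
    return proc.returncode, proc.stdout.strip()
```

Calling `run_check([sys.executable, "-c", "import time; time.sleep(60)"], timeout_s=1)` returns after about a second with the UNKNOWN code, so the hung process does not linger. One caveat: only the direct child is killed, so a check that spawns its own subprocesses could still leave those behind.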