Add supervisord monitoring

We would like to monitor Supervisor, specifically the state of its processes. That way we can add alerts if, for instance, a process enters the "FATAL" state.

How would I go about adding this to LibreNMS? I currently have the client code ready which outputs something like this:
    {
      "version": 1,
      "error": 0,
      "errorString": 0,
      "data": {
        "api_version": "3.0",
        "supervisor_version": "3.4.0",
        "processes": [
          {
            "name": "consumers-default_00",
            "group": "consumers",
            "statename": "RUNNING",
            "state": 20,
            "error": null,
            "start": 1641897745,
            "stop": 1641897744,
            "now": 1641898092,
            "uptime": 347
          },
          {
            "name": "consumers-default_01",
            "group": "consumers",
            "statename": "RUNNING",
            "state": 20,
            "error": null,
            "start": 1641897745,
            "stop": 1641897744,
            "now": 1641898092,
            "uptime": 347
          }
        ]
      }
    }

The main issue I am facing at the moment is that the most interesting part is the actual state, which is ideally a string. I believe LibreNMS only stores numbers, so this is a problem.
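
One possible way around the numbers-only limitation, sketched here under assumptions rather than as how LibreNMS handles it: Supervisor already pairs every state name with a numeric code (the `state` field in the output above), so the state can be stored as a number and translated back for display, or reduced to a count of processes per state. The helper name `count_states` and the `SAMPLE` payload are made up for illustration; the state codes are the ones Supervisor documents.

```python
import json
from collections import Counter

# Supervisor's documented process state codes.
SUPERVISOR_STATES = {
    "STOPPED": 0,
    "STARTING": 10,
    "RUNNING": 20,
    "BACKOFF": 30,
    "STOPPING": 40,
    "EXITED": 100,
    "FATAL": 200,
    "UNKNOWN": 1000,
}

# Trimmed-down version of the client output shown above (illustrative only).
SAMPLE = json.loads("""
{
  "version": 1,
  "error": 0,
  "data": {
    "processes": [
      {"name": "consumers-default_00", "statename": "RUNNING", "state": 20},
      {"name": "consumers-default_01", "statename": "RUNNING", "state": 20}
    ]
  }
}
""")


def count_states(payload):
    """Return the number of processes in each state, zero-filled."""
    counts = Counter({name: 0 for name in SUPERVISOR_STATES})
    for proc in payload["data"]["processes"]:
        name = proc["statename"]
        # Anything unexpected is counted as UNKNOWN instead of being dropped.
        if name not in SUPERVISOR_STATES:
            name = "UNKNOWN"
        counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    for state, total in count_states(SAMPLE).items():
        print(f"{state}={total}")
```

Per-state counts are plain numbers, so they can be graphed and alerted on (for example, "FATAL greater than 0") without storing any strings.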

I also wonder whether it is possible to show a table instead of a graph? I haven't come across any examples of this.
If not, what should the graph look like to clearly show the status of each process?

Thanks

The quickest way would be to write a script that accepts a -H HOSTNAME argument and produces its output and exit code in the Nagios check script format, then use the service checks to call it.

These are the expected status codes the script should return:

[0 => 'OK', 1 => 'Warning', 2 => 'Critical', 3 => 'Unknown']

It can also record performance data from the script's output, but you'll need to check the code to understand how to use that.
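
A minimal sketch of such a check script follows. It rests on several assumptions that are not from this thread: Supervisor's XML-RPC interface (`inet_http_server`) is enabled and reachable without authentication, port 9001 is used (the `-p` flag is my own addition), and the policy of FATAL mapping to Critical and any other non-RUNNING state to Warning is only one reasonable choice. The part after the `|` follows the usual Nagios `label=value` performance data convention.

```python
#!/usr/bin/env python3
import argparse
import sys
import xmlrpc.client

# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(processes):
    """Turn a list of Supervisor process structs into (exit_code, message)."""
    fatal = [p["name"] for p in processes if p["statename"] == "FATAL"]
    other = [p["name"] for p in processes
             if p["statename"] not in ("RUNNING", "FATAL")]
    running = len(processes) - len(fatal) - len(other)
    perfdata = f"running={running} fatal={len(fatal)} total={len(processes)}"

    # Assumed policy: FATAL is critical, anything else not RUNNING is a warning.
    if fatal:
        return CRITICAL, f"CRITICAL - FATAL: {', '.join(fatal)} | {perfdata}"
    if other:
        return WARNING, f"WARNING - not running: {', '.join(other)} | {perfdata}"
    return OK, f"OK - {running} processes running | {perfdata}"


def main(argv=None):
    parser = argparse.ArgumentParser(description="Check Supervisor process states")
    parser.add_argument("-H", "--hostname", required=True)
    parser.add_argument("-p", "--port", type=int, default=9001)  # assumed port
    args = parser.parse_args(argv)

    try:
        proxy = xmlrpc.client.ServerProxy(f"http://{args.hostname}:{args.port}/RPC2")
        processes = proxy.supervisor.getAllProcessInfo()
    except Exception as exc:  # connection or protocol problems map to UNKNOWN
        print(f"UNKNOWN - could not query supervisor: {exc}")
        return UNKNOWN

    code, message = evaluate(processes)
    print(message)
    return code


if __name__ == "__main__":
    sys.exit(main())
```

Keeping `evaluate` separate from the network call makes the state-to-exit-code policy easy to adjust, for example if STOPPED processes are intentional and should not raise a warning.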

I changed my approach a bit and created a PR.

It adds stats on the total number of processes and their states.
It also shows the uptime per process.

This should give us enough data to add alerts on the important things.
