Hi, I'm hoping someone can give me a little kick in the right direction. I've spent the last week or so testing LibreNMS, getting it installed and configured. So far it is moving along nicely: the Proxmox, MySQL and nginx application monitors are working smoothly, the agent is working, and snmpd with 'extend' is working. So I have lots of data being harvested and it looks very nice.
One thing I'm having trouble figuring out is how to set up an alert notification for a monitored MySQL server (i.e. a production database server, which has nothing to do with LibreNMS's own operation) for things such as:
- mysql - slow query (e.g. a query running for >10min)
- mysql - excessive table locks (i.e. above some threshold number)
or even something like
CPU load pinned 100% for >5min on mysql server or load > (Value) for (5-10min) for example.
I've looked through the docs, and the discussions I've found don't seem to be along these lines.
I have a feeling this is possible; but LibreNMS appears to be so versatile in config options that it is a little bit overwhelming as well. So …
I’m curious if anyone has experience with this that they are willing to share / or suggestions on how I can maybe get something like this working… ?
You could rework these ideas to retrieve slow-query information using a script and feed it back as metrics in LibreNMS - you can then use those metrics in the application_metrics table to generate alerts.
I've not used the mysql agent setup yet, but if it doesn't handle slow queries, have a look at another application in the list, such as docker, and see what its scripts do. They can be quite simple, though I'm not sure every script has matching code inside LibreNMS to handle its output. The graphed ones certainly do, but you may be able to get non-graphed data into the database without much effort.
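To make the script idea a bit more concrete, here is a minimal sketch in Python of the counting step only. It assumes you already have rows shaped like the output of SHOW FULL PROCESSLIST (the "Command" and "Time" keys match that statement's columns); the function name, the 600-second default and the sample rows are made up for illustration, and fetching the rows from MySQL and wiring the result into snmpd 'extend' or the agent is left out.

```python
# Sketch only: count long-running MySQL queries from processlist-style rows.
# Row keys ("Command", "Time") follow the columns of SHOW FULL PROCESSLIST;
# the function name and the default threshold are illustrative assumptions.

def count_slow_queries(rows, threshold_seconds=600):
    """Return how many active queries have run longer than threshold_seconds."""
    slow = 0
    for row in rows:
        # Idle connections report Command == "Sleep"; only count real queries.
        if row.get("Command") != "Query":
            continue
        # Time is the number of seconds the thread has spent in its current state.
        if int(row.get("Time") or 0) > threshold_seconds:
            slow += 1
    return slow


if __name__ == "__main__":
    # Hypothetical sample rows standing in for a real processlist result.
    sample = [
        {"Id": 1, "Command": "Sleep", "Time": 4000},
        {"Id": 2, "Command": "Query", "Time": 12},
        {"Id": 3, "Command": "Query", "Time": 905},
    ]
    # Print a single integer so a wrapper (e.g. an snmpd 'extend' line) can return it.
    print(count_slow_queries(sample))
```

Returning one plain integer keeps the output easy to pick up as a single value, which you could then alert on with a simple "greater than 0" style rule.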
Can't seem to access the correct doco for this lately, but in the device config you can add custom values, which then appear in the customoids table.
I've used an alert delay before to stop quick 95% port utilisation spikes from being alerted when they only happened within a 5min window - I was only interested if they were sustained for longer, so this kept them quiet. In your case, the opposite would be true - you should be able to delay the 100% CPU alert by 5+ minutes, so it will only let you know when the condition was triggered and then maintained until the delay time expires.
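As a rough illustration of what the delay buys you, here is a small Python sketch of the "only alert if it stays high" logic. This is not LibreNMS code, just the idea: the function name, the 300-second poll interval and the sample values are assumptions for the example.

```python
# Sketch only: fire an alert only when a metric stays above a threshold
# for a whole delay window. Not LibreNMS code; all names are illustrative.

def sustained_breach(samples, threshold, delay_seconds, poll_interval=300):
    """samples: metric values, oldest first, one reading per poll_interval seconds."""
    # The first breaching poll starts the clock; the breach must still hold
    # at every poll until delay_seconds have passed, hence the extra +1.
    needed = -(-delay_seconds // poll_interval) + 1  # ceiling division, plus the first poll
    if len(samples) < needed:
        return False
    return all(value > threshold for value in samples[-needed:])


if __name__ == "__main__":
    # Hypothetical CPU readings (percent), one per 5-minute poll.
    spike = [20, 100, 25]
    pinned = [20, 100, 100]
    print(sustained_breach(spike, 95, 300))
    print(sustained_breach(pinned, 95, 300))
```

A one-poll spike never satisfies the check, while a reading that is still above the threshold one delay period later does, which is the behaviour you want for a "pinned for >5min" alert.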