Ninja for hire?

Not sure this is the best place, but I am wondering if there are any LibreNMS ninjas available for hire? Basically I’m a company that has LibreNMS running, but I don’t have the time to dig deep and tweak it out fully.

Just wondering if there’s anyone here willing to get paid and help? Or are there folks who are true ninjas and do this kind of work on the side?

Things I need:

Cool latency measurements between my sites (Smokeping?)
Backup of configs for switches (Extreme), maybe with Oxidized?
Ability to poll my Dell DRACs for SMART disk drive info. (I can poll the DRACs today, I just don’t get the SMART info. Drives fail and I don’t know it.)
Fixing my existing monitoring of certain devices (most work, just a few things are missing):
Extreme switches poll, and bandwidth is observed and graphed, but CPU and memory aren’t.
Windows WMI measurements for SQL, Active Directory, stuff like that (i.e. deeper than just CPU and memory).

Overall, I’m just looking for someone who is passionate about it and would set things up to the hilt. I’ll pay, that’s not the issue. (I’m not moneybags, but I can certainly pay.)

If this is the wrong place, please pardon and point me to the right place.


I got no replies from anyone. Sort of strange. Is anyone interested?

None of the devs do paid work at present. This may change, but it hasn’t happened yet.

As this is a community project, you are more than welcome to open feature requests in https://community.librenms.org/c/feature-requests for each one you require.

Well…I don’t really need help with more features. I’m just trying to get existing ones to work, and competing priorities make it difficult to have to work for the answers. I get it, it’s an open source project. People contribute for the benefit of everyone else. And of course the cost is great. I’m just saying…I don’t care about the cost, I need to get X solved quickly.

Take for example the new SMART features to see if a drive is failing, alert on it, etc. It’s not at all obvious how to really get that working. I’ve enabled it in the applications area and rediscovered the devices (I’ve tried DRACs and VMware boxes), and I get nada. I could troubleshoot it if the basics of troubleshooting the app were obvious.

So I guess any ideas on SMART and basically how to find out if a hard drive is failing, has failed, etc. so we can alert on it and correct before real bad things happen?


SMART reporting is definitely a valuable feature, but I’ll caution that SMART codes are not consistent between vendors. I think a standardized “healthy-failing-dead” approach to monitoring could be developed, but it will require a bit of legwork to get all of the various vendor codes to line up. For instance, drive temperature reporting is not standardized: it could be Celsius or Fahrenheit depending on the drive vendor. So if a drive reports 100, does it mean 100 F or 100 C? The first is merely warm (and normal), the second can boil water. Deriving a failure state may require a bit more logic and/or data, so that the correct scaling and numbers are used for upper bound alerting.
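To make the scaling point concrete, here is a minimal Python sketch of normalizing a raw temperature reading before applying an upper bound. It is only an illustration, not how LibreNMS does it: the vendor names, the vendor-to-unit table, and the 50/60 C thresholds are all made-up assumptions, and a real table would have to be verified per vendor.

```python
# Hypothetical vendor-to-unit table. The vendor names are placeholders and
# the units are assumptions for illustration only.
VENDOR_TEMP_UNIT = {"vendor_a": "C", "vendor_b": "F"}


def to_celsius(raw, unit):
    """Convert a raw temperature reading to Celsius."""
    if unit == "C":
        return float(raw)
    if unit == "F":
        return (raw - 32) * 5.0 / 9.0
    raise ValueError(f"unknown temperature unit: {unit!r}")


def classify_temperature(vendor, raw, warn_c=50.0, crit_c=60.0):
    """Return 'ok', 'warning' or 'critical' for a raw reading.

    warn_c and crit_c are illustrative thresholds, not vendor limits.
    """
    unit = VENDOR_TEMP_UNIT[vendor]  # KeyError if the vendor is not listed
    temp_c = to_celsius(raw, unit)
    if temp_c >= crit_c:
        return "critical"
    if temp_c >= warn_c:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # The same raw value of 100 lands in different states depending on the unit.
    for vendor in ("vendor_a", "vendor_b"):
        print(vendor, classify_temperature(vendor, 100))
```

The point is just that the threshold comparison only makes sense after the unit is known, which is why some per-vendor data is needed.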

OK…I understand that SMART can throw off strange data, but basically I’m trying to understand how to even enable polling for SMART type errors. I see that it’s a new “application” in LibreNMS to turn on, but I’m unclear on what to configure or otherwise do once I “turn it on”. Is there some other config in other areas?

I’ll say in advance that I just don’t see anything on the feature. That doesn’t mean I’m looking in the right place for how to enable and actually use it.

Here is everything you need to know to enable the SMART application: http://docs.librenms.org/Extensions/Applications/#smart

Thank you. It’s now evident that in this context, SMART monitoring is only applicable to a Linux box, by running a script to extend SNMP on said box.

Trying to get SMART data from a DRAC in this case will be like squeezing blood from a turnip.

I guess the only way would be if your DRAC exposes SMART data via SNMP itself. If not, then I’m afraid you’re out of luck.

We already have DRAC disk states in:

    // Columns: state index id, description, draw graph, raw value, generic status
    array($state_index_id, 'unknown', 0, 1, 3),
    array($state_index_id, 'online', 0, 2, 0),
    array($state_index_id, 'failed', 0, 3, 2),
    array($state_index_id, 'degraded', 0, 4, 1)

It might not be SMART info, but it will tell you some status.
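For anyone trying to read that table, here is a rough Python sketch of how the rows could be interpreted. It is not LibreNMS code. It assumes the last two columns are the raw value reported by the DRAC and a generic status where 0 is ok, 1 is warning, 2 is critical and 3 is unknown; the names `DRAC_DISK_STATES`, `GENERIC_LABELS` and `describe_state` are made up for the example.

```python
from typing import NamedTuple

# Assumed meaning of the generic status column (0 ok, 1 warning,
# 2 critical, 3 unknown).
GENERIC_LABELS = {0: "ok", 1: "warning", 2: "critical", 3: "unknown"}


class StateTranslation(NamedTuple):
    descr: str       # human-readable state name
    draw_graph: int  # graph flag, 0 in every row of the table above
    value: int       # raw value reported by the device
    generic: int     # generic status code used for alerting


# Same rows as the table above, minus the state index id.
DRAC_DISK_STATES = [
    StateTranslation("unknown", 0, 1, 3),
    StateTranslation("online", 0, 2, 0),
    StateTranslation("failed", 0, 3, 2),
    StateTranslation("degraded", 0, 4, 1),
]


def describe_state(raw_value):
    """Return (description, generic label) for a raw disk state value."""
    for state in DRAC_DISK_STATES:
        if state.value == raw_value:
            return state.descr, GENERIC_LABELS[state.generic]
    # Values missing from the table are treated as unknown.
    return "unknown", GENERIC_LABELS[3]


if __name__ == "__main__":
    for raw in (1, 2, 3, 4):
        print(raw, describe_state(raw))
```

So a disk reporting 3 would show as failed with a critical status, and 4 as degraded with a warning, which is enough to hang an alert rule on even without real SMART attributes.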