I’ve been wanting to re-write our SNMP functions to use a class so I can mock them for tests easily. I tossed out just moving the existing code into a class, allowing us to avoid legacy code and keep things clean.
Here is my initial thoughts on an interface:
This will have built in caching, so we will never call snmpget/snmpwalk twice for the same data and the user doesn’t have to be aware if it is cached or not.
I think we can build a nicely parsed result for most use cases, but we leave the getRaw() and walkRaw() functions public for any oddball use cases.
I am not sure if returning an array or an object would be better. I am leaning towards object as it can be more obvious what to expect rather than having print_r() the array to understand what you get. I’ll add clear documentation either way so the result is expected.
I’d especially like to hear if there are thoughts on improving what I have so far or any issues that can be foreseen.
I believe I will be using object for supplying the data to callers.
This will allow the consumer to access the data as they desire without having to resort to SNMP::getRaw().
I will make a collection class LibreNMS\SNMP\DataSet
DataSet will have functions such as getByID(), so if you walk multiple related oids, you can group them by ID.
Anotther example is getByKey(), which will get all data that was walked under a certain key.
Individual pieces of data will be encapsulated in LibreNMS\SNMP\OID objects. Which will have functions such as getKey(), getID(), getData()
Hi @murrant,
I’d like to see a status flag that is checked before every query and set on a timeout.
The thought being that if SNMP on a device is unavailable, we should only timeout the connection once, then it is not tried again for that polling cycle.
Seems like a good path forward - especially the caching side.
I do have a question on that, not knowing enough about PHP and it’s memory allocation of variables, is it a bad thing if we cache 500 interfaces snmp data for all of the polling?
We can check to see how much memory it is. I would think it would be in the single digit megabytes. If using the Python dispatcher, shouldn’t the poller be run for only one device anyway?
Also, this is just the frontend. I plan to accommodate pluggable backends.
Yes poller-wrapper will only fire one poller.php but you may have 16 (or less / or more) firing so if poller.php ends up as 2MB then that’s 16MB just for polling - does that matter? Not everyone throws a lot of resources at things.
I’m not sure how I should handle errors such as unreachable, or oid not present, or when the connection gets interrupted mid fetch. Should I add an item to the collection that gets returned, set a property on the collection, throw an exception? I think for the oid not present, setting an error on the oid item (in the collection) seems reasonable. But I’m not sure what to do with the more generic errors.
Do exceptions stop the execution? If so that doesn’t seem the right way. Returning an error though seems sensible for oid not present as we want to see in debug that snmp didn’t return data we expect.
If a connection gets interrupted mid fetch, I’d say we should discard that data - I can’t see how having part data will be good.
Code is working for NetSNMP, and a bit of POC for PHP-SNMP.
Some testing is written, but needs to be more complete.
The calls all return Collections, which allows for some really nice data manipulation.
Specifically, a DataSet object containing OIDData object(s) (which are both subclasses of Collection) is returned.
Any frequent data manipulation patterns will be added to DataSet and maybe OIDData if appropriate.
OIDData contains many properties with the results of the snmp call.
Example, for core snmp polling this was our previous code:
First we get several variables in a single snmp get. We don’t have to worry if any or all of these failed.
We incorporate a workaround for a device bug and trim \ from the outside of all data values.
There is another rejection for devices with bad snmpEngineTime, which is essentially the same.
Finally, we get the highest uptime that hasn’t been removed. The seconds field is a convenience field that normalizes all time and other integer data that returns seconds.
Allow the cache time to be configurable by the user including completely disabling
poller and disco with either -v or -d should disable caching
we need to look at a way to display the debug data back in a nicer format potentially. Maybe we need to look at -d and -dd with the former just having a nicer output for users and -dd more for dev work with more verbose info).
Thanks for the comments. I finished implementing the enable/disable of the cache, it was there already, but not complete. $config['cache']['enable'] = false; or Config::set('cache.enable', false);
Cache time is at cache.time (it is in seconds)
Also, cache.dir controls the storage location in the case where file caching is used.
I’m not so sure about disabling the cache when debugging. How do you debug an issue that involves cached data? I’ve implemented it for now to test out.
Then we need a flag to show it’s cached and an option to not cache the data, I wouldn’t want to sit there hitting poller for 5 minutes seeing the same data whilst debugging / developing.