I’ve written a custom Munin-style agent to collect data on job queues for HTCondor. When a submit node gets very busy, the command that fetches job statuses sometimes stalls and takes much longer than the default 10s unix-agent timeout, so the poll times out.
When this timeout happens, a new munin_plugins_ds entry is created with an empty string for ds_name. Once this happens the RRD graphs fail, because it tries to construct a graph command with a malformed data source, DEF:=/path/to/munin_htcondor_.rrd, with that empty ds_name string in the DEF (between the : and = where a real name should be).
I have written timeouts into the Munin script, and upped the unix-agent timeout in config.php, but it’d be nicer if the empty ds_name entries weren’t created.
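
As a rough illustration of the plugin-side timeout mentioned above, here is a minimal Python sketch (not the actual script). It assumes the plugin shells out to a status command and prints Munin-style `field.value N` lines. The field names, the 8-second budget, and the `condor_q -af JobStatus` invocation are assumptions made for the example. On a timeout it prints `U` (Munin's marker for an unknown value) for every field instead of truncated or empty output. Whether the poller handles `U` cleanly is something I have not verified.

```python
import subprocess

# Hypothetical field names, keyed by HTCondor JobStatus code (assumed mapping).
STATUS_NAMES = {"1": "idle", "2": "running", "5": "held"}


def run_with_timeout(cmd, timeout_s):
    """Return the command's stdout, or None if it fails or exceeds timeout_s."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return None
    return result.stdout


def count_states(stdout):
    """Count jobs per state, assuming one numeric status code per line."""
    counts = {name: 0 for name in STATUS_NAMES.values()}
    for token in stdout.split():
        name = STATUS_NAMES.get(token)
        if name is not None:
            counts[name] += 1
    return counts


def format_values(fields, counts):
    """Build 'field.value N' lines; emit 'U' for every field if counts is None."""
    lines = []
    for field in fields:
        value = "U" if counts is None else counts.get(field, 0)
        lines.append(f"{field}.value {value}")
    return lines


if __name__ == "__main__":
    # Assumed command; keep the budget below the agent's own timeout.
    out = run_with_timeout(["condor_q", "-af", "JobStatus"], timeout_s=8)
    counts = None if out is None else count_states(out)
    print("\n".join(format_values(list(STATUS_NAMES.values()), counts)))
```

The point of the design is that every line the plugin prints always carries a field name, even when the status command hangs, so nothing with a blank name is handed to the poller from the plugin side.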