Add behaviours to "Scheduled Maintenance"

This is a feature request I would probably implement myself, but first I want to gather feedback here: is this even okay and feasible, or what should be changed?

First question: British or American English? I will go with “Behaviour” (British) for now, but I can change it, of course.

Current Behaviour

Scheduled maintenances allow you to put devices, device groups and/or locations into maintenance for a certain period of time (start and end have to be defined), although recurring maintenance is available as well. During that time period, device status is checked as usual, but all alert processing is skipped. So existing alerts persist regardless of the situation, and new alerts won’t be triggered. This behaviour should be preserved and remain the default.

Desired Behaviours

Recently I wanted to add a feature for disabling the alert transport per device only (Feat/enable disabling of alert transporting by mwobst · Pull Request #17500 · librenms/librenms · GitHub). That was rejected as being too specific and too narrow; the feature should rather be incorporated into scheduled maintenances. Thus I want to have a way to control the behaviour of maintenances.

Behaviour description

  • “Skip alert rules checks” = current behaviour

  • “Disable alert transport” = alerts are checked as usual, but any transports are suppressed (essentially the muting feature present on alert rules)

  • “Information only” = objects affected by the maintenance are treated normally concerning alerts but are still “flagged” with a screwdriver at their overview page (like now)

Variant 1: Set behaviour at Scheduled Maintenance

This is probably easier to implement. We would add the column “behaviour” to the table alert_schedules, with the following values:

  • skip_alert_rule_checks = “Skip alert rules checks”

  • disable_alert_transport = “Disable alert transport” (basically muting alerts)

  • information_only = “Information only”

The last one would only add the screwdriver to the device page and nothing else. The first one would be the default at database level.

The GUI would simply add a dropdown menu with these three options, with “Skip alert rules checks” preselected when creating a maintenance.

Variant 2: Set behaviour per object affected by a Scheduled Maintenance

Probably more difficult to implement, but allows more detailed management. The values remain the same as in variant 1, but the column is instead added to the table “alert_schedulables”, which contains all items affected by maintenances. If, for example, work is planned within a building but only a certain number of devices will/should be affected, you could put these devices into the “disable_alert_transport” section and the whole device group representing the building (or the location, if properly set) into “information_only”. Then you would still receive notifications for all devices that should not have an outage but might, because unexpected things can happen.
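To make the building example more concrete, here is a minimal sketch in plain PHP. The array keys (`type`, `id`, `behaviour`), the ids and the helper `behavioursForDevice()` are made up for illustration; they are not the real alert_schedulables columns or any existing LibreNMS function.

```php
<?php

// Hypothetical rows of alert_schedulables for ONE maintenance, with the
// proposed "behaviour" value per affected object. Keys and ids are invented.
$schedulables = [
    ['type' => 'device',       'id' => 11, 'behaviour' => 'disable_alert_transport'],
    ['type' => 'device',       'id' => 12, 'behaviour' => 'disable_alert_transport'],
    ['type' => 'device_group', 'id' => 3,  'behaviour' => 'information_only'],
];

/**
 * Collect every behaviour that applies to a device, either directly or
 * through one of the device groups it belongs to.
 */
function behavioursForDevice(array $schedulables, int $deviceId, array $groupIds): array
{
    $found = [];
    foreach ($schedulables as $row) {
        $direct = $row['type'] === 'device' && $row['id'] === $deviceId;
        $viaGroup = $row['type'] === 'device_group' && in_array($row['id'], $groupIds, true);
        if ($direct || $viaGroup) {
            $found[] = $row['behaviour'];
        }
    }

    return array_values(array_unique($found));
}

// Device 11 is being worked on and belongs to the building's group 3.
$worked = behavioursForDevice($schedulables, 11, [3]);
// Device 20 sits in the same building but is expected to stay up.
$bystander = behavioursForDevice($schedulables, 20, [3]);
```

Device 11 ends up with both behaviours, device 20 only with “information_only”, so an unexpected outage of device 20 would still be alerted and transported.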

The dialogue for adding/managing maintenances would change. “Map to” is basically multiplied: there would be one such field per behaviour, labelled with the behaviour it represents.

My opinion

I would actually go with variant 2 even though it is more difficult. But the main question is: Is this too obscure for the typical user who doesn’t know anything about the underlying mechanism?

(We could live with variant 1, I guess.)

Both variants need a hierarchy. Two different maintenances might affect the same device, so a device could be marked with both “disable_alert_transport” and “information_only”; in that case the “disable_alert_transport” behaviour must take effect. “skip_alert_rule_checks” would have precedence over the other two, of course.
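The hierarchy could be resolved roughly like this. It is only a sketch: the constant `BEHAVIOUR_PRECEDENCE` and the function `effectiveBehaviour()` are hypothetical names, and the input is simply the list of behaviours of all maintenances currently affecting one device.

```php
<?php

// Proposed column values, ordered from weakest to strongest.
const BEHAVIOUR_PRECEDENCE = [
    'information_only',
    'disable_alert_transport',
    'skip_alert_rule_checks',
];

/**
 * Return the strongest behaviour in the list, or null if the list is empty
 * or contains only unknown values.
 */
function effectiveBehaviour(array $behaviours): ?string
{
    $best = null;
    $bestRank = -1;
    foreach ($behaviours as $behaviour) {
        $rank = array_search($behaviour, BEHAVIOUR_PRECEDENCE, true);
        if ($rank !== false && $rank > $bestRank) {
            $best = $behaviour;
            $bestRank = $rank;
        }
    }

    return $best;
}
```

With this ordering, a device covered by an “information_only” and a “disable_alert_transport” maintenance gets its transports suppressed, and any “skip_alert_rule_checks” maintenance wins over both.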

Thoughts about implementation

  • app/Models/Device.php => isUnderMaintenance() => probably add parameter $behaviour = 'skip_alert_rule_checks'. The function would filter devices by “added column = passed value”. This parameter should also allow ‘all’ (or ‘any’?) as a value.

  • LibreNMS/Alert/AlertUtil.php => isMaintenance($device_id) => add same parameter

  • LibreNMS/Alert/RunAlerts.php => the variable $noiss prevents alert transports if true. Should only happen when AlertUtil::isMaintenance($alert['device_id']) or AlertUtil::isMaintenance($alert['device_id'], 'disable_alert_transport') (so basically expand the condition currently on line 549)

  • LibreNMS/Polling/ConnectivityHelper.php, function updateAvailability: $this->device->isUnderMaintenance() => call with ‘skip_alert_rule_checks’

  • app/Http/Controllers/Table/DeviceController.php, function formatItem => call $device->isUnderMaintenance() with ‘all’

  • resources/views/device/header.blade.php => call $device->isUnderMaintenance() with ‘all’

  • no changes to LibreNMS/Alert/AlertRules.php (or use ‘skip_alert_rule_checks’ for the sake of explicitness)
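As a rough illustration of the parameter idea from the list above, here is a standalone sketch. It does not touch the real Device model or AlertUtil: both functions are simplified stand-ins, `$activeBehaviours` is assumed to be the list of behaviours of all currently active maintenances for one device (in LibreNMS this would come from a database query), and `shouldSuppressTransports()` is a hypothetical name for what $noiss expresses.

```php
<?php

/**
 * Stand-in for the proposed isUnderMaintenance($behaviour) check.
 * 'all' matches any active maintenance, regardless of its behaviour.
 */
function isUnderMaintenance(array $activeBehaviours, string $behaviour = 'skip_alert_rule_checks'): bool
{
    if ($behaviour === 'all') {
        return $activeBehaviours !== [];
    }

    return in_array($behaviour, $activeBehaviours, true);
}

/**
 * Transports are suppressed for the existing default behaviour and for the
 * new 'disable_alert_transport' behaviour, but not for 'information_only'.
 */
function shouldSuppressTransports(array $activeBehaviours): bool
{
    return isUnderMaintenance($activeBehaviours)
        || isUnderMaintenance($activeBehaviours, 'disable_alert_transport');
}
```

Because the default value stays 'skip_alert_rule_checks', existing callers that pass no argument would keep the current behaviour, while the device list and header views would ask for 'all' to decide whether to show the screwdriver.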

Should I make use of LibreNMS/Enum/? Something like LibreNMS/Enum/AlertScheduleBehaviour.php:

<?php

namespace LibreNMS\Enum;

class AlertScheduleBehaviour
{
    // Values match the proposed "behaviour" column; GUI labels would be mapped separately.
    public const SKIP_ALERT_RULE_CHECKS = 'skip_alert_rule_checks';
    public const DISABLE_ALERT_TRANSPORT = 'disable_alert_transport';
    public const INFORMATION_ONLY = 'information_only';
}

And finally …

What about documentation? There is no explicit “Device” documentation, sadly. What would be a fitting place? A new Markdown document?