Hi Team,
I am using a distributed poller setup with two pollers, and the database servers are running in a Galera Cluster.
./validate.php
Component | Version |
---|---|
LibreNMS | 1.69-1-gbc02ab3f6 |
DB Schema | 2020_07_27_00522_alter_devices_snmp_algo_columns (188) |
PHP | 7.2.24-0ubuntu0.18.04.6 |
Python | 3.6.9 |
MySQL | 10.1.47-MariaDB-0ubuntu0.18.04.1 |
RRDTool | 1.7.0 |
SNMP | NET-SNMP 5.7.3 |
OpenSSL |
====================================
[OK] Composer Version: 1.10.22
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[WARN] PHP version 7.3 is the minimum supported version as of November, 2020. We recommend you update PHP to a supported version (7.4 suggested) to continue to receive updates. If you do not update PHP, LibreNMS will continue to function but stop receiving bug fixes and updates.
[WARN] Your install is over 24 hours out of date, last update: Tue, 03 Nov 2020 01:56:49 +0000
[FIX]:
Make sure your daily.sh cron is running and run ./daily.sh by hand to see if there are any errors.
[WARN] Your local git branch is not master, this will prevent automatic updates.
[FIX]:
You can switch back to master with git checkout master
Issue
Since yesterday, our LibreNMS has stopped sending alerts via Slack. I have tested the alert transports and I do receive test alerts. I have also tried running ./alerts.php but get no output. However, when I add debug to it, it only shows the following:
DEBUG!
SQL[update cache_locks set owner = ?, expiration = ? where key = ? and (owner = ? or expiration <= ?) ["BNcin1MMPQPHO9AE",1619730704,"laravel_cachealerts","BNcin1MMPQPHO9AE",1619644304] 0.48ms]
======================
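To make the two numeric parameters in the cache_locks query above easier to read, here is a minimal Python sketch that converts them to UTC and computes the gap between them. It assumes both values are Unix epoch seconds, and that the value bound to "expiration <= ?" is the time the script ran; neither is confirmed by the output itself. The variable names are mine.

```python
from datetime import datetime, timezone

# Values copied from the bound parameters of the cache_locks UPDATE above.
new_expiration = 1619730704    # bound to "expiration = ?"
compared_against = 1619644304  # bound to "expiration <= ?" (assumed to be "now")


def to_utc(epoch_seconds):
    """Interpret a Unix timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Gap between the requested expiration and the comparison time.
lock_window_seconds = new_expiration - compared_against

print("compared against:", to_utc(compared_against).isoformat())
print("new expiration:  ", to_utc(new_expiration).isoformat())
print("window (seconds):", lock_window_seconds)
```

The two values are exactly 86400 seconds apart, so under the assumptions above the query asks for a lock that expires 24 hours after the run.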
I have also tested test-alert.php; when I run it, I do receive alerts in my Slack channels.
In addition, I have tested "test-template.php", and the following is the output of that command:
SQL[select * from devices where hostname = ? limit 1 ["10.1.96.196"] 0.75ms]
SQL[SELECT alerts.id, alerts.alerted, alerts.device_id, alerts.rule_id, alerts.state, alerts.note, alerts.info FROM alerts WHERE alerts.device_id=358 && alerts.rule_id=76 [] 0.49ms]
SQL[SELECT alert_log.id,alert_log.rule_id,alert_log.device_id,alert_log.state,alert_log.details,alert_log.time_logged,alert_rules.rule,alert_rules.severity,alert_rules.extra,alert_rules.name,alert_rules.query,alert_rules.builder,alert_rules.proc FROM alert_log,alert_rules WHERE alert_log.rule_id = alert_rules.id && alert_log.device_id = ? && alert_log.rule_id = ? && alert_rules.disabled = 0 ORDER BY alert_log.id DESC LIMIT 1 [358,76] 0.46ms]
SQL[SELECT DISTINCT a.* FROM alert_rules a
LEFT JOIN alert_device_map d ON a.id=d.rule_id AND (a.invert_map = 0 OR a.invert_map = 1 AND d.device_id = ?)
LEFT JOIN alert_group_map g ON a.id=g.rule_id AND (a.invert_map = 0 OR a.invert_map = 1 AND g.group_id IN (SELECT DISTINCT device_group_id FROM device_group_device WHERE device_id = ?))
LEFT JOIN alert_location_map l ON a.id=l.rule_id AND (a.invert_map = 0 OR a.invert_map = 1 AND l.location_id IN (SELECT DISTINCT location_id FROM devices WHERE device_id = ?))
LEFT JOIN device_group_device dg ON g.group_id=dg.device_group_id AND dg.device_id = ?
WHERE a.disabled = 0 AND (
(d.device_id IS NULL AND g.group_id IS NULL)
OR (a.invert_map = 0 AND (d.device_id=? OR dg.device_id=?))
OR (a.invert_map = 1 AND (d.device_id != ? OR d.device_id IS NULL) AND (dg.device_id != ? OR dg.device_id IS NULL))
) [358,358,358,358,358,358,358,358] 0.63ms]
SQL[SELECT hostname, sysName, sysDescr, sysContact, os, type, ip, hardware, version, purpose, notes, uptime, status, status_reason, locations.location FROM devices LEFT JOIN locations ON locations.id = devices.location_id WHERE device_id = ? [358] 0.44ms]
SQL[select * from devices_attribs where devices_attribs.device_id = ? and devices_attribs.device_id is not null [358] 0.38ms]
SQL[select * from device_perf where device_id = ? order by timestamp desc limit 1 [358] 0.41ms]
SQL[select * from alert_templates where exists (select * from alert_template_map where alert_templates.id = alert_template_map.alert_templates_id and alert_rule_id = ?) limit 1 [76] 0.46ms]
SQL[select * from alert_templates where name = ? limit 1 ["Default Alert Template"] 0.44ms]
Array
(
[hostname] => 10.1.96.196
[sysName] => vik-jumpbox
[sysDescr] => Hardware: Intel64 Family 6 Model 37 Stepping 1 AT/AT COMPATIBLE - Software: Windows Version 6.3 (Build 19042 Multiprocessor Free)
[sysContact] =>
[os] => windows
[type] => server
[ip] =>
[hardware] => Intel x64
[version] => 10 (NT 6.3)
[serial] =>
[features] =>
[location] =>
[uptime] => 795
[uptime_short] => 13m 15s
[uptime_long] => 13 minutes 15 seconds
[description] =>
[notes] =>
[alert_notes] =>
[device_id] => 358
[rule_id] => 76
[id] => 35149
[proc] =>
[status] => 0
[status_reason] => icmp
[ping_timestamp] =>
[ping_loss] => 100
[ping_min] => 0
[ping_max] => 0
[ping_avg] => 0
[debug] => Array
(
)
[title] => Alert for device 10.1.96.196 - P3_ASE_Servers_Device Down! Due to no ICMP response.
[faults] => Array
(
[1] => Array
(
[device_id] => 358
[inserted] => 2021-04-28 20:30:34
[hostname] => 10.1.96.196
[sysName] => vik-jumpbox
[ip] =>
[overwrite_ip] =>
[community] => public
[authlevel] =>
[authname] =>
[authpass] =>
[authalgo] =>
[cryptopass] =>
[cryptoalgo] =>
[snmpver] => v2c
[port] => 161
[transport] => udp
[timeout] =>
[retries] =>
[snmp_disable] => 0
[bgpLocalAs] =>
[sysObjectID] => .1.3.6.1.4.1.311.1.1.3.1.1
[sysDescr] => Hardware: Intel64 Family 6 Model 37 Stepping 1 AT/AT COMPATIBLE - Software: Windows Version 6.3 (Build 19042 Multiprocessor Free)
[sysContact] =>
[version] => 10 (NT 6.3)
[hardware] => Intel x64
[features] => Multiprocessor
[location_id] =>
[os] => windows
[status] => 0
[status_reason] => icmp
[ignore] => 0
[disabled] => 0
[uptime] => 795
[agent_uptime] => 0
[last_polled] => 2021-04-29 06:55:26
[last_poll_attempted] =>
[last_polled_timetaken] => 2.7
[last_discovered_timetaken] => 3.06
[last_discovered] => 2021-04-29 06:53:10
[last_ping] => 2021-04-29 06:55:26
[last_ping_timetaken] => 0.95
[purpose] =>
[type] => server
[serial] =>
[icon] => windows.svg
[poller_group] => 6
[override_sysLocation] => 0
[notes] =>
[port_association_mode] => 1
[max_depth] => 0
[disable_notify] => 0
[string] => sysObjectID = .1.3.6.1.4.1.311.1.1.3.1.1; sysDescr = Hardware: Intel64 Family 6 Model 37 Stepping 1 AT/AT COMPATIBLE - Software: Windows Version 6.3 (Build 19042 Multiprocessor Free);
)
)
[elapsed] => 1m 56s
[builder] => {"condition":"AND","rules":[{"id":"macros.device_down","field":"macros.device_down","type":"integer","input":"radio","operator":"equal","value":"1"},{"id":"devices.status_reason","field":"devices.status_reason","type":"string","input":"text","operator":"equal","value":"icmp"}],"valid":true}
[uid] => 35149
[alert_id] => 4558
[severity] => critical
[rule] =>
[name] => P3_ASE_Servers_Device Down! Due to no ICMP response.
[timestamp] => 2021-04-29 07:00:26
[contacts] => Array
(
)
[state] => 1
[alerted] => 0
[transport] => slack
[msg] => Alert for device 10.1.96.196 - P3_ASE_Servers_Device Down! Due to no ICMP response.
Severity: critical
Timestamp: 2021-04-29 07:00:26
Unique-ID: 35149
Rule: P3_ASE_Servers_Device Down! Due to no ICMP response. Faults:
1: sysObjectID = .1.3.6.1.4.1.311.1.1.3.1.1; sysDescr = Hardware: Intel64 Family 6 Model 37 Stepping 1 AT/AT COMPATIBLE - Software: Windows Version 6.3 (Build 19042 Multiprocessor Free);
Alert sent to:
)
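For readability, the [builder] JSON in the output above can be flattened into a one-line condition. This is only a rough sketch: the describe() helper is hypothetical, it handles just the flat two-rule structure shown in this output, and it does not reflect how LibreNMS itself evaluates rules.

```python
import json

# The [builder] value copied verbatim from the test-template.php output above.
builder_json = (
    '{"condition":"AND","rules":['
    '{"id":"macros.device_down","field":"macros.device_down","type":"integer",'
    '"input":"radio","operator":"equal","value":"1"},'
    '{"id":"devices.status_reason","field":"devices.status_reason","type":"string",'
    '"input":"text","operator":"equal","value":"icmp"}],"valid":true}'
)


def describe(builder):
    """Join each rule as 'field operator value' using the top-level condition.

    Hypothetical helper: nested rule groups are not handled.
    """
    joiner = f" {builder['condition']} "
    clauses = [
        f"{rule['field']} {rule['operator']} {rule['value']}"
        for rule in builder["rules"]
    ]
    return joiner.join(clauses)


summary = describe(json.loads(builder_json))
print(summary)
```

So the rule fires when the device is down and the recorded reason is icmp, which matches the [status] and [status_reason] values in the dump.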
/etc/cron.d/librenms ----> Master poller
# Using this cron file requires an additional user on your system, please see install docs.
33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/5 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
* * * * * librenms /opt/librenms/alerts.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/poll-billing.php >> /dev/null 2>&1
01 * * * * librenms /opt/librenms/billing-calculate.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/check-services.php >> /dev/null 2>&1
# Daily maintenance script. DO NOT DISABLE!
# If you want to modify updates:
#  Switch to monthly stable release: https://docs.librenms.org/General/Releases/
#  Disable updates: https://docs.librenms.org/General/Updating/
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
/etc/cron.d/librenms ----> Second poller
----> While troubleshooting, I have commented out the alerts cron job on the second poller.
# Using this cron file requires an additional user on your system, please see install docs.
33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/5 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
#* * * * * librenms /opt/librenms/alerts.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/poll-billing.php >> /dev/null 2>&1
01 * * * * librenms /opt/librenms/billing-calculate.php >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/check-services.php >> /dev/null 2>&1
# Daily maintenance script. DO NOT DISABLE!
# If you want to modify updates:
#  Switch to monthly stable release: https://docs.librenms.org/General/Releases/
#  Disable updates: https://docs.librenms.org/General/Updating/
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
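Since pasted cron files are easy to mangle, here is a small Python sketch for sanity-checking lines in the /etc/cron.d format (five time fields, a user, then the command). The check_cron_line() helper and its regular expression are my own. It checks only the shape of each time field, not whether the numbers are in range, and it would also flag environment lines such as PATH=..., so treat it as a rough check rather than a full cron parser. The second sample line is deliberately malformed for illustration.

```python
import re

# One time field: "*", a number, or a range, each with an optional "/step",
# optionally repeated as a comma-separated list.
FIELD = re.compile(r"^(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*$")


def check_cron_line(line):
    """Return None if the line looks well-formed, otherwise a short reason."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank lines and comments are skipped
    parts = stripped.split(None, 6)
    if len(parts) < 7:
        return "expected 5 time fields, a user, and a command"
    for position, field in enumerate(parts[:5], start=1):
        if not FIELD.match(field):
            return f"time field {position} is not valid: {field!r}"
    return None


sample_lines = [
    "* * * * * librenms /opt/librenms/alerts.php >> /dev/null 2>&1",
    "/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16",
    "15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1",
]

for sample in sample_lines:
    problem = check_cron_line(sample)
    print("OK " if problem is None else "BAD", sample, "" if problem is None else f"({problem})")
```

Only the second sample is reported as BAD, because its first time field is "/5" rather than "*/5".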
Regards,
Vik