I wrote about this a while back here
I then wrote about how it impacts my extensive weathermap collection (my configs directory is almost 1 megabyte of plain text) so heavily that I made a lazy script to fix the maps when IDs change, here
The other place it plagues me is on aggregate dashboard port graph widgets, as they store the interfaces by ID in the database.
I’d never thought about the alert issue when a rule references an ID, as I don’t have any alerts like that. Possible solutions:
- Rework your alerts to reference device and port names instead of IDs
- Enter my dodgy world of hacks below …
When do IDs change for me?
IDs change for me when a device has been offline for an extended period (weeks): eventually its interfaces disappear, and when it comes back online the next discovery creates them all again. I also see it on high-latency/lossy links when discovery gets interrupted - I’ll get random interface ID changes on some devices. In a previous life monitoring ISP cores, DCs and other well-connected things, I never saw this.
Ideally the entire discovery process would have some transactional roll-back function if it doesn’t complete - but given the amount of stuff it does and all the differing modules and code involved, I’d imagine that would be an incredibly difficult task and a major rewrite.
In previous attempts to fix weathermaps, I tried to find the old and new IDs in the logs or database somewhere, but never got very far, as the system deletes the old interface during discovery and then creates a new one later. I think the information is in the event log for each device, though.
I’m always trying to find a way to fix it properly, but currently I’m mostly just patching around it with all my dodgy scripting!
Hacky method option:
If I had to solve this for alerts and referencing by device/port name was somehow not possible, I’d maybe put an interface/device signifier in the rule’s title/description and then do something like my 2nd link above for weathermaps: use the database or API to parse the rules and check whether each ID still exists. If not, resolve it again from the title/description information to get the new ID and update the rule.
Not pretty, but if it’s important it’s probably worth the time, and it’s going to get you a solution before the discovery process evolves to prevent this.
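To make the hacky option a bit more concrete, here is a minimal Python sketch of the check-and-re-resolve step. Everything in it is an assumption for illustration only: the `[port:hostname:ifName]` tag in the rule title, the `refresh_rule_port_id` helper, and the plain dicts standing in for alert rules and ports. How you actually fetch the rules and ports and write the result back (database or API) is left out.

```python
import re

# Hypothetical tag convention placed in a rule's title,
# e.g. "Uplink down [port:edge-rtr1:ge-0/0/1]"
TAG_RE = re.compile(r"\[port:(?P<host>[^:\]]+):(?P<ifname>[^\]]+)\]")


def refresh_rule_port_id(rule, ports):
    """Return the rule with a valid port_id, re-resolving it from the title tag if stale.

    rule  -- dict with 'title' and 'port_id' (stand-in for a stored alert rule)
    ports -- list of dicts with 'port_id', 'hostname', 'ifName'
             (stand-in for the current ports, fetched however you like)
    """
    live_ids = {p["port_id"] for p in ports}
    if rule["port_id"] in live_ids:
        return rule  # the ID still exists, nothing to do

    match = TAG_RE.search(rule["title"])
    if match is None:
        raise ValueError(f"no port tag in rule title: {rule['title']!r}")

    # The ID is gone: look the port up again by device and interface name
    for p in ports:
        if p["hostname"] == match["host"] and p["ifName"] == match["ifname"]:
            return {**rule, "port_id": p["port_id"]}

    raise LookupError(f"port {match['host']}:{match['ifname']} not found")


# Example: the port was re-created by discovery and now has ID 912 instead of 455
ports = [{"port_id": 912, "hostname": "edge-rtr1", "ifName": "ge-0/0/1"}]
rule = {"title": "Uplink down [port:edge-rtr1:ge-0/0/1]", "port_id": 455}
print(refresh_rule_port_id(rule, ports)["port_id"])  # prints 912
```

Run from cron after discovery, something along these lines would only touch rules whose ID has actually vanished, and it fails loudly if the tagged port cannot be found rather than guessing.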