Do port IDs change for the same device in Libre?

I am almost certain that I was looking at a specific interface on a switch and the port ID was 1974, then I went back (a day later) to check another setting and noticed that the port ID was 2010. I wrote down 1974 the day before so I was either in the wrong switch (possible, I guess) or the port ID changed.

I don’t know how to look up a port ID, is there a way to list all the port IDs and which device they are tied to?

Obviously this is a bit concerning (if they change) because if I’m monitoring a port via port ID and it changes, I may not get an alert for that port being down.

Thanks.

I wrote about this a while back here

I then wrote about how heavily it impacts my extensive weathermap collection (my configs directory is almost 1 megabyte of plain text), and I made a lazy script to fix them when IDs change here

The other place it plagues me is on aggregate dashboard port graph widgets, as they store the interfaces by ID in the database.

I’d never thought of the alert issue if referencing an ID, I don’t have any alerts like that. Possible solutions:

  1. Rework your alerts to reference device and port names instead of IDs
  2. Enter my dodgy world of hacks below …

When do IDs change for me?
They change for me if a device has been offline for an extended period (weeks): eventually its interfaces disappear, and when it comes back online the next discovery creates them all again. I also see it on highly latent/lossy links when discovery gets interrupted, where I’ll get random interface ID changes on some devices. In a previous life monitoring ISP cores, DCs and well-connected things I never saw this.

Ideally the entire discovery process would have some transactional roll-back function if it doesn’t complete - but the amount of stuff it is doing and all the various differing modules and code involved would make that an incredibly difficult task and major rewrite I’d imagine.

In previous attempts to fix weathermaps, I tried to find the old and new IDs in the logs or database somewhere, but never got very far, as the system deletes the old interface during discovery and then creates a new one later. I think the information is in the event log for each device, though.

I’m always trying to find a way to fix it, but currently just patching around it with all my dodgy scripting mostly!

Hacky method option:
If I had to solve this alert problem and device/port name referencing was somehow not possible, I’d maybe put an interface/device signifier in the title/description, then do something like my 2nd link above regarding weathermaps: use the database or API to parse the rules and check whether the ID still exists and, if not, resolve it again from the title/description information to get the new ID and update the rule.

Not pretty, but if it’s important it’s probably worth the time and going to get you a solution before the discovery process evolves to prevent this.
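
As a rough illustration of the "check whether the ID still exists, otherwise resolve it again" idea, here is a minimal Python sketch. It is not taken from the linked script. The `PortRef` type, the `stale_refs` helper, and the interface names are hypothetical, and the `current` mapping is assumed to have been built beforehand from the database or API.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class PortRef(NamedTuple):
    """What a rule or map remembers about a port (hypothetical structure)."""
    hostname: str
    if_name: str
    port_id: int  # ID recorded when the rule/map was written


def stale_refs(
    refs: Dict[str, PortRef],
    current: Dict[Tuple[str, str], int],
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Return {reference name: (old_id, new_id)} for references whose ID changed.

    new_id is None when the port is not currently known at all.
    """
    changes = {}
    for name, ref in refs.items():
        new_id = current.get((ref.hostname, ref.if_name))
        if new_id != ref.port_id:
            changes[name] = (ref.port_id, new_id)
    return changes


# Made-up data: one port was re-created with a new ID, one is unchanged.
refs = {
    "uplink-a": PortRef("MY_DEVICE_1", "Gi0/1", 1974),
    "uplink-b": PortRef("MY_DEVICE_2", "Gi0/2", 305),
}
current = {
    ("MY_DEVICE_1", "Gi0/1"): 2010,
    ("MY_DEVICE_2", "Gi0/2"): 305,
}
changes = stale_refs(refs, current)
print(changes)
```

Anything reported in `changes` would then need its stored ID rewritten by whatever means the rule or map allows; that part is left out here.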

On Cisco I know we used to enable a feature to get ifIndex to persist between reboots etc…

I don’t use Cisco currently, but thought this might be worth checking out
router(config)# snmp-server ifindex persist

@rhinoau

Do you know how I can see all devices and all port IDs (easily)?

Right now, the only way I know how to get the port ID is to click the interface and drill down to a graph and click show RRD graph to see the command showing the port ID in the code/text.

I do it by rolling over the interfaces on the Ports tab and looking at the link shown in the browser status bar at the bottom left (in Chrome).

Or the Database:

MariaDB [librenms]> select d.hostname, p.port_id,p.ifName from ports p, devices d where p.device_id = d.device_id;

… or the API
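
For the API route, a minimal Python sketch might look like the following. It assumes the LibreNMS v0 API route `/api/v0/devices/:hostname/ports` with a `columns` parameter and an `X-Auth-Token` header; check these against the API docs for your version. The base URL and token are placeholders, and `build_ports_request` / `list_ports` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote


def build_ports_request(base_url: str, token: str, hostname: str) -> urllib.request.Request:
    """Build (but do not send) a request listing port_id and ifName for one device."""
    url = (
        f"{base_url.rstrip('/')}/api/v0/devices/"
        f"{quote(hostname, safe='')}/ports?columns=port_id,ifName"
    )
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


def list_ports(base_url: str, token: str, hostname: str) -> list:
    """Send the request and return the list under the assumed 'ports' key."""
    with urllib.request.urlopen(build_ports_request(base_url, token, hostname)) as resp:
        return json.load(resp).get("ports", [])


# Placeholder values; nothing is sent until list_ports() is called.
req = build_ports_request("https://librenms.example.com/", "YOUR_API_TOKEN", "MY_DEVICE_1")
print(req.full_url)
```

Calling `list_ports(...)` against a real install would need an API token created in the LibreNMS web UI.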

Yeah, that helps with some of the random changes when modules get changed/added, overlay/virtual interfaces come and go, etc., but most of my issues are discovery breaking and reindexing things on the LibreNMS side from what I can see, and that affects any vendor.

Hovering over the interfaces and looking at the bottom left…that would have saved me so much time, yesterday. That is very helpful, thanks.

I’m going to try that command in MariaDB, again, thank you.

Not sure about the API suggestion, that is over my head.

It is unfortunate that the port IDs change or can change. Once a device is in libre, it should keep the same port ID. I’m sure there is a good reason that it doesn’t as I’m not a programmer/developer and don’t know much about what is going on behind the scenes. I’m thankful for what I can do with libre.

Right now, my alerts are configured at the device level and for the specific port. I’m only monitoring specific uplink ports.

How can I create an alert or how should I create an alert to avoid using the port ID and use something that will never change/that I have more control over?

The thread I’m linking is how I’m currently creating a ‘port down’ alert.

Then as I was reading about port IDs (which is what I’m currently using) I noticed that they can possibly change.

Thanks.

Wow, that was a lot of lines of data. Can you provide a command that I can use where I simply list the single device (switch IP/hostname) and list those ports with port IDs? That way I can run the command on the few switches I need and can edit the IP/hostname when needed.

Thanks.

Add the hostname you want to filter on the end:

select d.hostname, p.port_id,p.ifName from ports p, devices d where p.device_id = d.device_id and d.hostname = "MY_DEVICE_1";

Or for multiple hosts in one line:

select d.hostname, p.port_id,p.ifName from ports p, devices d where p.device_id = d.device_id and (d.hostname = "MY_DEVICE_1" or d.hostname = "MY_DEVICE_2");
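
If the list of hosts keeps changing, the same lookup can be scripted with a parameterised `IN (...)` list instead of editing the SQL by hand. The sketch below is only an illustration: it uses Python's built-in `sqlite3` with a tiny made-up copy of the two tables as a stand-in for MariaDB, and `ports_for_hosts` is a hypothetical helper. Against a real LibreNMS database you would use a MariaDB/MySQL driver, whose placeholder style is typically `%s` rather than `?`.

```python
import sqlite3


def ports_for_hosts(conn, hostnames):
    """Return (hostname, port_id, ifName) rows for the given hostnames."""
    hostnames = list(hostnames)
    if not hostnames:
        return []  # avoid generating an empty IN () list
    placeholders = ", ".join("?" for _ in hostnames)
    sql = (
        "SELECT d.hostname, p.port_id, p.ifName "
        "FROM ports p JOIN devices d ON p.device_id = d.device_id "
        f"WHERE d.hostname IN ({placeholders}) "
        "ORDER BY d.hostname, p.port_id"
    )
    return conn.execute(sql, hostnames).fetchall()


# In-memory stand-in with only the columns used above; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE devices (device_id INTEGER PRIMARY KEY, hostname TEXT);
CREATE TABLE ports (port_id INTEGER PRIMARY KEY, device_id INTEGER, ifName TEXT);
INSERT INTO devices VALUES (1, 'MY_DEVICE_1'), (2, 'MY_DEVICE_2'), (3, 'OTHER');
INSERT INTO ports VALUES (10, 1, 'Gi0/1'), (11, 2, 'Gi0/1'), (12, 3, 'Gi0/1');
""")
rows = ports_for_hosts(conn, ["MY_DEVICE_1", "MY_DEVICE_2"])
for row in rows:
    print(row)
```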

I had a rule enabled that is referencing macros.port_down and ports.port_id and that is working fine. When I unplug the test device after the amount of time referenced in the rule and taking snmp polling time into consideration, the rule is active and I get the email.

I paused that rule and created a new one with macros.port_down, removing ports.port_id and replacing it with ports.port_desc_desc, which I assume is the port description. I grabbed the port description that is set on my test port and pasted that value into the new rule. I unplugged the device and I have not gotten an email for the alert. Yes, the transports value is correct. It seems that the combo I’m using is not correct, or possibly this switch doesn’t understand/poll whatever librenms is using for ports.port_desc_desc.

Do you have any recommendations/suggestions on how I can get this working with port description? That field won’t change if the switch is re-indexed/discovered/etc and librenms changes the port IDs.

Thanks.

Edit- I see ports.portName I will try that instead of the desc option. I guess another question is…how do I know if I should use portName or port_desc?

Edit 2- ports.portName did not trigger an alert.

Yes, you could use ifDescr (what you see in the UI) or ifName (from the database queries earlier, generally the shorter form); either would work.

I don’t have an interactive way to test currently, but to prove the point, you could use either as below which seems to work for me:

Alert triggered:

All you should really need is this:

You can test on the command line too. You can get the rule ID by hovering over the icon to the left of the rule on the Alert Rules screen. In my case, this tests rule 53 against device 189:

~$ scripts/test-alert.php -r 53 -h 189

You’ll either get No active alert found or Issuing Alert-UID xxxxx/x: with all the transport information.

Thank you, I will try this and update. I’m still not clear on how to figure out which macros work with which ports options. I don’t know if you know from trial and error or if you are able to validate some other way. Regardless, if it works, it looks like it is a better option than port IDs since the port IDs can change.

Thanks for all of the replies, it really is appreciated.

Thanks again, the rule with only the port down macro and ports ifDescr worked. I may go that route instead of using port IDs.

Edit- I tried another test and the rule doesn’t seem to be working. I did not change anything for the rule, which makes this seem very odd. Interesting.

Working on this today and I am able to get the specific data for the specific device (and devices, I tried both). However, I wanted to modify the command to add the port description to make sure the db lines up with what I’m using on the device side. I know what I’m using on the device side, but running the command with the port description added would just confirm that the data I’m using is what the db sees/is using.

your original command

select d.hostname, p.port_id,p.ifName from ports p, devices d where p.device_id = d.device_id and (d.hostname = "MY_DEVICE_1" or d.hostname = "MY_DEVICE_2");

my edited command

select d.hostname, p.port_id, p.ifName, p.ifDescr from ports p, devices d where p.device_id = d.device_id and (d.hostname = "MY_DEVICE_1" or d.hostname = "MY_DEVICE_2");

The edited command did give me an additional column titled ifDescr, but the data I see in that column doesn’t match the custom/user defined description of the port I’m testing. For example, I added the description of ‘librenms test description port’ and the output shows ‘Slot:0 Port: 33 Gigabit - Level’

Is it possible to see the custom/user defined data in that column?

Thanks.

This command does work. If the rule isn’t in an alerted state then I don’t see anything when I press enter; it just takes me to a blank line in the console. If there is an active alert I do see the output ‘Issuing Alert-UID xxxxx/x:’ with the transport information and I receive an email.

Meaning, I don’t see ‘No active alert found’ if there isn’t an active alert.

ifDescr is the hard-coded description from the device/vendor. If you want to match on your user-defined string, you probably want to use ifAlias.
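
To make the difference concrete, here is a small Python sketch using the two values quoted earlier in the thread. The `ifName` value and the `fields_matching` helper are made up for illustration; the point is only that a rule comparing against the user-defined string can match `ifAlias` but not `ifDescr`.

```python
# One port as LibreNMS might store it. ifDescr and ifAlias are the values
# quoted in this thread; the ifName value is a hypothetical placeholder.
port = {
    "ifName": "0/33",
    "ifDescr": "Slot:0 Port: 33 Gigabit - Level",
    "ifAlias": "librenms test description port",
}


def fields_matching(port: dict, wanted: str) -> list:
    """Return the names of the fields whose value equals the wanted string."""
    return [f for f in ("ifName", "ifDescr", "ifAlias") if port.get(f) == wanted]


matches = fields_matching(port, "librenms test description port")
print(matches)
```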

Interesting. I don’t know why that rule worked one time but now is not working. I did modify the rule and added a 1 in the port description field and I updated the port description to reflect the added 1 and I could not get it to work again. I’ll try the Alias option since that’s what the db should be referencing. Very odd that it worked 1 time and nothing after that.

Changing the rule to ifAlias does work with the description I’ve set. I’ve done a few tests and so far it works great. I have no clue why the description worked one time and never again after my first test.

Also, I see what you mean with the ‘No active alert found’ output; I am now seeing that when I test against another rule. Ironically, the rule I was testing was the one that was using Description (now using Alias). Maybe something was broken/not configured properly with that rule, because it didn’t give me output in the console when I referenced that rule number. Since switching the rule to Alias, I now get correct output. Strange…I don’t know what happened, but all seems to be working now.

Thanks.
