I’ve had this problem as long as I can remember. Now using Ubuntu 22.04 Beta and LXD 5.
I pass through all my physical NICs to an OpenWrt container. This works fine the first time I boot the host. But when I try to restart the OpenWrt container with “lxc restart”, some of the parent NICs get renamed to something seemingly random like phys****** and lxc fails to start because the parent NIC does not exist.
The phys****** adapter does have the correct MAC and has a property “altname” which does have the real interface name.
Would it be possible to pass through the NIC by MAC address instead of by interface name? Or is there anything else I could do to stop this behavior? I tried disabling Predictable Network Interface Names, but then the NICs get renamed to phys****** during host boot and the container won’t start even once.
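To illustrate the idea of identifying a NIC by MAC rather than by name, here is a minimal sketch that looks up whatever name an interface currently has on the host from its MAC address. It assumes an iproute2 version whose `ip -j link show` emits JSON with `ifname` and `address` fields; the function names and the sample data (including the `physAbC123` name and the MAC addresses) are made up for illustration. It only finds the current name and does not change how LXD selects the parent.

```python
import json
import subprocess


def find_ifname_by_mac(links, mac):
    """Return the current name of the first link whose MAC matches, else None."""
    wanted = mac.lower()
    for link in links:
        # Some link types have no "address" field, so default to "".
        if link.get("address", "").lower() == wanted:
            return link["ifname"]
    return None


def read_host_links():
    """Read the host's links as parsed JSON (assumes iproute2 with -j support)."""
    out = subprocess.run(
        ["ip", "-j", "link", "show"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Hypothetical sample shaped like `ip -j link show` output.
sample_links = [
    {"ifname": "br0", "address": "02:00:00:00:00:01"},
    {"ifname": "physAbC123", "address": "02:00:00:00:00:02",
     "altnames": ["enp4s0f0"]},
]

current_name = find_ifname_by_mac(sample_links, "02:00:00:00:00:02")
```

On a real host you would pass `read_host_links()` instead of `sample_links`. Note that a bridge can share a MAC with one of its ports, so the first match is not guaranteed to be the physical NIC.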
Is it always the same parent NICs that get renamed?
Are there any conflicting interfaces on the parent when the container gets restarted?
Please can you show the output of lxc config show <instance> --expanded along with the output of ip a before and after the container has been started and then restarted.
I think it’s mostly the same one, but it’s part of a 4-port Ethernet card, so I’m not sure it matters. I don’t think there should be any conflicts: when everything is working as it should, the host has only one interface visible, br0, and everything physical goes to the container.
I will get back to you with logs when I get a chance, it will take some doing because of my config.
I should add that this does not happen every time, but I’ve started to just reboot the whole host when needed.
So I had a look at the liblxc source code and found this:
I wonder if the NIC is clashing with another NIC inside the container and then not being renamed, so that when it’s moved back LXD doesn’t recognise it and can’t rename it back to its original name.
Name: router
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2021/11/03 23:19 EET
Last Used: 2022/04/08 12:35 EEST
Log:
lxc router 20220408101124.376 WARN network - network.c:lxc_delete_network_priv:3617 - Failed to rename interface with index 2 from "eth3" to its initial name "enp4s0f0"
lxc router 20220408101124.379 WARN network - network.c:lxc_delete_network_priv:3617 - Failed to rename interface with index 3 from "eth4" to its initial name "enp4s0f1"
When the container is stopped, LXC will move the network device back to the host. In order to do that it will use a “transient” name which it used during interface creation. It’s basically a low-effort way to avoid name collisions on the host when moving back a network device that usually has a high-collision-probability name such as “eth0” in the container.
In the final step it is renamed from the transient name to its original name on the host. Since the rename step fails after the device has been moved back, it is fairly likely that this is a naming collision, i.e. its original host name has been taken by another device.
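The two steps described above can be pictured with a toy model. This is not liblxc’s code, just a sketch of the logic under the assumption that the final rename fails whenever the original name is already present on the host; the function name and the interface names in the example are hypothetical.

```python
def move_back_to_host(host_names, transient_name, original_name):
    """Toy model of returning a NIC from the container to the host.

    host_names is the set of interface names already present on the host.
    Returns the name the NIC ends up with.
    """
    # Step 1: the device arrives on the host under its transient name,
    # which is chosen to be unlikely to collide with anything.
    host_names.add(transient_name)

    # Step 2: rename it to its original host name. If that name is
    # already taken, the rename fails and the NIC keeps the transient name.
    if original_name in host_names:
        return transient_name

    host_names.remove(transient_name)
    host_names.add(original_name)
    return original_name


# No collision: the NIC gets its original name back.
clean_host = {"br0"}
ok = move_back_to_host(clean_host, "physAbC123", "enp4s0f0")

# Collision: something on the host already owns the original name.
busy_host = {"br0", "enp4s0f0"}
stranded = move_back_to_host(busy_host, "physAbC123", "enp4s0f0")
```

In the second case the NIC stays under its transient name, which would match the symptom of the container failing to start because the configured parent no longer exists.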
I guess I can test later, but I will always have multiple nics in the container, it is a router/switch after all.
That collision theory seems probable. If I disable Predictable Network Interface Names, the host NICs stay as eth0 etc. instead of enp*, and then the container won’t start at all.
Perhaps I could try renaming the container NICs to eth01 etc. to avoid the collision.
Ok, I did test this by removing all but one physical NIC from the container and adding them back one by one. The problems start when adding the third one of four.
No idea. I have Ubuntu Server with minimal extra packages. Predictable Network Interface Names does this on boot of course, but like I said before, if I disable that, the LXC container won’t start even once, and I have those phys*** NICs listed after it has tried.
I’m using only systemd-networkd with only br0 configured if that makes any difference. And I compile lxd from source, I don’t have snap installed.
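As a possible stopgap instead of rebooting the host, the stranded phys*** NICs could in principle be renamed by hand using the altname they still carry. The sketch below only builds the `ip` command lines and does not run them. It is untested against this setup and rests on several assumptions: that `ip -j link show` reports an `altnames` list, that the first altname is the original name, and that the altname has to be removed before the same string can be used as the primary name. The function name and sample data are hypothetical.

```python
def restore_commands(links, prefix="phys"):
    """Build ip commands to rename stranded NICs back to their first altname."""
    commands = []
    for link in links:
        name = link["ifname"]
        altnames = link.get("altnames", [])
        # Only touch interfaces that look stranded and still carry an altname.
        if not name.startswith(prefix) or not altnames:
            continue
        original = altnames[0]  # assumption: first altname is the real name
        # Take the link down first, since renaming an up interface may be refused.
        commands.append(["ip", "link", "set", "dev", name, "down"])
        # Assumption: free the altname so it does not clash with the new name.
        commands.append(["ip", "link", "property", "del", "dev", name,
                         "altname", original])
        commands.append(["ip", "link", "set", "dev", name, "name", original])
    return commands


# Hypothetical sample shaped like `ip -j link show` output.
sample_links = [
    {"ifname": "br0", "address": "02:00:00:00:00:01"},
    {"ifname": "physAbC123", "address": "02:00:00:00:00:02",
     "altnames": ["enp4s0f0"]},
]

for cmd in restore_commands(sample_links):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running anything as root.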