vethXXXXX interfaces are not removed when lxc container is stopped

Hello! I am observing some strange behaviour in LXC container networking. I use a custom bridge device (br0) to provide the network connection, configured as follows:

    lxc.net.0.type = veth
    lxc.net.0.name = eth0
    lxc.net.0.hwaddr = 00:16:3e:6e:13:a9
    lxc.net.0.link = br0
    lxc.net.0.ipv4.address = 10.194.15.7/16
    lxc.net.0.ipv4.gateway = 10.194.99.70
    lxc.net.0.ipv6.address = 2001:db8:0:300::15:7/64
    lxc.net.0.ipv6.gateway = 2001:db8:0:300::70
    lxc.net.0.flags = up

When the container is started, a vethXXXXXX interface is created and added to the br0 bridge.
So far so good.
When I stop the container, this interface is not deleted, and on the next startup a new one is created. After that, weird things start to happen: duplicate IPv6 addresses are reported, and when the container is stopped its IPv6 address can still be pinged, though nmap does not discover any open port on it. Connectivity to the container via IPv6 is lost after such a restart.
When I manually identify and remove all the leftover interfaces (using the brctl delif and ip link del commands), the container starts working properly again.
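
For reference, the manual cleanup looks roughly like this (vethXXXXXX stands for whichever leftover device brctl show lists; the exact names differ on every run):

    # List the interfaces still attached to the bridge and spot the leftovers
    brctl show br0
    # Detach the stale interface from the bridge, then delete it
    # (deleting either end of a veth pair removes both ends)
    brctl delif br0 vethXXXXXX
    ip link del vethXXXXXX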

I am using LXC version 3.1.0+really3.0.3-8 from Debian sid.

Best regards,

That sounds like a kernel bug.

Normally, when the last process in a container dies, the kernel destroys all of its namespaces, including the network namespace, which contains the container-side interface of the veth pair used to give your container connectivity.

In your case, something is keeping that network namespace alive, which in turn keeps the veth device in the container active, including its IP address.
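
If a process is what is pinning the namespace, something like this (lsns from util-linux) should show it; a namespace held open only by a file descriptor or bind mount won't appear there, though:

    # Show network namespaces that still have at least one process in them
    lsns --type net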

Unfortunately, deleting the host-side device just papers over the issue: you're still leaking kernel resources in the background, with little you can really do to track down and fix this.

We have planned kernel work which should make it easier to identify such issues in the future.

OK, it seems I have the same problem.

I'm having the same issue on the latest LXD on Arch Linux. I've stopped all the containers but still have a ton of veth* interfaces left over from stopped (and even removed) containers.

What LXD version? What does the instance config show via lxc config show <instance> --expanded?
How are you stopping the instance?

… and which kernel version do you have? Could you also provide us with the ps aux output from the host?

I'm sorry, I cleared all the dangling interfaces manually and can no longer reproduce it for now. I will post any required info when it happens again. As I mentioned, I use Arch Linux with the newest versions of the kernel (6.3.2) and LXD (5.13).

I was having a similar issue. After chasing down a number of dead ends, I found that LXC kept leaving the old devices lying around, and worse, I couldn't stop NetworkManager from picking up the veth device and managing it (which then stuffed up networking for all the other containers).

I kept seeing this entry in syslog:

lxd.daemon[2941]: time="...." level=error msg="Failed to stop device" device=eth0 err="Failed clearing netprio rules for instance \"default\" in project \"nickg-test-platform-test-admin\": device name is empty" instance=nickg-test-platform-test-admin instanceType=container project=default

After this, NetworkManager would start managing the device, and things went downhill from there.

It would be nice if LXD could actually delete these and clean them up, but in the meantime I've given up and added some config to tell NetworkManager to leave the devices alone:

[keyfile]
unmanaged-devices=interface-name:veth*
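
(That snippet goes in a drop-in file under /etc/NetworkManager/conf.d/ — the name is up to you, e.g. a hypothetical 99-unmanage-veth.conf — followed by an nmcli general reload or a NetworkManager restart.)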

I then periodically run a script to clean up all the old veth devices that lxc has left lying behind.
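
It's essentially a sketch along these lines (my assumption being that a leftover veth's operstate is down, while anything belonging to a running container is up — double-check that on your own host before trusting it):

    #!/bin/sh
    # Delete host-side veth devices that look like leftovers from stopped
    # containers (operstate "down"). A veth attached to a running container
    # is normally up, so those are left alone.
    for dev in $(ip -o link show type veth | awk -F': ' '{print $2}' | cut -d@ -f1); do
        if [ "$(cat /sys/class/net/$dev/operstate 2>/dev/null)" = "down" ]; then
            echo "Removing leftover $dev"
            ip link del "$dev"
        fi
    done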

Please can you get the output of lxc config show <instance> --expanded just before you shut down the instance, when you get that error.

This is the instance config…

architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20231029_07:42)
  image.name: ubuntu-jammy-amd64-default-20231029_07:42
  image.os: ubuntu
  image.release: jammy
  image.serial: "20231029_07:42"
  image.variant: default
  security.nesting: "true"
  security.privileged: "true"
  user.keypair_name: nickg-test-platform-test
  user.runner_name: platform-test
  user.runner_state: builder
  volatile.apply_template: create
  volatile.base_image: 066eabae579418811170141ac97cacb3587ce88b947f5c217f5e035b530e57e9
  volatile.cloud-init.instance-id: 54327f9b-7c88-4a9e-acaf-6ed4987314c5
  volatile.eth0.hwaddr: 00:16:3e:4d:2c:e4
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.uuid: 17025a9f-6fcc-44f0-b0ed-9a4d4f60906d
  volatile.uuid.generation: 17025a9f-6fcc-44f0-b0ed-9a4d4f60906d
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: docker
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

And here’s what’s in syslog right after the container is stopped:

Nov  1 09:45:01 nickg-lp CRON[1471700]: (root) CMD (/usr/local/bin/lldp2facts)
Nov  1 09:45:03 nickg-lp systemd[3457]: Started snap.lxd.lxc-42ad6bef-0966-42d5-9644-2b33bd44563a.scope.
Nov  1 09:45:03 nickg-lp systemd[3457]: snap.lxd.lxc-42ad6bef-0966-42d5-9644-2b33bd44563a.scope: Succeeded.
Nov  1 09:45:47 nickg-lp kernel: [56810.622022] physkmbr73: renamed from eth0
Nov  1 09:45:47 nickg-lp NetworkManager[674263]: <info>  [1698785147.8496] manager: (eth0): new Veth device (/org/freedesktop/NetworkManager/Devices/125)
Nov  1 09:45:47 nickg-lp NetworkManager[674263]: <info>  [1698785147.8563] device (eth0): interface index 256 renamed iface from 'eth0' to 'physkmbr73'
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: could not get ethtool features for eth0
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: Could not set offload features of eth0: No such device
Nov  1 09:45:47 nickg-lp NetworkManager[674263]: <info>  [1698785147.8778] device (physkmbr73): interface index 256 renamed iface from 'physkmbr73' to 'vetha3a93bb0'
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov  1 09:45:47 nickg-lp lxd.daemon[2941]: time="2023-11-01T09:45:47+13:00" level=error msg="Failed to stop device" device=eth0 err="Failed clearing netprio rules for instance \"default\" in project \"nickg-test-platform-test-image-build\": device name is empty" instance=nickg-test-platform-test-image-build instanceType=container project=default
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: could not get ethtool features for eth0
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: Could not set offload features of eth0: No such device
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: could not get ethtool features for physkmbr73
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: Could not set offload features of physkmbr73: No such device
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov  1 09:45:47 nickg-lp systemd-udevd[1472493]: Using default interface naming scheme 'v245'.
Nov  1 09:45:48 nickg-lp kernel: [56811.507470] kauditd_printk_skb: 1 callbacks suppressed
Nov  1 09:45:48 nickg-lp kernel: [56811.507473] audit: type=1400 audit(1698785148.704:2487): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="lxd-nickg-test-platform-test-image-build_</var/snap/lxd/common/lxd>" pid=1472560 comm="apparmor_parser"

Note that I've told NetworkManager to leave all veth devices alone; without that config change, I'd see DHCP starting up, routes being set up, etc.

Also, here are the two relevant spammy entries I see in ip addr:

256: vetha3a93bb0@veth37f4aa29: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:16:3e:4d:2c:e4 brd ff:ff:ff:ff:ff:ff
257: veth37f4aa29@vetha3a93bb0: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue master lxdbr0 state LOWERLAYERDOWN group default qlen 1000
    link/ether aa:1e:ae:ea:f0:75 brd ff:ff:ff:ff:ff:ff

And here’s my lxd bridge:

94: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:9b:14:aa brd ff:ff:ff:ff:ff:ff
    inet 192.168.110.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever

It's strange you don't see a volatile.eth0.host_name setting, as this records the host-side name of the veth interface used at start time. This is then used as part of the cleanup, and if it's missing, as it is in this case, you will get cleanup issues and errors.

Can you launch a fresh instance and see if that volatile key appears, and whether it gets removed before you stop the instance (it should only be removed during the instance stop process)?
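
A rough way to watch it, assuming lxc config get can read the key as described above:

    lxc start <instance>
    lxc config get <instance> volatile.eth0.host_name   # expect the host-side veth name while running
    lxc stop <instance>
    lxc config get <instance> volatile.eth0.host_name   # expect empty after a clean stop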

I've just launched a fresh one; there's a hardware address, but no host_name:

architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20231029_07:42)
  image.metadata-image_name: nickg-test-platform-test-1698785148
  image.metadata-runner_image: ubuntu-22.04
  image.name: ubuntu-jammy-amd64-default-20231029_07:42
  image.os: ubuntu
  image.release: jammy
  image.serial: "20231029_07:42"
  image.variant: default
  security.nesting: "true"
  security.privileged: "true"
  user.keypair_name: nickg-test-platform-test
  user.runner_name: platform-test
  user.runner_owner: admin
  user.runner_state: registered
  volatile.apply_template: create
  volatile.base_image: ebc56e46d936ea922828a649c5e1363bbd972cb364462452f44b45a41d1c6d80
  volatile.cloud-init.instance-id: 23e8352e-4b6e-48c6-9a19-375abbe4ea6c
  volatile.eth0.hwaddr: 00:16:3e:f2:e8:f6
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.uuid: dd04fdc4-fa8f-4708-9b6d-6cabfddb59ed
  volatile.uuid.generation: dd04fdc4-fa8f-4708-9b6d-6cabfddb59ed
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: docker
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Ah, and more info: they don't get assigned an IP address when they are restarted. I'll update this post shortly with some more details…

Curiously, this bug isn't happening for my colleague after they switched from LXD to Incus…

$ incus version
Client version: 0.2
Server version: 0.2

And for me:

$ lxd --version
5.19

Thanks. Please could you open an issue for this over at Issues · canonical/lxd · GitHub with the info you have posted here, so we can keep track of it.

I could reproduce the issue on openSUSE Leap 15.6 using Incus 6.5: my container failed to start (due to a setuid misconfiguration), and I stopped Incus to fix the misconfiguration, but the network interface stayed there. Now that Incus has forked from LXD, I wonder if we should recreate the issue on the Incus GitHub to keep track of it.

Hi!

Yes, you should file a new issue.

(I just saw your comment on the other thread.)