Incus, Open vSwitch and "dangling" veth interfaces

My problem is similar to this issue, but only in symptoms, I believe.

Running on Ubuntu 23.04, with kernel 6.1.0-1027-oem.

I’ve got a pretty convoluted setup, so please bear with me, especially because it’s been a while since I created it (mostly by trial and error).

My network uses VLANs (5, 10, 20, 30) and I wanted to be able to create containers connected to some of these. The solution I found at the time was to create an Open vSwitch bridge (called ovs-br0) along with “fake” bridges (vlan-5, vlan-10, vlan-20, etc.) for each VLAN, as described here.
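For reference, the gist of that setup, reconstructed from memory (the fake bridges use the parent/VLAN form of add-br; enp89s0 is my trunk uplink):

# ovs-vsctl add-br ovs-br0
# ovs-vsctl add-port ovs-br0 enp89s0 trunks=5,10,20,30
# ovs-vsctl add-br vlan-20 ovs-br0 20

with one such add-br per VLAN.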

I then created profiles for each VLAN and used them to launch instances with something like:

lxc launch images:debian/sid test --profile=vlan-dmz

where the vlan-dmz profile looked like:

config: {}
description: ""
devices:
  eth0:
    nictype: bridged
    parent: vlan-20
    type: nic
  root:
    path: /
    pool: zfs-storage
    type: disk
name: vlan-dmz
used_by:
- instances here...

Today I migrated to Incus (0.5.1), and the first showstopper was that I couldn’t launch anything anymore because of:

Error: Failed to start device "eth0": object not found

The way I worked around this was to edit the profiles and set eth0 to have ovs-br0 as its parent, along with vlan: XX, so in Incus the profile looks like:

config: {}
description: ""
devices:
  eth0:
    nictype: bridged
    parent: ovs-br0
    type: nic
    vlan: "20"
  root:
    path: /
    pool: zfs-storage
    type: disk
name: vlan-dmz
used_by:
- ...instances here
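Equivalently, instead of hand-editing with incus profile edit, the same change can be applied with something like:

incus profile device set vlan-dmz eth0 parent=ovs-br0 vlan=20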

Which leads me to the current issue: each time I start and then stop a container, it “leaks” a pair of veth interfaces, which are left behind on the host.
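A quick way to watch them pile up is to list the veth devices on the host; each leaked pair shows up as two @-joined entries:

# ip -brief link show type veth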

This is what my OVS configuration looks like before creating a container:

# ovs-vsctl show 
ec8a772c-2adf-4386-bd2b-4147badaa56b
    Bridge ovs-br0
        Port vlan-30
            tag: 30
            Interface vlan-30
                type: internal
        Port veth8e26cd6f
            tag: 20
            Interface veth8e26cd6f
        Port veth-rproxy
            tag: 20
            Interface veth-rproxy
        Port veth-mosquitto
            tag: 20
            Interface veth-mosquitto
        Port veth-saltmaster
            tag: 20
            Interface veth-saltmaster
        Port vlan-5
            tag: 5
            Interface vlan-5
                type: internal
        Port tap097e4528
            tag: 10
            Interface tap097e4528
        Port veth-pi-hole
            tag: 20
            Interface veth-pi-hole
        Port vlan-10
            tag: 10
            Interface vlan-10
                type: internal
        Port veth-netbox
            tag: 20
            Interface veth-netbox
        Port veth468f15cd
            tag: 20
            Interface veth468f15cd
        Port veth-has
            tag: 20
            Interface veth-has
        Port enp89s0
            trunks: [5, 10, 20, 30]
            Interface enp89s0
        Port veth-monitoring
            tag: 5
            Interface veth-monitoring
        Port ovs-br0
            Interface ovs-br0
                type: internal
        Port vethd2b911b5
            tag: 20
            Interface vethd2b911b5
        Port tap65c855e2
            tag: 30
            Interface tap65c855e2
        Port veth-node-red
            tag: 20
            Interface veth-node-red
        Port veth57a696d0
            tag: 10
            Interface veth57a696d0
        Port vlan-20
            tag: 20
            Interface vlan-20
                type: internal
    ovs_version: "3.1.3"

Launch a new one, using the vlan-dmz profile:

# incus launch images:debian/sid test --profile=vlan-dmz
# incus config show --expanded test
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Debian sid amd64 (20240218_05:24)
  image.os: Debian
  image.release: sid
  image.serial: "20240218_05:24"
  image.type: squashfs
  image.variant: default
  volatile.base_image: e3dc089405783600d984a5c8766fe0f683a4360a63191b9d452f5d19944ac447
  volatile.cloud-init.instance-id: 48b99ae6-878c-431d-aa1a-3db0c91e8e62
  volatile.eth0.host_name: vethbc559be1
  volatile.eth0.hwaddr: 00:16:3e:2d:ae:44
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 20d3bae5-806e-4223-b93a-1dc7bd7d4815
  volatile.uuid.generation: 20d3bae5-806e-4223-b93a-1dc7bd7d4815
devices:
  eth0:
    nictype: bridged
    parent: ovs-br0
    type: nic
    vlan: "20"
  root:
    path: /
    pool: zfs-storage
    type: disk
ephemeral: false
profiles:
- vlan-dmz
stateful: false
description: ""

and the OVS configuration after starting the container:

# ovs-vsctl show 
ec8a772c-2adf-4386-bd2b-4147badaa56b
    Bridge ovs-br0
        Port vlan-30
            tag: 30
            Interface vlan-30
                type: internal
        Port veth8e26cd6f
            tag: 20
            Interface veth8e26cd6f
        Port veth-rproxy
            tag: 20
            Interface veth-rproxy
        Port veth-mosquitto
            tag: 20
            Interface veth-mosquitto
        Port veth-saltmaster
            tag: 20
            Interface veth-saltmaster
        Port vlan-5
            tag: 5
            Interface vlan-5
                type: internal
        Port tap097e4528
            tag: 10
            Interface tap097e4528
        Port veth-pi-hole
            tag: 20
            Interface veth-pi-hole
        Port vlan-10
            tag: 10
            Interface vlan-10
                type: internal
        Port vethbc559be1
            tag: 20
            Interface vethbc559be1
        Port veth-netbox
            tag: 20
            Interface veth-netbox
        Port veth468f15cd
            tag: 20
            Interface veth468f15cd
        Port veth-has
            tag: 20
            Interface veth-has
        Port enp89s0
            trunks: [5, 10, 20, 30]
            Interface enp89s0
        Port veth-monitoring
            tag: 5
            Interface veth-monitoring
        Port ovs-br0
            Interface ovs-br0
                type: internal
        Port vethd2b911b5
            tag: 20
            Interface vethd2b911b5
        Port tap65c855e2
            tag: 30
            Interface tap65c855e2
        Port veth-node-red
            tag: 20
            Interface veth-node-red
        Port veth57a696d0
            tag: 10
            Interface veth57a696d0
        Port vlan-20
            tag: 20
            Interface vlan-20
                type: internal
    ovs_version: "3.1.3"

When I stop the container, there are two veth interfaces left lying around:

# incus stop test

# ip link sh | grep vethbc559be1
35: veth394b2530@vethbc559be1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
36: vethbc559be1@veth394b2530: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue master ovs-system state LOWERLAYERDOWN mode DEFAULT group default qlen 1000

In the journal, at stop time, I get:

Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.                   
Feb 18 22:52:59 flatgate kernel: physQb7GSQ: renamed from eth0                                                                                                                                            
Feb 18 22:52:59 flatgate systemd-networkd[1019]: eth0: Interface name change detected, renamed to physQb7GSQ.
Feb 18 22:52:59 flatgate systemd-networkd[1019]: vethbc559be1: Lost carrier
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate kernel: veth394b2530: renamed from physQb7GSQ
Feb 18 22:52:59 flatgate systemd-networkd[1019]: physQb7GSQ: Interface name change detected, renamed to veth394b2530.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Contacted time server 185.125.190.57:123 (ntp.ubuntu.com).
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate systemd-timesyncd[1038]: Network configuration changed, trying to establish connection.
Feb 18 22:52:59 flatgate ovs-vsctl[175504]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --if-exists del-port ovs-br0 vethbc559be1
Feb 18 22:52:59 flatgate ovs-vsctl[175504]: ovs|00002|db_ctl_base|ERR|bridge ovs-br0 does not have a port vethbc559be1 (although its child bridge vlan-20 does)
Feb 18 22:52:59 flatgate incusd[1907]: time="2024-02-18T22:52:59+02:00" level=error msg="Failed to stop device" device=eth0 err="Failed to detach interface \"vethbc559be1\" from \"ovs-br0\": Failed to run: ovs-vsctl --if-exists del-port ovs-br0 vethbc559be1: exit status 1 (ovs-vsctl: bridge ovs-br0 does not have a port vethbc559be1 (although its child bridge vlan-20 does))" instance=test instanceType=container project=default

It seems that the issued command is incorrect for my use case; it should have been ovs-vsctl del-port vethbc559be1. But even after running that, the two dangling veths are still there.
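For completeness, either of these forms clears the stale port record, since del-port can search all bridges itself or be pointed at the fake bridge directly:

# ovs-vsctl --if-exists del-port vethbc559be1
# ovs-vsctl --if-exists del-port vlan-20 vethbc559be1

Neither, of course, removes the kernel veth devices themselves.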

And it seems that with each start/stop cycle of the container, another pair of veths is leaked.
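What does get rid of the leftover netdevs is deleting the pair with plain ip(8); removing either end should tear down both, e.g. for the pair above:

# ip link delete vethbc559be1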

Things become more interesting if I try to set the host_name for the interface:

incus config device override test eth0 host_name="veth-blaaa"

The second time I try to start the test container, because of the now-leaked veth-blaaa interface, I get:

# incus start test
Error: Failed to start device "eth0": Failed to create the veth interfaces "veth-blaaa" and "veth69a75fbf": Failed adding link: Failed to run: ip link add name veth-blaaa mtu 1500 txqueuelen 1000 up type veth peer name veth69a75fbf mtu 1500 address 00:16:3e:2d:ae:44 txqueuelen 1000: exit status 2 (RTNETLINK answers: File exists)

So, maybe I misconfigured things, maybe it’s a bug, I don’t know… Opinions?


I encountered the same issue; the major difference on my side is that I was using:

devices:
  eth0:
    host_name: development
    ipv4.address: 192.168.25.2
    network: incusbr0
    type: nic
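For context, a NIC device like that can be added with roughly the following (the instance name here is a placeholder):

incus config device add my-instance eth0 nic network=incusbr0 host_name=development ipv4.address=192.168.25.2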

I’m on Incus client/server version 6.5, compiled for openSUSE Leap 15.6 (kernel 6.4.0-150600.23.17-default).

Since I have a defined host_name, I encountered the same error:

Error: Failed to start device "eth0": Failed to create the veth interfaces "development" and "vethfb5e2fdd": Failed adding link: Failed to run: ip link add name development mtu 1500 txqueuelen 1000 up type veth peer name vethfb5e2fdd mtu 1500 address 00:16:3e:18:06:e7 txqueuelen 1000: exit status 2 (RTNETLINK answers: File exists)

As a workaround, I manually removed the interface before starting my container:

sudo ip link delete 'development' type veth

Basically, judging by the issue @lcosmin linked, it sounds like it’s the same bug, and per Stéphane Graber’s comment, removing the dangling interfaces is just a workaround.