Error starting instance with unmanaged bridge about `ovs-vsctl` not being found

I’m trying to use LXD to create an instance with an unmanaged bridge network. But whenever I try to lxc start the instance, I get an error saying that the ovs-vsctl binary can’t be found (which is strange, since I’m not trying to use the OVN stuff at all).

Here’s the exact error:

$ lxc start lxc-nixos
Error: Failed to start device "mylxdbr0": Failed to run: ovs-vsctl --may-exist add-port mylxdbr0 vethb9e5f4d9: exec: "ovs-vsctl": executable file not found in $PATH
Try `lxc info --show-log lxc-nixos` for more info

Trying the suggested log command doesn’t return anything:

$ lxc info --show-log lxc-nixos
Name: lxc-nixos
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/07/17 12:16 EDT
Last Used: 2022/09/06 12:49 EDT

Log:


Nothing shows up in journalctl either.

I’m on NixOS, using LXD 5.5 (and I also tried on an older version of NixOS using LXD 5.1, but got the same error). It is possible that this is not an LXD bug, but that LXD is either not packaged correctly on NixOS, or not being run correctly. If so, I’d like to figure that out and fix it in NixOS.

The image I’m trying to run is NixOS as well, but my guess is that this is not related to the above error.


Here’s the info about my current setup.

The lxc-nixos instance’s configuration:

$ lxc config show lxc-nixos
architecture: x86_64
config:
  image.description: NixOS Quokka 22.05.1711.c06d5fa9c60 x86_64-linux
  image.os: nixos
  image.release: Quokka
  security.nesting: "true"
  volatile.base_image: 3ea75b692d6ea1d6693d35f5cf4229e930284d65540b10c7f11fc8e57aa11508
  volatile.cloud-init.instance-id: 7717057b-8513-4d6a-ad06-9c2c78a61f6d
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":65536}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
  volatile.mylxdbr0.hwaddr: 00:16:3e:1f:f3:78
  volatile.uuid: eb2c5062-878d-48b4-a0c9-ea9e40354e04
devices:
  mylxdbr0:
    name: eth1
    nictype: bridged
    parent: mylxdbr0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""

The important device here is mylxdbr0. My intent is for the instance to get an interface called eth1 that is attached to the mylxdbr0 bridge interface on my host.

The default profile doesn’t have any network devices and is otherwise uninteresting:

$ lxc profile show default
config: {}
description: Default LXD profile
devices:
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- /1.0/instances/lxc-nixos

There are no LXD network devices defined aside from the wifi card in my laptop:

$ lxc network list
+-----------+----------+---------+------+------+-------------+---------+-------+
|   NAME    |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY | STATE |
+-----------+----------+---------+------+------+-------------+---------+-------+
| wlp0s20f3 | physical | NO      |      |      |             | 0       |       |
+-----------+----------+---------+------+------+-------------+---------+-------+

Here’s the mylxdbr0 interface on my host:

$ ifconfig mylxdbr0
mylxdbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.57.1  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 8a:92:d2:06:61:52  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

(I’ve created this interface by defining it in my NixOS configuration, but if necessary I could dig through the source code and pull out which commands are actually being run to create it.)

Here are the routes I’ve set up for the mylxdbr0 interface on my host:

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    600    0        0 wlp0s20f3
10.0.0.0        0.0.0.0         255.255.255.0   U     600    0        0 wlp0s20f3
192.168.57.0    0.0.0.0         255.255.255.0   U     0      0        0 mylxdbr0

Namely the last line there.

This bridge doesn’t show up in the output of bridge link show, which might mean that it has been created incorrectly and is not actually a bridge…?

$ sudo bridge link show
$

My intention is to use this instance to access the internet while NAT’ing it behind the host, so I’ve added the following iptables rules as well (although I doubt they are related to the error I’m seeing above):

iptables -A INPUT -i mylxdbr0 -m comment --comment "my rule for LXD network mylxdbr0" -j ACCEPT
iptables -A FORWARD -o mylxdbr0 -m comment --comment "my rule for LXD network mylxdbr0" -j ACCEPT
iptables -A FORWARD -i mylxdbr0 -m comment --comment "my rule for LXD network mylxdbr0" -j ACCEPT
iptables -A OUTPUT -o mylxdbr0 -m comment --comment "my rule for LXD network mylxdbr0" -j ACCEPT

iptables -t nat -A POSTROUTING -s 192.168.57.0/24 ! -d 192.168.57.0/24 -m comment --comment "my rule for LXD network mylxdbr0" -j MASQUERADE

These were basically just copied from similar rules created by LXD when you allow it to create a managed bridge. I don’t really know what I’m doing here.

I’m happy to add any more debugging information if necessary.

I think I figured out my problem. My “bridge” interface mylxdbr0 was not an actual bridge interface, so LXD didn’t know what to do with it.

I figured this out by somewhat aimlessly clicking around the LXD source code. I found:

This made me think that LXD expects bridges to have a /sys/class/net/%s/bridge directory available. I checked /sys/class/net/mylxdbr0/bridge and it did not exist.
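For anyone who wants to run the same check, here is a minimal sketch of it in shell. It only tests for the bridge directory under sysfs, which is my reading of what LXD looks for, not a copy of LXD's actual logic. The function names and the optional second argument (an alternate sysfs root, handy for testing) are made up for illustration.

```shell
#!/bin/sh
# is_native_bridge IFACE [SYSFS_ROOT]
# Succeeds if IFACE has a "bridge" directory under sysfs, which is what
# a native Linux bridge exposes. SYSFS_ROOT is a hypothetical override
# (default: /sys/class/net) so the check can be tried on a fake tree.
is_native_bridge() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"
    [ -d "$sysfs_root/$iface/bridge" ]
}

# describe_iface IFACE [SYSFS_ROOT]
# Prints a one-line summary of what the check found.
describe_iface() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"
    if is_native_bridge "$iface" "$sysfs_root"; then
        echo "$iface: native Linux bridge"
    elif [ -e "$sysfs_root/$iface" ]; then
        echo "$iface: exists, but is not a native Linux bridge"
    else
        echo "$iface: no such interface"
    fi
}
```

Running describe_iface mylxdbr0 on the host reports which of the three cases applies. In my case the interface existed but had no bridge directory.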


I had created this mylxdbr0 interface with the following NixOS config:

networking.interfaces.mylxdbr0 = {
  name = "mylxdbr0";
  virtual = true;
  useDHCP = false;
  ipv4.addresses = [ { address = "192.168.57.1"; prefixLength = 24; } ];
  ipv4.routes = [ { address = "192.168.57.0"; prefixLength = 24; } ];
};

This causes the interface to get created with a command like:

$ ip tuntap add dev "mylxdbr0" mode "tun" user "root"

This tun interface is apparently not a bridge interface.

What I ended up doing that worked was:

networking.bridges = { mylxdbr0.interfaces = []; };

This causes the interface to get created with a command like the following:

$ ip link add name "mylxdbr0" type bridge

I then had to explicitly add an IP to my interface:

$ sudo ip address add 192.168.57.1/24 dev mylxdbr0
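If you are not on NixOS, the two steps above (create the bridge, then give it an address) can be combined into a small script. This is only a sketch under a few assumptions: it needs root and iproute2, the run and setup_bridge helpers and the DRY_RUN variable are names I made up, and it will fail if an interface with the same name already exists but is not a bridge (like my original tun device).

```shell
#!/bin/sh
# run CMD...: execute CMD, or just print it when DRY_RUN is non-empty.
# DRY_RUN is a hypothetical switch for previewing the commands.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_bridge NAME CIDR
# Creates a native Linux bridge called NAME unless one already exists,
# assigns CIDR to it, and brings the link up.
setup_bridge() {
    name="$1"
    cidr="$2"
    # Only create the bridge if sysfs does not already show one.
    if [ ! -d "/sys/class/net/$name/bridge" ]; then
        run ip link add name "$name" type bridge
    fi
    # "replace" adds the address, or updates it if it is already there.
    run ip address replace "$cidr" dev "$name"
    run ip link set dev "$name" up
}
```

Setting DRY_RUN=1 before calling setup_bridge mylxdbr0 192.168.57.1/24 prints the ip commands without running them, which is a cheap way to check what would happen first.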

This enabled me to successfully run lxc start lxc-nixos:

$ lxc list
+-----------+---------+----------------------+------+-----------+-----------+
|   NAME    |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS |
+-----------+---------+----------------------+------+-----------+-----------+
| lxc-nixos | RUNNING | 192.168.57.50 (eth1) |      | CONTAINER | 0         |
|           |         | 172.17.0.1 (docker0) |      |           |           |
+-----------+---------+----------------------+------+-----------+-----------+

I’m able to communicate between my host and the guest. Now all I need to do is figure out how to set up the NAT correctly so the guest can access the internet.


In case anyone is interested, I put together a post explaining how I set up LXD on NixOS using an unmanaged bridge network interface:
