LXD OVN Cluster fails to create my-ovn network

I am following the tutorial here, but get stuck at the step where I create the my-ovn network. I installed and configured OVN from scratch following this tutorial. When I run the network create command, I get the following error:

lxc network create my-ovn --type=ovn network=UPLINK

Error: Failed to run: ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb ha-chassis-group-add lxd-net16: exit status 1 (2022-09-01T21:01:41Z|00003|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"HA_Chassis_Group\" table to have identical values (lxd-net16) for index on column \"name\".  First row, with UUID 71c74eb8-c217-4786-bb3e-55d02578d57f, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID c9d989e9-c603-4b2d-8662-18803b819d45, was inserted by this transaction.","error":"constraint violation"}
ovn-nbctl: transaction error: {"details":"Transaction causes multiple rows in \"HA_Chassis_Group\" table to have identical values (lxd-net16) for index on column \"name\".  First row, with UUID 71c74eb8-c217-4786-bb3e-55d02578d57f, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID c9d989e9-c603-4b2d-8662-18803b819d45, was inserted by this transaction.","error":"constraint violation"})
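
For context, the lxd-net16 name in the error is derived from LXD's internal numeric ID for the network being created, so the constraint violation means a row named lxd-net16 already exists in the OVN northbound database from an earlier attempt. Assuming a standard LXD install with access to the local database, the ID-to-name mapping can be cross-checked with something like:

lxd sql global "SELECT id, name FROM networks"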

Here are the relevant configs:

lxc network show UPLINK

config:
  dns.nameservers: 8.8.8.8
  ipv4.gateway: 10.0.1.1/24
  ipv4.ovn.ranges: 10.0.1.2-10.0.1.254
  volatile.last_state.created: "false"
description: ""
name: UPLINK
type: physical
used_by: []
managed: true
status: Created
locations:
- daphne
- yogi
- gazoo

The bridged network is configured on each node as br0 with the following netplan config:

network:
  ethernets:
    eno1:
      dhcp4: true
  version: 2
  bridges:
    br0:
      addresses: [ 10.0.1.0/24 ]
      interfaces: [ vlan0 ]
  vlans:
    vlan0:
      id: 0
      link: eno1
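
If you change this file, the config still needs to be applied; on Ubuntu the usual commands would be the following, with netplan try being the safer option since it rolls back automatically if connectivity is lost and the change isn't confirmed:

sudo netplan try
sudo netplan apply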

I’m definitely an OVN n00b, so I probably made a mistake somewhere. Why is this going wrong?

Please can you show me the output of:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb list HA_Chassis_Group

And

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb list HA_Chassis
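
Side note: rather than repeating the long --db string each time, ovn-nbctl also honours the OVN_NB_DB environment variable, so the following should be equivalent:

export OVN_NB_DB=tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641
ovn-nbctl list HA_Chassis_Group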

Of course. The first command returns:

_uuid               : 098bd5e0-eaf6-4da9-bc6b-ad155212ce7d
external_ids        : {}
ha_chassis          : []
name                : lxd-net14

_uuid               : 71c74eb8-c217-4786-bb3e-55d02578d57f
external_ids        : {}
ha_chassis          : []
name                : lxd-net16

_uuid               : 65d08b90-82ff-46d3-b8c0-0cceaff8f4f4
external_ids        : {}
ha_chassis          : []
name                : lxd-net15

The second command has no output.

Please can you show lxc network list and lxc cluster list.

Have you created other OVN networks already?

No, I haven’t used OVN at all beyond what the tutorial asked me to do. Here is the requested output:

$ lxc network ls
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
|  NAME  |   TYPE   | MANAGED |     IPV4      |           IPV6            | DESCRIPTION | USED BY |  STATE  |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| UPLINK | physical | YES     |               |                           |             | 0       | CREATED |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| br0    | bridge   | NO      |               |                           |             | 1       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| br-int | bridge   | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| eno1   | physical | NO      |               |                           |             | 1       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| eno2   | physical | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| eno3   | physical | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| eno4   | physical | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| lan    | macvlan  | YES     |               |                           |             | 1       | CREATED |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| lxdbr0 | bridge   | YES     | 10.79.71.1/24 | fd42:8858:7a68:87a4::1/64 |             | 11      | CREATED |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| ovnnet | ovn      | YES     | 10.44.0.1/24  | fd42:d36c:7b12:66ca::1/64 |             | 0       | ERRORED |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| virbr0 | bridge   | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
| vlan0  | vlan     | NO      |               |                           |             | 0       |         |
+--------+----------+---------+---------------+---------------------------+-------------+---------+---------+
$ lxc cluster list
+--------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
|  NAME  |            URL             |      ROLES      | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+--------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| daphne | https://192.168.0.181:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+--------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| gazoo  | https://192.168.0.237:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+--------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| yogi   | https://192.168.0.191:8443 | database-leader | x86_64       | default        |             | ONLINE | Fully operational |
|        |                            | database        |              |                |             |        |                   |
+--------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+

Note: I used “ovnnet” instead of “my-ovn” for the network name, but otherwise used the same commands from the tutorial.

OK thanks. It looks like you may have some leftover/stale config in OVN. Please can you show me the output of:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb show

Can you try removing the ERRORED ovnnet network using:

lxc network delete ovnnet

And then show the output of:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb show

And

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb list HA_Chassis_Group

I removed the network. The show command had no output; the HA_Chassis_Group listing is below.

$ ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb list HA_Chassis_Group

_uuid               : 098bd5e0-eaf6-4da9-bc6b-ad155212ce7d
external_ids        : {}
ha_chassis          : []
name                : lxd-net14

_uuid               : 71c74eb8-c217-4786-bb3e-55d02578d57f
external_ids        : {}
ha_chassis          : []
name                : lxd-net16

_uuid               : 65d08b90-82ff-46d3-b8c0-0cceaff8f4f4
external_ids        : {}
ha_chassis          : []
name                : lxd-net15

Can you try removing them using:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb destroy HA_Chassis_Group lxd-net14
ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb destroy HA_Chassis_Group lxd-net15
ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb destroy HA_Chassis_Group lxd-net16

And then re-run:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 --wait=sb list HA_Chassis_Group

That should confirm they are gone. Then try adding the network again, this time recording the exact commands being run on each cluster member.
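
If destroying by name ever fails, the same rows can also be removed via the UUIDs shown in the earlier listing, since ovsdb commands accept either a record's UUID or its name for name-indexed tables, e.g.:

ovn-nbctl --timeout=10 --db tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.235:6641 destroy HA_Chassis_Group 098bd5e0-eaf6-4da9-bc6b-ad155212ce7d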

Okay, I’m an idiot. Thank you so much for your patience, Tom!

I had set up OVN entirely referencing a non-existent node address: 192.168.0.235. That was an old IP for one of the nodes (gazoo), which is now at .237. I updated all the OVN configs, recreated the UPLINK network, and the LXD OVN network was then created without issue.

I can ping containers on other nodes by private IP and DNS, but I noticed the IP range is different from what I set (i.e. 10.0.1.0/24):

+-----------+---------+---------------------+-----------------------------------------------+-----------------+-----------+----------+
|   NAME    |  STATE  |        IPV4         |                     IPV6                      |      TYPE       | SNAPSHOTS | LOCATION |
+-----------+---------+---------------------+-----------------------------------------------+-----------------+-----------+----------+
| c1        | RUNNING | 10.154.243.2 (eth0) | fd42:4e6a:c846:1bb1:216:3eff:fee5:d61f (eth0) | CONTAINER       | 0         | yogi     |
+-----------+---------+---------------------+-----------------------------------------------+-----------------+-----------+----------+
| c2        | RUNNING | 10.154.243.3 (eth0) | fd42:4e6a:c846:1bb1:216:3eff:feea:851 (eth0)  | CONTAINER       | 0         | yogi     |
+-----------+---------+---------------------+-----------------------------------------------+-----------------+-----------+----------+
| c3        | RUNNING | 10.154.243.4 (eth0) | fd42:4e6a:c846:1bb1:216:3eff:fecf:73ed (eth0) | CONTAINER       | 0         | gazoo    |
+-----------+---------+---------------------+-----------------------------------------------+-----------------+-----------+----------+

Not a big deal since the core functionality works, so I consider this resolved. Thanks again for your time and help.
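
For anyone hitting the same problem: the LXD-side setting that points at the OVN northbound database is the server key network.ovn.northbound_connection, so the fix presumably amounted to something like the following, with .237 replacing the stale .235 address:

lxc config set network.ovn.northbound_connection tcp:192.168.0.181:6641,tcp:192.168.0.191:6641,tcp:192.168.0.237:6641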


Great!

Can you show me the create command you used for the network, and “lxc network show (network)” for the resulting network, as I’d like to understand why you’re getting an unexpected IP subnet.

Of course:

$ lxc network create UPLINK --type=physical parent=br0 --target=daphne
$ lxc network create UPLINK --type=physical parent=br0 --target=yogi
$ lxc network create UPLINK --type=physical parent=br0 --target=gazoo
$ lxc network create UPLINK --type=physical ipv4.ovn.ranges=10.0.1.2-10.0.1.254 ipv4.gateway=10.0.1.1/24 dns.nameservers=8.8.8.8
$ lxc network create ovnnet --type=ovn network=UPLINK
$ lxc network show ovnnet
config:
  bridge.mtu: "1442"
  ipv4.address: 10.154.243.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:4e6a:c846:1bb1::1/64
  ipv6.nat: "true"
  network: UPLINK
  volatile.network.ipv4.address: 10.0.1.2
description: ""
name: ovnnet
type: ovn
used_by: []
managed: true
status: Created
locations:
- daphne
- yogi
- gazoo
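
In case the create sequence above looks odd: in an LXD cluster, member-specific settings such as parent are staged with --target on each member first, and the final create without --target instantiates the network cluster-wide. The per-member config can be inspected afterwards with, for example:

lxc network show UPLINK --target=daphne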

Ah OK, so this is working as designed.

You can create and connect multiple OVN networks to the same uplink network (potentially all using the same internal subnet). Each OVN network is allocated one IP from the uplink’s ipv4.ovn.ranges for use by its own virtual OVN router as its external NAT IP. All egress traffic from the instances connected to that OVN network is then NATted to that IP.

The volatile.network.ipv4.address setting shows the external IP on the uplink network that is allocated to each OVN network.

If you create another OVN network connected to the same uplink, you will see it is assigned its own IP on the uplink network.
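
Worth adding for anyone who, like the OP, wants a predictable internal subnet: the OVN network’s internal addressing is auto-allocated by default, but it can be pinned at creation time (the subnet below is illustrative):

lxc network create ovnnet --type=ovn network=UPLINK ipv4.address=10.0.100.1/24

The uplink-facing volatile.network.ipv4.address is still allocated from the uplink’s ipv4.ovn.ranges either way.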
