OVN setup examples

I think I have OVN working on my cluster, but can’t get LXD OVN networks working - are there any examples of the configs for a working LXD OVN network?

I believe I need a bridge network with the OVN network(s) as child networks … but can’t see what’s the correct relationship between the various ranges and subnets.

Thanks
David

+1 I’m also looking for guidance on this

https://linuxcontainers.org/lxd/docs/master/networks#network-ovn has a basic example.

https://github.com/lxc/lxc-ci/blob/master/bin/test-lxd-ovn is our daily test for OVN so may be useful too.

Otherwise, I’m sure @tomp will be happy to help with any questions and improve the docs where needed :slight_smile:

I wrote a little about the architecture of OVN and how to set it up on multiple nodes here:

That allows you to use lxc network create mynet --type=ovn network=<uplink network>

However, the network you specify as the uplink will depend on your own requirements, i.e. do you want the OVN network’s router connected to an existing LXD bridge network, or to an external physical network?

For the physical network uplink (https://linuxcontainers.org/lxd/docs/master/networks#network-physical) you need to set ipv{n}.ovn.ranges to allow the OVN router to allocate an address on the uplink network, and dns.nameservers for the OVN router to include in DHCP responses to OVN NICs.
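As a rough sketch of the uplink settings described above, the following prints (rather than runs) the `lxc network set` commands for the two keys mentioned. The network name, address range and nameserver are made-up placeholders, not values from this thread, and a physical uplink may need further keys that are not shown here.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would set the OVN-related keys
# on an uplink network. All values passed in are placeholders.
uplink_cmds() {
    uplink="$1"      # name of the LXD uplink network (hypothetical)
    ovn_range="$2"   # addresses the OVN routers may take on the uplink
    dns="$3"         # nameserver handed to OVN NICs in DHCP responses
    echo "lxc network set $uplink ipv4.ovn.ranges $ovn_range"
    echo "lxc network set $uplink dns.nameservers $dns"
}

uplink_cmds uplink0 192.0.2.100-192.0.2.150 192.0.2.53
```

Dropping the `echo` wrappers (or piping the output to `sh`) would apply the settings for real.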

I have a whole sequence of warnings in the syslog, I think from an earlier try at getting OVN working:

# journalctl -b 0 -o short-precise -u ovn* -u ovs* -u openvswitch* -u lxd -e
Apr 28 11:50:25.447981 albans ovn-controller[6331]: ovs|00315|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:51:25.448890 albans ovn-controller[6331]: ovs|00316|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:52:25.449200 albans ovn-controller[6331]: ovs|00317|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:53:25.449496 albans ovn-controller[6331]: ovs|00318|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:54:25.450258 albans ovn-controller[6331]: ovs|00319|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:55:25.450096 albans ovn-controller[6331]: ovs|00320|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:56:25.450353 albans ovn-controller[6331]: ovs|00321|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:57:25.450862 albans ovn-controller[6331]: ovs|00322|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'
Apr 28 11:58:25.451603 albans ovn-controller[6331]: ovs|00323|patch|WARN|Bridge 'lxdovn25' not found for network 'k8sbr0'

There are many more before these, and they are still being logged.

$ lxc network list
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
|  NAME   |   TYPE   | MANAGED |     IPV4     | IPV6 |        DESCRIPTION        | USED BY |  STATE  |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
| br-int  | bridge   | NO      |              |      |                           | 0       |         |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
| dmz0    | physical | NO      |              |      |                           | 0       |         |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
| eth0    | physical | NO      |              |      |                           | 0       |         |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
| lxdbr0  | bridge   | YES     | 10.99.0.1/16 | none | Default local LXD network | 2       | CREATED |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
| lxdfan0 | bridge   | YES     |              |      | LXD cluster network       | 3       | CREATED |
+---------+----------+---------+--------------+------+---------------------------+---------+---------+
# ovs-vsctl show
466b9882-6a72-4934-9dc9-1e939bb97950
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.13.1"

Any advice on what I should delete from where to fix this?

Can you show output of sudo ovn-nbctl show please

And also sudo ovs-vsctl list open_vswitch

# ovn-nbctl show produces no output.

# ovn-sbctl show
Chassis "486b381c-b94b-4172-978f-90635f048955"
    hostname: albans.domuz
    Encap geneve
        ip: "10.1.0.215"
        options: {csum="true"}
# ovs-vsctl list open_vswitch
_uuid               : 466b9882-6a72-4934-9dc9-1e939bb97950
bridges             : [04e7f203-69e8-4365-9e40-282877f98a80]
cur_cfg             : 17
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : false
dpdk_version        : none
external_ids        : {hostname=albans.domuz, ovn-bridge-mappings="k8sbr0:lxdovn25", ovn-encap-ip="10.1.0.215", ovn-encap-type=geneve, ovn-remote="unix:/var/run/ovn/ovnsb_db.sock", rundir="/var/run/openvswitch", system-id="486b381c-b94b-4172-978f-90635f048955"}
iface_types         : [erspan, geneve, gre, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 17
other_config        : {}
ovs_version         : "2.13.1"
ssl                 : []
statistics          : {}
system_type         : ubuntu
system_version      : "20.04"

It’s the ovn-bridge-mappings key that is the issue.

If you run sudo ovs-vsctl remove open_vswitch . external_ids ovn-bridge-mappings, that should clear it.
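For anyone wanting to check for this situation before removing anything, here is a minimal sketch that lists mapping entries whose bridge no longer exists. The function name `stale_mappings` is made up for illustration; the sample values are the ones from the output above.

```shell
#!/bin/sh
# Print every "network:bridge" entry of an ovn-bridge-mappings value whose
# bridge is not in the given list of existing OVS bridges.
stale_mappings() {
    mappings="$1"   # comma-separated pairs, e.g. "k8sbr0:lxdovn25"
    bridges="$2"    # whitespace-separated bridge names
    for pair in $(printf '%s' "$mappings" | tr ',' ' '); do
        bridge="${pair#*:}"
        found=no
        for b in $bridges; do
            if [ "$b" = "$bridge" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "$pair"
        fi
    done
    return 0
}

# Values taken from the output earlier in this thread.
stale_mappings "k8sbr0:lxdovn25" "br-int"
```

On a live host the two arguments could come from `ovs-vsctl get open_vswitch . external_ids:ovn-bridge-mappings | tr -d '"'` and `ovs-vsctl list-br` respectively.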

Thanks. That’s fixed it. :slight_smile:

As a matter of interest, where is this config stored? I deleted the /var/lib/ovn databases but didn’t find the open_vswitch config files.

Or is there a simple command to wipe all these configs when I want to restart from a clean setup?

They are in /var/lib/openvswitch/ I think.

Hi,
I am also looking for good documentation on how OVN and LXD fit together.
Regards.

Have you got the single node example in the docs working? That would be the first step, and then to think about how you want to connect the virtual routers to the external network via a physical port or a bridge (such as lxdbr0).

Thanks @tomp, I’m going to test a little bit in my environment and share the result.
Regards.

Hi @tomp, a quick question about LXD OVN cluster setup:
I have 3 hosts, clustered in LXD, each with ovn-host and ovn-central installed, and I’d like OVN to have some resilience, so that if the “controller” (10.1.0.215) fails (it doesn’t have many resources, but hosts the ingress and egress proxies etc.) the other two continue untroubled.

My specific question is: what should I set external_ids:ovn-remote to?
At present it is a unix socket, but should it be the local IP address or something else?

I have the config:
on each server:

# ovs-vsctl set open_vswitch . \
    external_ids:system-id=$( hostname ) \
    external_ids:ovn-remote=unix:/var/run/ovn/ovnsb_db.sock \
    external_ids:ovn-encap-type=geneve \
    external_ids:ovn-encap-ip=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' )

and /etc/default/ovn-central & /etc/default/ovn-host:

OVN_CTL_OPTS= \
  --db-nb-addr=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' ) \
  --db-sb-addr=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' ) \
  --db-nb-cluster-local-addr=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' ) \
  --db-sb-cluster-local-addr=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' ) \
  --db-nb-cluster-remote-addr=10.1.0.215 \
  --db-sb-cluster-remote-addr=10.1.0.215 \
  --ovn-northd-nb-db=tcp:10.1.0.215:6641,tcp:10.1.0.213:6641,tcp:10.1.0.214:6641 \
  --ovn-northd-sb-db=tcp:10.1.0.215:6642,tcp:10.1.0.213:6642,tcp:10.1.0.214:6642
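The config above repeats the same `hostname -I | grep` pipeline several times. A small helper, sketched below, would let it be computed once; the name `cluster_ip` is hypothetical, and the pattern relies on GNU grep (`\b` and `\+`).

```shell
#!/bin/sh
# Pick the first address in the 10.1.0.x range from a list of addresses,
# using the same grep pattern as the commands above (GNU grep syntax).
cluster_ip() {
    printf '%s\n' "$1" | grep -o '\b10\.1\.0\.[0-9]\+\b' | head -n 1
}

# On a real host the argument would be "$(hostname -I)"; a sample string
# is used here so the sketch runs anywhere.
cluster_ip "192.168.1.20 10.1.0.215 10.99.0.1"
```

The `head -n 1` guards against a host that happens to have two addresses in the range, which would otherwise expand to two words in the option value.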

I think you set it to the same value as you’ve used for the ovn-northd-sb-db setting in OVN_CTL_OPTS, i.e. tcp:10.1.0.215:6642,tcp:10.1.0.213:6642,tcp:10.1.0.214:6642.

Also, ensure LXD knows about the multiple northbound DBs by using:

lxc config set network.ovn.northbound_connection=tcp:10.1.0.215:6641,tcp:10.1.0.213:6641,tcp:10.1.0.214:6641

This is important, as although right now we use ovn-nbctl under the hood, this may not always be the case (we are thinking about interacting with the DB directly), and so LXD will then not use the OVN_CTL_OPTS setting.
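To keep the two connection strings straight (port 6642 for the southbound DB used by ovn-remote, port 6641 for the northbound DB used by LXD), they can be generated from one host list. This is only a sketch; `ovn_conn` is a made-up helper name, and the final commands are printed rather than executed.

```shell
#!/bin/sh
# Build a comma-separated OVSDB connection string for one port and a list
# of hosts, e.g. "tcp:A:6642,tcp:B:6642".
ovn_conn() {
    port="$1"
    shift
    out=""
    for host in "$@"; do
        out="${out:+$out,}tcp:$host:$port"
    done
    printf '%s\n' "$out"
}

hosts="10.1.0.215 10.1.0.213 10.1.0.214"
sb=$(ovn_conn 6642 $hosts)   # southbound, for external_ids:ovn-remote
nb=$(ovn_conn 6641 $hosts)   # northbound, for network.ovn.northbound_connection

# Printed rather than executed, so the sketch is safe to run anywhere.
echo "ovs-vsctl set open_vswitch . external_ids:ovn-remote=$sb"
echo "lxc config set network.ovn.northbound_connection=$nb"
```

Generating both strings from the same list makes it harder to paste the northbound port into the southbound setting by accident.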

That seems to be working for now. :slight_smile: :crossed_fingers:

Hmmm … that change seems to have broken the LXD-OVN connection, so I’ve reverted both changes. external_ids:ovn-remote is set back to unix:/var/run/ovn/ovnsb_db.sock and the LXD config key network.ovn.northbound_connection is now set to unix:/var/run/ovn/ovnnb_db.sock.

With the suggested settings, I edited network lxdbr0 to have:

  ipv4.address: 10.1.1.1/24
  ipv4.dhcp.ranges: 10.1.1.8-10.1.1.127
  ipv4.ovn.ranges: 10.1.1.128-10.1.1.251
  ipv4.routes: 10.3.128.0/17, 241.0.0.0/8
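On the relationship between the ranges: a quick sanity check is sketched below, under the assumption that the OVN range should sit inside the bridge’s subnet and should not overlap the DHCP range. The helper names are invented for this example and the /24 prefix is hard-coded to keep it short.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip2int() {
    old_ifs="$IFS"; IFS=.
    set -- $1
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two "start-end" ranges do not overlap.
ranges_disjoint() {
    a_start=$(ip2int "${1%-*}"); a_end=$(ip2int "${1#*-}")
    b_start=$(ip2int "${2%-*}"); b_end=$(ip2int "${2#*-}")
    [ "$a_end" -lt "$b_start" ] || [ "$b_end" -lt "$a_start" ]
}

# Succeed if a "start-end" range lies in the same /24 as the given
# gateway address (assumption: the bridge uses a /24, as above).
in_slash24() {
    net=$(( $(ip2int "$1") / 256 ))
    start=$(ip2int "${2%-*}"); end=$(ip2int "${2#*-}")
    [ $(( start / 256 )) -eq "$net" ] && [ $(( end / 256 )) -eq "$net" ]
}

dhcp=10.1.1.8-10.1.1.127
ovn=10.1.1.128-10.1.1.251
if ranges_disjoint "$dhcp" "$ovn" && in_slash24 10.1.1.1 "$ovn"; then
    echo "ranges look consistent"
fi
```

With the values from this post the check passes, so the ranges themselves do not look like the cause of the errors that follow.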

then …

$ lxc network create test-ovn --type=ovn network=lxdbr0
Error: Failed to run: ovn-nbctl --db tcp:10.1.0.215:6641,tcp:10.1.0.213:6641,tcp:10.1.0.214:6641 ha-chassis-group-add lxd-net46: ovn-nbctl: tcp:10.1.0.215:6641,tcp:10.1.0.213:6641,tcp:10.1.0.214:6641: database connection failed (Connection refused)

$ lxc config set network.ovn.northbound_connection=unix:/var/run/ovn/ovnnb_db.sock
$ lxc network delete test-ovn 
Network test-ovn deleted

$ lxc network create test-ovn --type=ovn network=lxdbr0
Error: Failed getting OVS Chassis ID: invalid syntax
# ovs-vsctl list open_vswitch
[sudo] password for albans: 
_uuid               : 466b9882-6a72-4934-9dc9-1e939bb97950
bridges             : [04e7f203-69e8-4365-9e40-282877f98a80]
cur_cfg             : 21
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : false
dpdk_version        : none
external_ids        : {hostname=albans.domuz, ovn-encap-ip="10.1.0.215", ovn-encap-type="geneve,vxlan", ovn-openflow-probe-interval="15000", ovn-remote="tcp:10.1.0.215:6641,tcp:10.1.0.213:6641,tcp:10.1.0.214:6641", ovn-remote-probe-interval="5000", rundir="/var/run/openvswitch", system-id=albans}
iface_types         : [erspan, geneve, gre, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 21
other_config        : {}
ovs_version         : "2.13.1"
ssl                 : []
statistics          : {}
system_type         : ubuntu
system_version      : "20.04"

# ovs-vsctl show
466b9882-6a72-4934-9dc9-1e939bb97950
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.13.1"

# ovn-nbctl show
switch cec22a68-4d83-47ec-8331-5ad314cfc557 (lxd-net47-ls-ext)
    port lxd-net47-ls-ext-lsp-router
        type: router
        router-port: lxd-net47-lr-lrp-ext
    port lxd-net47-ls-ext-lsp-provider
        type: localnet
        addresses: ["unknown"]
switch fa827850-c229-4602-88a6-2592001ac3e8 (lxd-net48-ls-int)
    port lxd-net48-ls-int-lsp-router
        type: router
        router-port: lxd-net48-lr-lrp-int
switch 607470f8-fe19-4408-80d7-20a736f73a22 (lxd-net47-ls-int)
    port lxd-net47-ls-int-lsp-router
        type: router
        router-port: lxd-net47-lr-lrp-int
switch 44adb685-a6bf-418f-8060-7aa307f5c110 (lxd-net48-ls-ext)
    port lxd-net48-ls-ext-lsp-provider
        type: localnet
        addresses: ["unknown"]
    port lxd-net48-ls-ext-lsp-router
        type: router
        router-port: lxd-net48-lr-lrp-ext
router a31763a4-5706-4c8a-a8f8-e0f9a10ee514 (lxd-net48-lr)
    port lxd-net48-lr-lrp-int
        mac: "00:16:3e:09:d5:5d"
        networks: ["10.230.167.1/24"]
    port lxd-net48-lr-lrp-ext
        mac: "00:16:3e:09:d5:5d"
        networks: ["10.1.1.128/24"]
    nat a11f6768-b77d-4c02-b84d-b7934b35d81c
        external ip: "10.1.1.128"
        logical ip: "10.230.167.0/24"
        type: "snat"
router 91203d42-3c92-4c55-aa05-f23acbd094b3 (lxd-net47-lr)
    port lxd-net47-lr-lrp-ext
        mac: "00:16:3e:f7:eb:62"
        networks: ["10.1.1.128/24"]
    port lxd-net47-lr-lrp-int
        mac: "00:16:3e:f7:eb:62"
        networks: ["10.4.194.1/24"]
    nat bd1ab8ff-0322-48c8-af83-c34fa0eda54a
        external ip: "10.1.1.128"
        logical ip: "10.4.194.0/24"
        type: "snat"

# ovn-sbctl show
<<nothing>>

# ovn-appctl connection-status
not connected

# ovs-vsctl set open_vswitch . \
    external_ids:system-id=$( hostname ) \
    external_ids:ovn-remote-probe-interval=5000 \
    external_ids:ovn-openflow-probe-interval=15000 \
    external_ids:ovn-remote=unix:/var/run/ovn/ovnsb_db.sock \
    external_ids:ovn-encap-type=geneve,vxlan \
    external_ids:ovn-encap-ip=$( hostname -I | grep -o '\b10\.1\.0\.[0-9]\+\b' )

<<stop & start the ovn-* services on each host>>

# ovn-appctl connection-status
connected

I’ve edited lxdbr0 to remove the ovn config parameters, so there’s no OVN left in LXD. Is it safe to just delete all the lxd* routers & switches which are listed?
Or better to just clear all the /var/lib/{ovn,ovs,open_vswitch}/* databases and start again?

Left out some detail. On each host:

# ovn-sbctl show
Chassis "486b381c-b94b-4172-978f-90635f048955"
    hostname: albans.domuz
    Encap vxlan
        ip: "10.1.0.215"
        options: {csum="true"}
    Encap geneve
        ip: "10.1.0.215"
        options: {csum="true"}

# ovn-sbctl show
Chassis "0393361e-b4dc-4241-8479-e8ef6849c4c6"
    hostname: grantham.domuz
    Encap geneve
        ip: "10.1.0.213"
        options: {csum="true"}
    Encap vxlan
        ip: "10.1.0.213"
        options: {csum="true"}

# ovn-sbctl show
Chassis "9267a9f6-8de7-45b0-b546-701510dc1591"
    hostname: uxbridge.domuz
    Encap vxlan
        ip: "10.1.0.214"
        options: {csum="true"}
    Encap geneve
        ip: "10.1.0.214"
        options: {csum="true"}
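To compare hosts at a glance, the `ovn-sbctl show` output can be condensed to one line per chassis listing its encapsulation types. This is a small awk sketch, not an OVN feature; `encaps_by_host` is a made-up name, and the sample input is copied from the first listing above.

```shell
#!/bin/sh
# Read "ovn-sbctl show" output on stdin and print one line per chassis:
# "<hostname> <encap types...>".
encaps_by_host() {
    awk '
        $1 == "hostname:" { if (host != "") print host encaps; host = $2; encaps = "" }
        $1 == "Encap"     { encaps = encaps " " $2 }
        END               { if (host != "") print host encaps }
    '
}

# Sample input copied from the listing above; on a live system this would
# be: ovn-sbctl show | encaps_by_host
encaps_by_host <<'EOF'
Chassis "486b381c-b94b-4172-978f-90635f048955"
    hostname: albans.domuz
    Encap vxlan
        ip: "10.1.0.215"
        options: {csum="true"}
    Encap geneve
        ip: "10.1.0.215"
        options: {csum="true"}
EOF
```

For the sample this prints `albans.domuz vxlan geneve`, which makes it easy to spot a host whose encapsulation list differs from the others.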

Good morning @tomp and thanks for your help so far.

Removing vxlan from the list of encapsulations got me a step further, and created a new crash:

$ sudo lxc network create test-ovn --type=ovn network=lxdbr0
Error: failed to notify peer 10.1.0.213:8443: Failed adding OVS chassis "0393361e-b4dc-4241-8479-e8ef6849c4c6" with priority 5421 to chassis group "lxd-net50": Failed to run: ovn-nbctl --db unix:/var/lib/snapd/hostfs/run/ovn/ovnnb_db.sock ha-chassis-group-add-chassis lxd-net50 0393361e-b4dc-4241-8479-e8ef6849c4c6 5421: 2021-05-05T05:32:29Z|00002|ovsdb_idl|WARN|OVN_Northbound database lacks BFD table (database needs upgrade?)
2021-05-05T05:32:29Z|00003|ovsdb_idl|WARN|Forwarding_Group table in OVN_Northbound database lacks external_ids column (database needs upgrade?)
2021-05-05T05:32:29Z|00004|ovsdb_idl|WARN|Load_Balancer table in OVN_Northbound database lacks options column (database needs upgrade?)
2021-05-05T05:32:29Z|00005|ovsdb_idl|WARN|Load_Balancer table in OVN_Northbound database lacks selection_fields column (database needs upgrade?)
2021-05-05T05:32:29Z|00006|ovsdb_idl|WARN|Logical_Router_Policy table in OVN_Northbound database lacks external_ids column (database needs upgrade?)
2021-05-05T05:32:29Z|00007|ovsdb_idl|WARN|Logical_Router_Policy table in OVN_Northbound database lacks nexthops column (database needs upgrade?)
2021-05-05T05:32:29Z|00008|ovsdb_idl|WARN|Logical_Router_Policy table in OVN_Northbound database lacks options column (database needs upgrade?)
2021-05-05T05:32:29Z|00009|ovsdb_idl|WARN|Logical_Router_Port table in OVN_Northbound database lacks ipv6_prefix column (database needs upgrade?)
2021-05-05T05:32:29Z|00010|ovsdb_idl|WARN|Logical_Router_Static_Route table in OVN_Northbound database lacks bfd column (database needs upgrade?)
2021-05-05T05:32:29Z|00011|ovsdb_idl|WARN|Logical_Router_Static_Route table in OVN_Northbound database lacks options column (database needs upgrade?)
2021-05-05T05:32:29Z|00012|ovsdb_idl|WARN|Meter table in OVN_Northbound database lacks fair column (database needs upgrade?)
2021-05-05T05:32:29Z|00013|ovsdb_idl|WARN|NAT table in OVN_Northbound database lacks allowed_ext_ips column (database needs upgrade?)
2021-05-05T05:32:29Z|00014|ovsdb_idl|WARN|NAT table in OVN_Northbound database lacks exempted_ext_ips column (database needs upgrade?)
2021-05-05T05:32:29Z|00015|ovsdb_idl|WARN|NAT table in OVN_Northbound database lacks external_port_range column (database needs upgrade?)
2021-05-05T05:32:29Z|00016|ovsdb_idl|WARN|NB_Global table in OVN_Northbound database lacks hv_cfg_timestamp column (database needs upgrade?)
2021-05-05T05:32:29Z|00017|ovsdb_idl|WARN|NB_Global table in OVN_Northbound database lacks nb_cfg_timestamp column (database needs upgrade?)
2021-05-05T05:32:29Z|00018|ovsdb_idl|WARN|NB_Global table in OVN_Northbound database lacks sb_cfg_timestamp column (database needs upgrade?)
ovn-nbctl: lxd-net50: ha_chassi_group name not found

Logs on the 3 hosts:
10.1.0.215 (controller):
<< nothing in that time period >>

10.1.0.213:

May 05 05:32:29.778736 grantham ovsdb-server[4290]: ovs|00005|jsonrpc|WARN|unix#0: receive error: Connection reset by peer
May 05 05:32:29.778861 grantham ovsdb-server[4290]: ovs|00006|reconnect|WARN|unix#0: connection dropped (Connection reset by peer)
May 05 05:32:29.885061 grantham ovsdb-server[4290]: ovs|00007|jsonrpc|WARN|unix#2: receive error: Connection reset by peer
May 05 05:32:29.885185 grantham ovsdb-server[4290]: ovs|00008|reconnect|WARN|unix#2: connection dropped (Connection reset by peer)
May 05 05:32:29.908084 grantham ovsdb-server[4290]: ovs|00009|jsonrpc|WARN|unix#3: receive error: Connection reset by peer
May 05 05:32:29.908203 grantham ovsdb-server[4290]: ovs|00010|reconnect|WARN|unix#3: connection dropped (Connection reset by peer)

4290 is the PID of ovnnb_db

10.1.0.214:

May 05 05:32:29.742857 uxbridge ovsdb-server[4583]: ovs|00005|jsonrpc|WARN|unix#0: receive error: Connection reset by peer
May 05 05:32:29.742910 uxbridge ovsdb-server[4583]: ovs|00006|reconnect|WARN|unix#0: connection dropped (Connection reset by peer)
May 05 05:32:29.826524 uxbridge ovsdb-server[4583]: ovs|00007|jsonrpc|WARN|unix#2: receive error: Connection reset by peer
May 05 05:32:29.826584 uxbridge ovsdb-server[4583]: ovs|00008|reconnect|WARN|unix#2: connection dropped (Connection reset by peer)
May 05 05:32:29.841995 uxbridge ovsdb-server[4583]: ovs|00009|jsonrpc|WARN|unix#3: receive error: Connection reset by peer
May 05 05:32:29.842055 uxbridge ovsdb-server[4583]: ovs|00010|reconnect|WARN|unix#3: connection dropped (Connection reset by peer)

4583 is also ovnnb_db
Apart from what looks to me like a spelling mistake (ha_chassi_group), it looks like there’s a connection problem between the hosts. I’m investigating.
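One way to probe the suspected connection problem is to test whether the northbound (6641) and southbound (6642) TCP ports answer on each host. The sketch below assumes bash and coreutils `timeout` are installed; the helper names are invented, and a closed port only shows that nothing is listening on TCP there, not why.

```shell
#!/bin/sh
# Succeed if a TCP connection to host:port can be opened within 2 seconds.
# Uses bash's /dev/tcp pseudo-device and coreutils timeout.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print the state of the northbound (6641) and southbound (6642) ports
# for every host given as an argument.
probe_cluster() {
    for host in "$@"; do
        for port in 6641 6642; do
            if port_open "$host" "$port"; then
                echo "$host:$port open"
            else
                echo "$host:$port unreachable"
            fi
        done
    done
}

# Example call for the three hosts in this thread:
#   probe_cluster 10.1.0.215 10.1.0.213 10.1.0.214
```

Running it from each of the three hosts in turn would show whether the databases are reachable over TCP from every member or only locally.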