lxc network create aovn --type=ovn hangs the command line and creates a network in an errored state

Hi,
I built a cluster with OVN, but at the last stage, creating the network, the command hangs and never returns. Can someone help me work out what may be wrong with the configuration?
Regards.

indiana@pinehost1:~$ uname -a
Linux pinehost1 5.15.43-sunxi64 #22.05.1 SMP Sat May 28 08:25:20 UTC 2022 aarch64 GNU/Linux
indiana@pinehost1:~$ lxc network show UPLINK
config:
  dns.nameservers: 8.8.8.8
  ipv4.gateway: 192.168.1.1/24
  ipv4.ovn.ranges: 192.168.1.200-192.168.1.254
  volatile.last_state.created: "false"
description: ""
name: UPLINK
type: physical
used_by: []
managed: true
status: Created
locations:
- pinehost3
- pinehost1
- pinehost2
indiana@pinehost1:~$ lxc network ls
+--------+----------+---------+------+------+-------------+---------+---------+
|  NAME  |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY |  STATE  |
+--------+----------+---------+------+------+-------------+---------+---------+
| UPLINK | physical | YES     |      |      |             | 0       | CREATED |
+--------+----------+---------+------+------+-------------+---------+---------+
| br-int | bridge   | NO      |      |      |             | 0       |         |
+--------+----------+---------+------+------+-------------+---------+---------+
| eth0   | physical | NO      |      |      |             | 1       |         |
+--------+----------+---------+------+------+-------------+---------+---------+
indiana@pinehost1:~$ lxc config show
config:
  cluster.https_address: 192.168.1.20:8443
  core.https_address: 192.168.1.20:8443
  network.ovn.northbound_connection: tcp:192.168.1.20:6641,tcp:192.168.1.21:6641,tcp:192.168.1.22:6641

OVN service status:

● ovn-ovsdb-server-sb.service - Open vSwitch database server for OVN Southbound database
     Loaded: loaded (/lib/systemd/system/ovn-ovsdb-server-sb.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-05-29 21:25:01 +03; 16s ago
   Main PID: 7340 (ovn-ctl)
      Tasks: 4 (limit: 2218)
     Memory: 11.5M
        CPU: 1.226s
     CGroup: /system.slice/ovn-ovsdb-server-sb.service
             ├─7340 /bin/sh /usr/share/ovn/scripts/ovn-ctl run_sb_ovsdb --db-nb-create-insecure-remote=yes --db-sb-create-insecure-remote=yes --db-nb-addr=192.168.1.20 --db-sb-addr=192.168.1.20 --db-nb-cluster-local-addr=192.168.1.20 --d>
             └─7561 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Sout>

May 29 21:25:02 pinehost1 ovsdb-server[7561]: ovs|00002|raft|INFO|term 16: 5060662 ms timeout expired, starting election
May 29 21:25:02 pinehost1 ovsdb-server[7561]: ovs|00003|raft|INFO|term 16: elected leader by 1+ of 1 servers
May 29 21:25:02 pinehost1 ovsdb-server[7561]: ovs|00004|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.15.0
May 29 21:25:03 pinehost1 ovn-ctl[7574]: 2022-05-29T18:25:03Z|00003|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connecting...
May 29 21:25:03 pinehost1 ovn-ctl[7574]: 2022-05-29T18:25:03Z|00004|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connected
May 29 21:25:03 pinehost1 ovsdb-client[7574]: ovs|00003|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connecting...
May 29 21:25:03 pinehost1 ovsdb-client[7574]: ovs|00004|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connected
May 29 21:25:03 pinehost1 ovn-ctl[7340]: Waiting for OVN_Southbound to come up.
May 29 21:25:12 pinehost1 ovsdb-server[7561]: ovs|00005|memory|INFO|17852 kB peak resident set size after 10.2 seconds
May 29 21:25:12 pinehost1 ovsdb-server[7561]: ovs|00006|memory|INFO|cells:13834 monitors:3 raft-log:175 sessions:2

● ovn-central.service - Open Virtual Network central components
     Loaded: loaded (/lib/systemd/system/ovn-central.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sun 2022-05-29 21:25:01 +03; 16s ago
    Process: 7295 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 7295 (code=exited, status=0/SUCCESS)
        CPU: 13ms

May 29 21:25:01 pinehost1 systemd[1]: Starting Open Virtual Network central components...
May 29 21:25:01 pinehost1 systemd[1]: Finished Open Virtual Network central components.

● ovn-northd.service - Open Virtual Network central control daemon
     Loaded: loaded (/lib/systemd/system/ovn-northd.service; static)
     Active: active (running) since Sun 2022-05-29 21:25:02 +03; 15s ago
    Process: 7348 ExecStart=/usr/share/ovn/scripts/ovn-ctl start_northd --ovn-manage-ovsdb=no --no-monitor $OVN_CTL_OPTS (code=exited, status=0/SUCCESS)
   Main PID: 7569 (ovn-northd)
      Tasks: 1 (limit: 2218)
     Memory: 4.8M
        CPU: 656ms
     CGroup: /system.slice/ovn-northd.service
             └─7569 ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db=tcp:192.168.1.20:6641,tcp:192.168.1.21:6641,tcp:192.168.1.22:6641 --ovnsb-db=tcp:192.168.1.20:6642,tcp:192.168.1.21:6642,tcp:192.168.1.22:6642 --no-chdir ->

May 29 21:25:01 pinehost1 systemd[1]: Starting Open Virtual Network central control daemon...
May 29 21:25:02 pinehost1 ovn-ctl[7348]: Starting ovn-northd.
May 29 21:25:02 pinehost1 systemd[1]: Started Open Virtual Network central control daemon.

● ovn-host.service - Open Virtual Network host components
     Loaded: loaded (/lib/systemd/system/ovn-host.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sun 2022-05-29 21:25:01 +03; 16s ago
    Process: 7313 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 7313 (code=exited, status=0/SUCCESS)
        CPU: 12ms

May 29 21:25:01 pinehost1 systemd[1]: Starting Open Virtual Network host components...
May 29 21:25:01 pinehost1 systemd[1]: Finished Open Virtual Network host components.

● ovn-ovsdb-server-nb.service - Open vSwitch database server for OVN Northbound database
     Loaded: loaded (/lib/systemd/system/ovn-ovsdb-server-nb.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-05-29 21:25:01 +03; 16s ago
   Main PID: 7336 (ovn-ctl)
      Tasks: 4 (limit: 2218)
     Memory: 3.2M
        CPU: 859ms
     CGroup: /system.slice/ovn-ovsdb-server-nb.service
             ├─7336 /bin/sh /usr/share/ovn/scripts/ovn-ctl run_nb_ovsdb --db-nb-create-insecure-remote=yes --db-sb-create-insecure-remote=yes --db-nb-addr=192.168.1.20 --db-sb-addr=192.168.1.20 --db-nb-cluster-local-addr=192.168.1.20 --d>
             └─7548 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Nort>

May 29 21:25:02 pinehost1 ovsdb-client[7572]: ovs|00001|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connecting...
May 29 21:25:02 pinehost1 ovsdb-client[7572]: ovs|00002|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connection attempt failed (No such file or directory)
May 29 21:25:02 pinehost1 ovsdb-server[7548]: ovs|00004|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.15.0
May 29 21:25:03 pinehost1 ovn-ctl[7572]: 2022-05-29T18:25:03Z|00003|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connecting...
May 29 21:25:03 pinehost1 ovn-ctl[7572]: 2022-05-29T18:25:03Z|00004|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connected
May 29 21:25:03 pinehost1 ovsdb-client[7572]: ovs|00003|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connecting...
May 29 21:25:03 pinehost1 ovsdb-client[7572]: ovs|00004|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connected
May 29 21:25:03 pinehost1 ovn-ctl[7336]: Waiting for OVN_Northbound to come up.
May 29 21:25:12 pinehost1 ovsdb-server[7548]: ovs|00005|memory|INFO|9136 kB peak resident set size after 10.0 seconds
May 29 21:25:12 pinehost1 ovsdb-server[7548]: ovs|00006|memory|INFO|cells:1749 monitors:2 raft-log:215 sessions:3

● ovn-controller.service - Open Virtual Network host control daemon
     Loaded: loaded (/lib/systemd/system/ovn-controller.service; static)
     Active: active (running) since Sun 2022-05-29 21:25:01 +03; 15s ago
    Process: 7356 ExecStart=/usr/share/ovn/scripts/ovn-ctl start_controller --ovn-manage-ovsdb=no --no-monitor $OVN_CTL_OPTS (code=exited, status=0/SUCCESS)
   Main PID: 7452 (ovn-controller)
      Tasks: 4 (limit: 2218)
     Memory: 2.1M
        CPU: 321ms
     CGroup: /system.slice/ovn-controller.service
             └─7452 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/ovn/ovn-controller.log --pidfile=/var/run/ovn/ovn-controller.pid --detach

May 29 21:25:01 pinehost1 systemd[1]: Starting Open Virtual Network host control daemon...
May 29 21:25:01 pinehost1 ovn-ctl[7356]: Starting ovn-controller.
May 29 21:25:01 pinehost1 systemd[1]: Started Open Virtual Network host control daemon.

What do you see in /var/snap/lxd/common/lxd/logs/lxd.log?

Thanks @tomp for the reply. I just figured it out: the problem came from the network interface that carries the IP definition. I defined a bridge interface on top of the physical one, and now everything works flawlessly.
Regards.
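
The post above does not show how the bridge was defined. As a rough sketch only, assuming plain iproute2 and placeholder names (`br0`, plus the NIC, address and gateway taken from the first post), the helper below prints the commands for review instead of running them. `print_bridge_plan` is a hypothetical name, and the printed commands are neither persistent across reboots nor safe to run over an SSH session that uses the same NIC.

```shell
# Print (do not run) iproute2 commands that move an address from a NIC to a
# new bridge with the NIC as its member. All arguments are placeholders.
print_bridge_plan() {
    local nic="$1" bridge="$2" cidr="$3" gateway="$4"
    cat <<EOF
ip link add name $bridge type bridge
ip link set dev $nic master $bridge
ip addr del $cidr dev $nic
ip addr add $cidr dev $bridge
ip link set dev $bridge up
ip route replace default via $gateway dev $bridge
EOF
}
```

For example, `print_bridge_plan eth0 br0 192.168.1.20/24 192.168.1.1` prints six commands. Presumably the bridge, rather than eth0, would then be used as the parent of the UPLINK network, but the thread does not confirm that detail.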


Hi,
I have the same issue.
Here's my configuration:

root@c001-lxd-01:~# lxc config show
config:
  cluster.https_address: 10.10.10.234:8443
  core.https_address: 10.10.10.234:8443
  core.trust_password: true
  network.ovn.northbound_connection: tcp:10.10.10.234:6641,tcp:10.10.10.235:6641,tcp:10.10.10.236:6641

root@c001-lxd-01:~# ovn-sbctl show
Chassis "87a7bc18-522f-4290-9f78-4eb1341ea521"
    hostname: c001-lxd-01
    Encap geneve
        ip: "10.10.10.234"
        options: {csum="true"}
Chassis "2a8f0227-cffd-4917-8128-71dbcb243c9a"
    hostname: c001-lxd-03
    Encap geneve
        ip: "10.10.10.236"
        options: {csum="true"}
Chassis "85c55e3f-3d90-44b4-a486-a47c84c55a04"
    hostname: c001-lxd-002
    Encap geneve
        ip: "10.10.10.235"
        options: {csum="true"}

root@c001-lxd-01:~# lxc network ls
+--------+----------+---------+------+------+-------------+---------+-------+
|  NAME  |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY | STATE |
+--------+----------+---------+------+------+-------------+---------+-------+
| br-int | bridge   | NO      |      |      |             | 0       |       |
+--------+----------+---------+------+------+-------------+---------+-------+
| eth0   | physical | NO      |      |      |             | 0       |       |
+--------+----------+---------+------+------+-------------+---------+-------+
| eth1   | physical | NO      |      |      |             | 0       |       |
+--------+----------+---------+------+------+-------------+---------+-------+

root@c001-lxd-01:~# lxc network show UPLINK
config:
  dns.nameservers: 8.8.8.8
  ipv4.gateway: 10.10.10.151/24
  ipv4.ovn.ranges: 10.10.10.248-10.10.10.252
  volatile.last_state.created: "false"
description: ""
name: UPLINK
type: physical
used_by: []
managed: true
status: Created
locations:
- c001-lxd-01
- 10.10.10.236
- 10.10.10.235

Here's my log:
root@c001-lxd-01:~# tail /var/snap/lxd/common/lxd/logs/lxd.log
time="2022-06-10T10:32:33Z" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
time="2022-06-10T10:32:33Z" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
time="2022-06-10T10:32:35Z" level=warning msg="Dqlite: attempt 1: server 10.10.10.234:8443: no known leader"
time="2022-06-10T10:32:37Z" level=warning msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to initialize fanotify: invalid argument"

Did you manually create a bridge for the parent interface eth0 to solve this issue?

Sorry, this is my network list:

root@c001-lxd-01:~# lxc network ls
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
|  NAME  |   TYPE   | MANAGED |      IPV4      |           IPV6            | DESCRIPTION | USED BY |  STATE  |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| UPLINK | physical | YES     |                |                           |             | 0       | CREATED |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| br-int | bridge   | NO      |                |                           |             | 0       |         |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| eth0   | physical | NO      |                |                           |             | 1       |         |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| eth1   | physical | NO      |                |                           |             | 0       |         |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| my-ovn | ovn      | YES     | 10.118.28.1/24 | fd42:3dcb:3597:63c2::1/64 |             | 0       | ERRORED |
+--------+----------+---------+----------------+---------------------------+-------------+---------+---------+

Please remove the network using lxc network delete and then re-add it. When the command hangs, run ps aux | grep ovn-nbctl on the cluster member you're targeting with the command and see if it's hanging while communicating with the OVN database.

This is the process list while the command hangs:
root@c001-lxd-01:~# ps aux | grep ovn-nbctl
root 5590 0.0 0.0 11756 6188 ? S Jun10 0:06 ovn-nbctl --db tcp:10.10.10.234:6641,tcp:10.10.10.235:6641,tcp:10.10.10.236:6641 --wait=sb ha-chassis-group-add lxd-net12
root 45949 0.0 0.0 6432 720 pts/1 S+ 14:56 0:00 grep --color=auto ovn-nbctl

Is there any solution for this?

So this means ovn-nbctl is hanging while trying to connect to the OVN northbound or southbound database services.

Double check the connection string used in LXD's network.ovn.northbound_connection setting, check that the OVN databases are listening on those ports, and check that there's no firewall blocking those connections.
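
One way to carry out those checks is sketched below. It splits the connection string into endpoints and attempts a plain TCP connect to each one. The function names are hypothetical helpers, not part of LXD or OVN, and the sketch assumes bash and coreutils `timeout` are available and that the entries are `tcp:` or `ssl:` with IPv4 addresses or hostnames.

```shell
# Split an OVN connection string such as
# "tcp:10.10.10.234:6641,tcp:10.10.10.235:6641" into "host port" lines.
parse_ovn_endpoints() {
    local conn="$1" entry rest
    local IFS=','
    for entry in $conn; do
        case "$entry" in
            tcp:*|ssl:*)
                rest="${entry#*:}"               # drop the "tcp:" / "ssl:" prefix
                echo "${rest%:*} ${rest##*:}"    # host, then port
                ;;
        esac
    done
}

# Try a TCP connect to every endpoint, giving up after 3 seconds each.
# Uses bash's /dev/tcp pseudo-device via an explicit bash -c.
check_ovn_endpoints() {
    local host port
    parse_ovn_endpoints "$1" | while read -r host port; do
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
        fi
    done
}
```

It could be run as `check_ovn_endpoints "$(lxc config get network.ovn.northbound_connection)"` on each cluster member. A FAIL line only shows that the TCP connect did not succeed; it does not say whether the cause is a firewall or a database that is not listening.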

You should be able to manually test that it's working using the show command, which just connects and pulls any existing configuration (which may be empty):

ovn-nbctl --db tcp:10.10.10.234:6641,tcp:10.10.10.235:6641,tcp:10.10.10.236:6641 show
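
Because a client given several remotes can succeed through any one of them, it may be worth probing each endpoint on its own, with a deadline so that a hang is reported rather than waited on. The wrapper below is a minimal sketch using coreutils `timeout`; `probe` is a hypothetical helper name.

```shell
# Run a command with a deadline and report whether it hit the deadline.
# timeout(1) exits with status 124 when the deadline is reached.
probe() {
    local secs="$1"
    shift
    timeout "$secs" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${secs}s: $*" >&2
    fi
    return "$rc"
}
```

For example, `probe 10 ovn-nbctl --db tcp:10.10.10.235:6641 show` would return 124 if that single endpoint never answers.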

I ran the command ovn-nbctl --db tcp:10.10.10.234:6641,tcp:10.10.10.235:6641,tcp:10.10.10.236:6641 show
and got an empty result.

I don't see port 6641 listening on 10.10.10.235 and 10.10.10.236.
OVN there appears to be running in unix socket mode:
root@c001-lxd-03:~# ps ax |grep ovn
5569 ? S<sl 0:00 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/ovn/ovn-controller.log --pidfile=/var/run/ovn/ovn-controller.pid --detach
5842 pts/0 S+ 0:00 grep --color=auto ovn
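
The process list shows which daemons run but not which ports are open. A small sketch for checking a listening TCP port from `ss -tln` output follows; `listening_on` is a hypothetical helper, and it assumes the usual `ss` column layout with the state in column 1 and the local address:port in column 4.

```shell
# Read `ss -tln` output on stdin; exit 0 if some socket listens on the port.
listening_on() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

Usage on each host might look like `ss -tln | listening_on 6641 && echo "northbound port open"`, and likewise for 6642.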

So I changed ovn-remote with this command:

ovs-vsctl set open_vswitch . external_ids:ovn-encap-type=geneve external_ids:ovn-remote="unix:/var/run/ovn/db.sock" external_ids:ovn-encap-ip=10.10.10.236

But it still errors and hangs.
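
For context, external_ids:ovn-remote is the address ovn-controller uses to reach the southbound database, and the ovn-northd command line in the first post shows that database on TCP port 6642. The sketch below derives a southbound connection string from the northbound one by swapping the default ports. `nb_to_sb` is a hypothetical helper, and it assumes the default 6641/6642 ports and that the southbound database really is reachable over TCP on those hosts, which this thread does not confirm.

```shell
# Turn a northbound connection string into the matching southbound one by
# swapping the default OVN ports (6641 -> 6642).
nb_to_sb() {
    printf '%s\n' "$1" | sed -e 's/:6641,/:6642,/g' -e 's/:6641$/:6642/'
}
```

The result could then be passed as `external_ids:ovn-remote="$(nb_to_sb "$nb")"` in the ovs-vsctl command, where `nb` holds the northbound string. Whether that resolves the hang here is untested.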