I have looked at it briefly, and I haven’t found anything that clearly mismatches what I’m trying to do (apart from it using TLS), but obviously I’m missing something.
Here’s a short description of what I’m doing, if it’s any help:
# incus ls -c n,4,t
+--------+----------------------+-----------------+
| NAME | IPV4 | TYPE |
+--------+----------------------+-----------------+
| incus1 | 172.17.1.10 (enp6s0) | VIRTUAL-MACHINE |
| | 10.14.91.71 (enp5s0) | |
+--------+----------------------+-----------------+
| incus2 | 172.17.1.20 (enp6s0) | VIRTUAL-MACHINE |
| | 10.14.91.98 (enp5s0) | |
+--------+----------------------+-----------------+
| incus3 | 172.17.1.30 (enp6s0) | VIRTUAL-MACHINE |
| | 10.14.91.80 (enp5s0) | |
+--------+----------------------+-----------------+
These are using incusbr0 and incusbr1, where the latter carries the 172.17.1.x addresses on enp6s0 that I’m aiming to use for OVN. That interface has a static IP definition via systemd-networkd, while the 10.14.91.x address comes from DHCP on incusbr0.
So I set up the cluster (communicating over the 10.14.91.x addresses), then fill out /etc/default/ovn-central on all nodes according to the docs. I then start the ovn-central service on all nodes and set the open_vswitch options, including encap-ip, for each node.
I then create the UPLINK network targeting all nodes, followed by setting the ovn.ranges etc. This does not error, so I then set the northbound_connection value for Incus and run incus network create my-ovn --type=ovn.
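In case it helps, the connection strings and the last commands can be sketched roughly like this. It only prints the commands instead of running them. The build_remote helper is made up for illustration, the 6641/6642 ports are the ones that show up in the log, and the geneve encap type is an assumption taken from the docs rather than something I can confirm from my shell history.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
# Node IPs on the OVN interface (enp6s0), taken from the listing above.
NODES="172.17.1.10 172.17.1.20 172.17.1.30"

# build_remote is a hypothetical helper: joins the node IPs into a
# comma-separated list of tcp:<ip>:<port> endpoints.
build_remote() {
    port="$1"
    out=""
    for ip in $NODES; do
        out="${out:+$out,}tcp:$ip:$port"
    done
    printf '%s\n' "$out"
}

NB=$(build_remote 6641)   # northbound DB port, as seen in the log
SB=$(build_remote 6642)   # southbound DB port, as seen in the log

# Per-node Open vSwitch settings; the encap IP is the node's own
# 172.17.1.x address (incus1 shown here). geneve is an assumption.
echo "ovs-vsctl set open_vswitch . external_ids:ovn-remote=$SB external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=172.17.1.10"

# Cluster-wide Incus setting, then the command that hangs.
echo "incus config set network.ovn.northbound_connection $NB"
echo "incus network create my-ovn --type=ovn"
```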
At this point, enp6s0, which previously had a static 172.17.1.x address, loses its IP and goes down, and the Incus command hangs. These are the last 50 syslog lines:
# journalctl --no-pager --since=-20m | tail -n 50
Oct 29 11:13:01 incus1 ovsdb-server[1811]: ovs|00028|reconnect|ERR|tcp:172.17.1.30:6644: no response to inactivity probe after 2 seconds, disconnecting
Oct 29 11:13:01 incus1 ovsdb-server[1811]: ovs|00029|reconnect|INFO|tcp:172.17.1.30:6644: connection dropped
Oct 29 11:13:01 incus1 ovsdb-server[1811]: ovs|00030|reconnect|ERR|tcp:172.17.1.20:50054: no response to inactivity probe after 2 seconds, disconnecting
Oct 29 11:13:01 incus1 ovsdb-server[1811]: ovs|00031|reconnect|ERR|tcp:172.17.1.30:48302: no response to inactivity probe after 2 seconds, disconnecting
Oct 29 11:13:01 incus1 ovsdb-server[1806]: ovs|00029|reconnect|ERR|tcp:172.17.1.20:48970: no response to inactivity probe after 2 seconds, disconnecting
Oct 29 11:13:01 incus1 ovsdb-server[1806]: ovs|00030|reconnect|ERR|tcp:172.17.1.30:43052: no response to inactivity probe after 2 seconds, disconnecting
Oct 29 11:13:01 incus1 ovsdb-server[1806]: ovs|00031|raft|INFO|term 3: 1957 ms timeout expired, starting election
Oct 29 11:13:02 incus1 ovsdb-server[1806]: ovs|00032|reconnect|INFO|tcp:172.17.1.20:6643: connecting...
Oct 29 11:13:02 incus1 ovsdb-server[1806]: ovs|00033|reconnect|INFO|tcp:172.17.1.20:6643: connected
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00032|reconnect|INFO|tcp:172.17.1.20:6644: connecting...
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00033|reconnect|INFO|tcp:172.17.1.20:6644: connected
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00034|raft|INFO|rejecting term 2 < current term 3 received in append_request message from server 69d7
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00035|reconnect|INFO|tcp:172.17.1.30:6644: connecting...
Oct 29 11:13:02 incus1 ovsdb-server[1806]: ovs|00034|reconnect|INFO|tcp:172.17.1.30:6643: connecting...
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00036|reconnect|INFO|tcp:172.17.1.30:6644: connected
Oct 29 11:13:02 incus1 ovsdb-server[1806]: ovs|00035|reconnect|INFO|tcp:172.17.1.30:6643: connected
Oct 29 11:13:02 incus1 ovsdb-server[1806]: ovs|00036|raft|INFO|rejecting term 2 < current term 3 received in append_request message from server 91e9
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00037|raft|INFO|term 4: 1478 ms timeout expired, starting election
Oct 29 11:13:02 incus1 ovsdb-server[1811]: ovs|00038|raft|INFO|rejecting term 2 < current term 4 received in vote_reply message from server 9d42
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00037|reconnect|ERR|tcp:172.17.1.10:59148: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00038|raft|INFO|term 4: 1215 ms timeout expired, starting election
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00039|raft|INFO|rejecting term 2 < current term 4 received in vote_reply message from server 1bf7
Oct 29 11:13:03 incus1 ovsdb-server[1811]: ovs|00039|raft|INFO|rejecting term 3 < current term 4 received in vote_request message from server 9d42
Oct 29 11:13:03 incus1 ovsdb-server[1811]: ovs|00040|raft|INFO|rejecting term 3 < current term 4 received in append_request message from server 9d42
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00040|reconnect|ERR|tcp:172.17.1.30:43422: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:03 incus1 ovsdb-server[1811]: ovs|00041|raft|INFO|term 5: 1007 ms timeout expired, starting election
Oct 29 11:13:03 incus1 ovsdb-server[1811]: ovs|00042|raft|INFO|rejecting term 3 < current term 5 received in vote_reply message from server 69d7
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00041|raft|INFO|rejecting term 3 < current term 4 received in vote_request message from server 1bf7
Oct 29 11:13:03 incus1 ovsdb-server[1806]: ovs|00042|raft|INFO|rejecting term 3 < current term 4 received in append_request message from server 1bf7
Oct 29 11:13:04 incus1 ovsdb-server[1811]: ovs|00043|reconnect|ERR|tcp:172.17.1.10:58580: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:04 incus1 ovsdb-server[1806]: ovs|00043|raft|INFO|term 5: 1098 ms timeout expired, starting election
Oct 29 11:13:04 incus1 ovsdb-server[1806]: ovs|00044|raft|INFO|rejecting term 3 < current term 5 received in vote_reply message from server 91e9
Oct 29 11:13:04 incus1 ovsdb-server[1811]: ovs|00044|raft|INFO|server 9d42 is leader for term 5
Oct 29 11:13:04 incus1 ovsdb-server[1811]: ovs|00045|raft|INFO|rejecting append_request because previous entry 3,26 not in local log (mismatch past end of log)
Oct 29 11:13:05 incus1 ovsdb-server[1811]: ovs|00046|reconnect|ERR|tcp:172.17.1.30:50286: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1811]: ovs|00047|reconnect|ERR|tcp:172.17.1.20:54780: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1806]: ovs|00045|reconnect|ERR|tcp:172.17.1.20:34860: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1806]: ovs|00046|reconnect|ERR|tcp:172.17.1.30:43428: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1811]: ovs|00048|reconnect|ERR|tcp:172.17.1.20:57364: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1811]: ovs|00049|reconnect|ERR|tcp:172.17.1.30:49656: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1811]: ovs|00050|reconnect|ERR|tcp:172.17.1.10:47574: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:05 incus1 ovsdb-server[1806]: ovs|00047|raft|INFO|server 1bf7 is leader for term 5
Oct 29 11:13:05 incus1 ovsdb-server[1806]: ovs|00048|raft|INFO|rejecting append_request because previous entry 3,36 not in local log (mismatch past end of log)
Oct 29 11:13:06 incus1 ovsdb-server[1811]: ovs|00051|reconnect|ERR|tcp:172.17.1.30:58166: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovsdb-server[1806]: ovs|00049|reconnect|ERR|tcp:172.17.1.20:46220: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovsdb-server[1811]: ovs|00052|reconnect|ERR|tcp:172.17.1.20:54774: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovn-northd[1784]: ovs|00046|reconnect|ERR|tcp:172.17.1.10:6641: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovn-northd[1784]: ovs|00048|reconnect|ERR|tcp:172.17.1.10:6642: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovsdb-server[1806]: ovs|00050|reconnect|ERR|tcp:172.17.1.10:59154: no response to inactivity probe after 5 seconds, disconnecting
Oct 29 11:13:07 incus1 ovn-controller[1235]: ovs|00470|reconnect|ERR|tcp:172.17.1.20:6642: no response to inactivity probe after 5 seconds, disconnecting
After 10 minutes, there are no logs following this and the incus command is still hung.
Despite enp6s0 being DOWN and not having an IP at this point, I can still ping those 172.17.1.x addresses, maybe because the traffic reaches them via the 10.14.91.1 gateway on the other interface or something. I’m not sure.
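One way to check that guess would be to ask the kernel which interface it actually picks for those addresses, and whether enp6s0 ended up attached to an OVS bridge. This is a minimal sketch, not something I have output from yet. The route_dev helper is made up for illustration, and the commands are skipped if the tools are not installed.

```shell
#!/bin/sh
# route_dev (hypothetical helper): print the word that follows "dev"
# in `ip route get` output, i.e. the egress interface.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

if command -v ip >/dev/null 2>&1; then
    # Current state and addresses of the OVN interface.
    ip -br addr show dev enp6s0 || echo "enp6s0 not found"
    # Egress interface the kernel picks for another node's OVN address.
    ip route get 172.17.1.20 | route_dev
fi

if command -v ovs-vsctl >/dev/null 2>&1; then
    # Prints the bridge name if enp6s0 is a port on an OVS bridge.
    ovs-vsctl port-to-br enp6s0 || echo "no OVS bridge reported for enp6s0"
fi
```

If the second command prints enp5s0, the pings are going out over the DHCP interface rather than enp6s0.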