INCUS / OVN / IC problems with SSL

Hi, I finally managed to get a reproducible deployment for my cluster, including OVN and an IC running over a TINC trunk, and it seemed to work quite well. (So: a cluster behind a NAT firewall [homelab] connected via IC to a standalone Incus instance running as a gateway on Digital Ocean.)

The only thing left to do seemed to be adding SSL … none of the Incus examples seem to cover this in much detail, but I think I managed to cobble together all the required settings.

After much experimenting, I finally now have my OVN and OVN-IC running, all the peers are up, but I’ve come unstuck somewhere.

ovn-*ctl and ovn-ic-*ctl now all hang in exactly the same way, despite the Raft peering all looking happy. For example:

ovn-ic-nbctl --db=ssl:192.168.234.10:6647 -p/etc/ovn/key.pem -c/etc/ovn/cert.pem -C/etc/ovn/ca.pem show

It hangs, and all I see in /var/log/ovn/ovsdb-server-ic-nb.log is:

raft|INFO|ssl:192.168.234.10:47684: ovsdb error: expecting notify RPC but received request

So I know the fundamental setup works fine without SSL.
The Raft side is all happy with SSL.
The certificates and port number must be correct, because otherwise errors would be logged.
… it just hangs.
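
(For what it’s worth, the check I’ve been using to separate a TLS failure from an OVSDB-level problem is to hit the port directly with openssl; a minimal sketch using the same paths and port as above:)

# If the handshake completes and ends with "Verify return code: 0 (ok)",
# the certificates and the listener are fine and the problem is above the
# TLS layer.
openssl s_client -connect 192.168.234.10:6647 \
    -cert /etc/ovn/cert.pem \
    -key /etc/ovn/key.pem \
    -CAfile /etc/ovn/ca.pem </dev/null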

  * Cluster status for core (north)     : - ✔          :core: ovs-appctl -t /run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
  * Cluster status for core (south)     : - ✔          :core: ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
  * Cluster status for core (ic-north)  : - ✔          :core: ovs-appctl -t /run/ovn/ovn_ic_nb_db.ctl cluster/status OVN_IC_Northbound
  * Cluster status for core (ic-south)  : - ✔          :core: ovs-appctl -t /run/ovn/ovn_ic_sb_db.ctl cluster/status OVN_IC_Southbound

                                  Defined Status for Zone AZ_LOCAL                                   
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Node Name ┃ North              ┃ South                  ┃ IC North           ┃ IC South           ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ core      │ b069 LEADER        │ 2aef LEADER            │ 14d5 LEADER        │ cdd0 LEADER        │
│ grok      │ 2d71 Member 127 ms │ 6f76 Member 307 ms     │ 138d Member 281 ms │ 5cc0 Member 239 ms │
│ p400      │ cefe Member 126 ms │ e3f4 Member 4713396 ms │ 0940 Member 280 ms │ 96fe Member 238 ms │
└───────────┴────────────────────┴────────────────────────┴────────────────────┴────────────────────┘

Does anyone have any ideas as to what could cause ovn-nbctl et al. to just hang with an “expecting notify” error … or is there anything obviously wrong with the config? I’m at a bit of a loss as to what to try next …

(This initially showed up as “incus network list” hanging, but tracing it back, the underlying SSL access to the database seems to be what causes the Incus hang.)

/etc/default/ovn-central

OVN_CTL_OPTS=" \
    --db-nb-cluster-local-addr=192.168.2.1 \
    --db-sb-cluster-local-addr=192.168.2.1 \
    --db-nb-cluster-local-proto=ssl \
    --db-sb-cluster-local-proto=ssl \
    --db-nb-cluster-remote-proto=ssl \
    --db-sb-cluster-remote-proto=ssl \
    --ovn-northd-nb-db=ssl:192.168.2.1:6643,ssl:192.168.2.3:6643,ssl:192.168.2.4:6643 \
    --ovn-northd-sb-db=ssl:192.168.2.1:6644,ssl:192.168.2.3:6644,ssl:192.168.2.4:6644 \
    --ovn-controller-ssl-key=/etc/ovn/key.pem \
    --ovn-controller-ssl-cert=/etc/ovn/cert.pem \
    --ovn-controller-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-northd-ssl-key=/etc/ovn/key.pem \
    --ovn-northd-ssl-cert=/etc/ovn/cert.pem \
    --ovn-northd-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-nb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-nb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-nb-db-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-sb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-sb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-sb-db-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-ic-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-ic-nb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-nb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-nb-db-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-ic-sb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-sb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-sb-db-ssl-ca-cert=/etc/ovn/ca.pem \
"

/etc/ovn/ovn-ic-db-params.conf

--ovnnb-db=ssl:192.168.2.1:6643,ssl:192.168.2.3:6643,ssl:192.168.2.4:6643
--ovnsb-db=ssl:192.168.2.1:6644,ssl:192.168.2.3:6644,ssl:192.168.2.4:6644
--ic-nb-db=ssl:192.168.234.10:6647,ssl:192.168.234.12:6647,ssl:192.168.234.14:6647
--ic-sb-db=ssl:192.168.234.10:6648,ssl:192.168.234.12:6648,ssl:192.168.234.14:6648
--private-key=/etc/ovn/key.pem
--certificate=/etc/ovn/cert.pem
--ca-cert=/etc/ovn/ca.pem

/etc/default/ovn-ic

OVN_CTL_OPTS=" \
    --db-ic-nb-addr=192.168.234.10 \
    --db-ic-sb-addr=192.168.234.10 \
    --db-ic-nb-cluster-local-addr=192.168.234.10 \
    --db-ic-sb-cluster-local-addr=192.168.234.10 \
    --db-ic-nb-cluster-local-proto=ssl \
    --db-ic-sb-cluster-local-proto=ssl \
    --ovn-northd-nb-db=ssl:192.168.2.1:6643,ssl:192.168.2.3:6643,ssl:192.168.2.4:6643 \
    --ovn-northd-sb-db=ssl:192.168.2.1:6644,ssl:192.168.2.3:6644,ssl:192.168.2.4:6644 \
    --db-ic-nb-cluster-remote-proto=ssl \
    --db-ic-sb-cluster-remote-proto=ssl \
    --ovn-ic-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-ic-nb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-nb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-nb-db-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-ic-sb-db-ssl-key=/etc/ovn/key.pem \
    --ovn-ic-sb-db-ssl-cert=/etc/ovn/cert.pem \
    --ovn-ic-sb-db-ssl-ca-cert=/etc/ovn/ca.pem \
    --ovn-northd-ssl-key=/etc/ovn/key.pem \
    --ovn-northd-ssl-cert=/etc/ovn/cert.pem \
    --ovn-northd-ssl-ca-cert=/etc/ovn/ca.pem \
"

Just to clarify a little: it’s specifically the “list” operation that’s hanging (or incus network list).
I can actually create and show an UPLINK:

root@grok:~# incus version
Client version: 6.12
Server version: 6.12
root@grok:~# incus network show UPLINK
config:
  ipv4.gateway: 192.168.1.254/24
  ipv4.ovn.ranges: 192.168.1.16-192.168.1.63
  volatile.last_state.created: "false"
description: ""
name: UPLINK
type: physical
used_by: []
managed: true
status: Created
locations:
- core
- p400
- grok
project: default

It would be good if you could add some more details about your hosts and network locations. Without knowing this it is hard to give any kind of answer.

From reviewing the provided config details it seems like you have 3 nodes in your 192.168.2.x network (I assume this is local?) and 3 other nodes in your 192.168.234.x network?
If you only have 3 nodes in total and they can communicate happily over your VPN, there is no need to set up IC at all. IC is only required if you want to connect 3 single nodes or 3 clusters at different locations.

Anyway, to fully understand your question, please provide some network and location details, including where each host is located.

Hi, thanks, I will try to expand. My target is a local three-node cluster, connected via IC to a remote one-node cluster. The remote connection is over a TINC VPN which runs on the 192.168.234.x range. I have this configured and working nicely without SSL. I’m using my own Python-based program to produce a consistent install; it performs in a similar way to incus-deploy, it’s just raw Python rather than Ansible. (I use it to do other stuff like Raft monitoring, ping checks etc., based on the deployment model.)

In this instance my issue doesn’t involve the IC link; adding SSL is breaking my local three nodes, so I guess the IC could be ignored for now.

So what I have is three Raspberry Pi 5s running Raspberry Pi OS.
Each one is using a custom kernel in order to get the “geneve” kernel module.
Each machine gets its address via DHCP from a local NAT router on eth0.
Each machine has a USB Ethernet port on eth1 (static address).
The cluster is configured on 192.168.2.x, which is set up statically on BR1, which sits on eth1.
Internet connectivity comes from BR0, which sits on eth0, which gets 192.168.1.x from a DHCP NAT router.
The cluster IC, which should be unused in this context, sits on 192.168.234.x [br0/eth0].

So when running, I have a 3-node Raft for core, grok and p400, with northbound and southbound all running on 192.168.2.x. I then have another Raft for the IC databases running on 192.168.234.x. Both seem to work under SSL, but both fail in the same way: the “show” operation hangs, i.e. ovn-nbctl show and ovn-ic-nbctl show. If I call ovn-nbctl directly against the database it works (no output, because it’s empty), but if I call it with a list of northbound IPs, it hangs and the “expecting notify” message appears in the logs (which in turn causes “incus network list” to hang).
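
(To make the two cases concrete, this is roughly the difference; the unix socket path is just the default on my install, so treat it as an assumption:)

# Works: talks to the local NB database over its unix socket.
ovn-nbctl --db=unix:/var/run/ovn/ovnnb_db.sock show

# Hangs: talks to the cluster over SSL, and the server side logs
# "expecting notify RPC but received request".
ovn-nbctl --db=ssl:192.168.2.1:6643,ssl:192.168.2.3:6643,ssl:192.168.2.4:6643 \
    -p /etc/ovn/key.pem -c /etc/ovn/cert.pem -C /etc/ovn/ca.pem show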

It feels like I’m misunderstanding something about how the keys should be used, but when I deliberately try wrong or mismatched keys it does log errors as I would expect. What’s killing me here is the lack of any real “error”, or at least an error that means something to me.

After posting this I went back to the SSL docs and rewrote the automatic key deployment from scratch … and ended up with exactly the same result / problem.

If it helps, my entire setup is driven by this:

nodes = NodeCollection ()
clusters = ClusterCollection ()
interconnects = InterconnectCollection ()

nodes.add ('grok',Node (name='grok',address='192.168.2.4',icaddress='192.168.234.14'))
nodes.add ('core',Node (name='core',address='192.168.2.1',icaddress='192.168.234.10'))
nodes.add ('p400',Node (name='p400',address='192.168.2.3',icaddress='192.168.234.12'))
nodes.add ('ovn', Node (name='ovn' ,address='10.131.0.4', icaddress='192.168.234.2', core_ic=False,listen='127.0.0.1',bridge='br1'))
clusters.add ('az_local',Cluster (
        zone='az_local',
        networks=[{'name':'public', 'cidr':'10.4.0.1/22'}],
        range='192.168.1.16-192.168.1.63',
        gateway='192.168.1.254/24',
        nodes={'core': nodes.by_name('core'), 'grok': nodes.by_name('grok'), 'p400': nodes.by_name('p400')}
    ))
clusters.add ('az_do', Cluster (
        zone='az_do',
        networks=[{'name':'public','cidr':'10.101.0.1/24'}],
        range='10.131.0.240-10.131.0.247',
        gateway='10.131.0.4/24',
        nodes={'ovn': nodes.by_name('ovn')}
    ))
interconnects.add ('live', [clusters.by_name('az_local'), clusters.by_name('az_do')])

My hope is to be able to produce a Python program that can programmatically deploy an entire cluster, update it, then add and remove interconnects on the fly. I’m sort of there, to an extent, but I’m not sure how useful it’s going to be without SSL.

$ ./ccmd.py -nic

                            Defined Cluster Nodes                             
┏━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━┓
┃ Name ┃     Address ┃     IC Address ┃ Bridge ┃ CoreDB ┃ CoreIC ┃    Listen ┃
┡━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━┩
│ grok │ 192.168.2.4 │ 192.168.234.14 │ br0    │ True   │ True   │           │
│ core │ 192.168.2.1 │ 192.168.234.10 │ br0    │ True   │ True   │           │
│ p400 │ 192.168.2.3 │ 192.168.234.12 │ br0    │ True   │ True   │           │
│ ovn  │  10.131.0.4 │  192.168.234.2 │ br1    │ True   │ False  │ 127.0.0.1 │
└──────┴─────────────┴────────────────┴────────┴────────┴────────┴───────────┘

                                   Defined Clusters                                   
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Zone     ┃                     Range ┃          Gateway ┃ Net name ┃      Net CIDR ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ az_local │ 192.168.1.16-192.168.1.63 │ 192.168.1.254/24 │ public   │   10.4.0.1/22 │
│ az_do    │ 10.131.0.240-10.131.0.247 │    10.131.0.4/24 │ public   │ 10.101.0.1/24 │
└──────────┴───────────────────────────┴──────────────────┴──────────┴───────────────┘

        Defined InterConnects        
┏━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Label ┃ Cluster # 1 ┃ Cluster # 2 ┃
┡━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ live  │ az_local    │ az_do       │
└───────┴─────────────┴─────────────┘

 $ ./ccmd.py -P live

* Checking connectivity live
  * ping from core to grok              : - ✔  0.23 ms :core: ping -c1 192.168.2.4
  * ping from core to p400              : - ✔  0.49 ms :core: ping -c1 192.168.2.3
  * ping from grok to core              : - ✔  0.26 ms :grok: ping -c1 192.168.2.1
  * ping from grok to p400              : - ✔  0.33 ms :grok: ping -c1 192.168.2.3
  * ping from p400 to core              : - ✔  0.29 ms :p400: ping -c1 192.168.2.1
  * ping from p400 to grok              : - ✔  0.28 ms :p400: ping -c1 192.168.2.4
  * ping from core to grok              : - ✔  0.49 ms :core: ping -c1 192.168.234.14
  * ping from core to p400              : - ✔  0.55 ms :core: ping -c1 192.168.234.12
  * ping from core to ovn               : - ✔ 11.10 ms :core: ping -c1 192.168.234.2
  * ping from grok to core              : - ✔  0.35 ms :grok: ping -c1 192.168.234.10
  * ping from grok to p400              : - ✔  0.53 ms :grok: ping -c1 192.168.234.12
  * ping from grok to ovn               : - ✔ 10.30 ms :grok: ping -c1 192.168.234.2
  * ping from p400 to core              : - ✔  0.54 ms :p400: ping -c1 192.168.234.10
  * ping from p400 to grok              : - ✔  0.55 ms :p400: ping -c1 192.168.234.14
  * ping from p400 to ovn               : - ✔  8.20 ms :p400: ping -c1 192.168.234.2
  * ping from ovn to core               : - ✔  9.39 ms :ovn: ping -c1 192.168.234.10
  * ping from ovn to grok               : - ✔  9.76 ms :ovn: ping -c1 192.168.234.14
  * ping from ovn to p400               : - ✔  8.37 ms :ovn: ping -c1 192.168.234.12
* Ping test OK

A little bit more detail:

# ovn-nbctl --no-leader-only --db=ssl:192.168.2.1:6643 show
2025-05-02T09:48:59Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2025-05-02T09:48:59Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2025-05-02T09:48:59Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL
ovn-nbctl: ssl:192.168.2.1:6643: database connection failed (Protocol not available)

As expected:

  • It IS expecting SSL
  • SSL is set up on port 6643
  • It is trying to read SSL certs

If I add certs to the call:

# ovn-nbctl --no-leader-only --db=ssl:192.168.2.1:6643 -p /etc/openvswitch/controller-privkey.pem -c /etc/openvswitch/controller-cert.pem -C /var/lib/openvswitch/pki/controllerca/cacert.pem show
^C2025-05-02T09:48:16Z|00001|fatal_signal|WARN|terminating with signal 2 (Interrupt)
  • Hangs (subject to CTRL+C)
  • Appears to be reading and approving certs
  • Logs:
2025-05-02T09:46:57.137Z|00015|raft|INFO|ssl:192.168.2.1:50644: ovsdb error: expecting notify RPC but received request

Control+C then logs:

2025-05-02T09:48:16.148Z|00020|stream_ssl|WARN|SSL_read: error:0A000126:SSL routines::unexpected eof while reading
2025-05-02T09:48:16.148Z|00021|jsonrpc|WARN|ssl:192.168.2.1:56158: receive error: Input/output error
2025-05-02T09:48:16.148Z|00022|reconnect|WARN|ssl:192.168.2.1:56158: connection dropped (Input/output error)

If I call “ovn-nbctl” with -v I get:

# ovn-nbctl -v --no-leader-only --db=ssl:192.168.2.1:6643 -p /etc/openvswitch/controller-privkey.pem -c /etc/openvswitch/controller-cert.pem -C /var/lib/openvswitch/pki/controllerca/cacert.pem show
2025-05-02T09:53:24Z|00001|reconnect|DBG|ssl:192.168.2.1:6643: entering BACKOFF
2025-05-02T09:53:24Z|00002|ovn_dbctl|DBG|Called as ovn-nbctl -v --no-leader-only --db=ssl:192.168.2.1:6643 -p /etc/openvswitch/controller-privkey.pem -c /etc/openvswitch/controller-cert.pem -C /var/lib/openvswitch/pki/controllerca/cacert.pem show
2025-05-02T09:53:24Z|00003|reconnect|INFO|ssl:192.168.2.1:6643: connecting...
2025-05-02T09:53:24Z|00004|reconnect|DBG|ssl:192.168.2.1:6643: entering CONNECTING
2025-05-02T09:53:24Z|00005|ovsdb_cs|DBG|ssl:192.168.2.1:6643: SERVER_SCHEMA_REQUESTED -> SERVER_SCHEMA_REQUESTED at lib/ovsdb-cs.c:423
2025-05-02T09:53:24Z|00006|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00007|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 handshake: client_hello (380 bytes)
2025-05-02T09:53:24Z|00008|poll_loop|DBG|wakeup due to [POLLIN] on fd 4 (192.168.2.1:43796<->192.168.2.1:6643) at lib/stream-ssl.c:830
2025-05-02T09:53:24Z|00009|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00010|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: server_hello (122 bytes)
2025-05-02T09:53:24Z|00011|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00012|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00013|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00014|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: <unknown> (6 bytes)
2025-05-02T09:53:24Z|00015|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00016|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00017|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: certificate_request (193 bytes)
2025-05-02T09:53:24Z|00018|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00019|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00020|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: certificate (1994 bytes)
2025-05-02T09:53:24Z|00021|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00022|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00023|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: certificate_verify (264 bytes)
2025-05-02T09:53:24Z|00024|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00025|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00026|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: finished (52 bytes)
2025-05-02T09:53:24Z|00027|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00028|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 change_cipher_spec (1 bytes)
2025-05-02T09:53:24Z|00029|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00030|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00031|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 handshake: certificate (1994 bytes)
2025-05-02T09:53:24Z|00032|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00033|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00034|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 handshake: certificate_verify (264 bytes)
2025-05-02T09:53:24Z|00035|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00036|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00037|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 handshake: finished (52 bytes)
2025-05-02T09:53:24Z|00038|reconnect|INFO|ssl:192.168.2.1:6643: connected
2025-05-02T09:53:24Z|00039|reconnect|DBG|ssl:192.168.2.1:6643: entering ACTIVE
2025-05-02T09:53:24Z|00040|jsonrpc|DBG|ssl:192.168.2.1:6643: send request, method="get_schema", params=["_Server"], id=1
2025-05-02T09:53:24Z|00041|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00042|stream_ssl|DBG|client0-->ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00043|ovsdb_cs|DBG|ssl:192.168.2.1:6643: SERVER_SCHEMA_REQUESTED -> SERVER_SCHEMA_REQUESTED at lib/ovsdb-cs.c:423
2025-05-02T09:53:24Z|00044|poll_loop|DBG|wakeup due to [POLLIN] on fd 4 (192.168.2.1:43796<->192.168.2.1:6643) at lib/stream-ssl.c:842
2025-05-02T09:53:24Z|00045|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00046|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00047|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: <unknown> (1241 bytes)
2025-05-02T09:53:24Z|00048|poll_loop|DBG|wakeup due to [POLLIN] on fd 4 (192.168.2.1:43796<->192.168.2.1:6643) at lib/stream-ssl.c:842
2025-05-02T09:53:24Z|00049|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 256 (5 bytes)
2025-05-02T09:53:24Z|00050|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 type 257 (1 bytes)
2025-05-02T09:53:24Z|00051|stream_ssl|DBG|client0<--ssl:192.168.2.1:6643 handshake: <unknown> (1241 bytes)
... etc ...

Ok, so I’m getting very close. My issues seem to have revolved around the “insecure” parameter and a misunderstanding of how all the ports fit together. Notably, for SSL, making use of “set-connection” and “set-ssl” seems critical, and environmental overrides seem to break things very easily.
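
(For anyone following along, this is roughly the shape of what I ended up running; the same pattern applies to ovn-sbctl and ovn-ic-sbctl, and the ports match the list further down:)

# Tell the NB database which key/cert/CA to present to clients ...
ovn-nbctl set-ssl /etc/ovn/key.pem /etc/ovn/cert.pem /etc/ovn/ca.pem

# ... and open the client-facing SSL listener (separate from the raft port).
ovn-nbctl set-connection pssl:6641

# Same pattern for the IC northbound database.
ovn-ic-nbctl set-ssl /etc/ovn/key.pem /etc/ovn/cert.pem /etc/ovn/ca.pem
ovn-ic-nbctl set-connection pssl:6645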

So far as I can see it’s all working down to the very last command, and the error is clear; it’s just not making any sense at this point.

This is my error message:

# incus network peer create public ic-live-public-core ic-live-public-core --type=remote
Error: Failed creating peer: OVN IC Northbound database is configured to use SSL but no client certificate was found

BUT the message seems to be coming from OVN, and:

# ovn-ic-nbctl --db ssl:192.168.2.1:6645,ssl:192.168.2.3:6645,ssl:192.168.2.4:6645 get-ssl
Private key: /etc/openvswitch/sc-privkey.pem
Certificate: /etc/openvswitch/sc-cert.pem
CA Certificate: /var/lib/openvswitch/pki/switchca/cacert.pem
Bootstrap: false

… and every preceding command seemed to work Ok …

# incus network list
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
|      NAME      |   TYPE   | MANAGED |    IPV4     |           IPV6            | DESCRIPTION | USED BY |  STATE  |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| UPLINK         | physical | YES     |             |                           |             | 1       | CREATED |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| br0            | bridge   | NO      |             |                           |             | 2       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| br1            | bridge   | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| br-int         | bridge   | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| eth0           | physical | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| eth1           | physical | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| lo             | loopback | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| ovn            | unknown  | NO      |             |                           |             | 0       |         |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+
| public         | ovn      | YES     | 10.4.0.1/22 | fd42:590e:c2ee:fc5c::1/64 |             | 0       | CREATED |
+----------------+----------+---------+-------------+---------------------------+-------------+---------+---------+

# incus network integration list
+---------------------+-------------+------+---------+
|        NAME         | DESCRIPTION | TYPE | USED BY |
+---------------------+-------------+------+---------+
| ic-live-public-core |             | ovn  | 0       |
+---------------------+-------------+------+---------+
| ic-live-public-grok |             | ovn  | 0       |
+---------------------+-------------+------+---------+
| ic-live-public-p400 |             | ovn  | 0       |
+---------------------+-------------+------+---------+

# incus config get network.ovn.northbound_connection
ssl:192.168.2.1:6641,ssl:192.168.2.3:6641,ssl:192.168.2.4:6641

Any ideas?
It kinda feels like “peer create” isn’t expecting SSL and isn’t dealing with the certs?
(Please excuse the question, but the examples I’ve seen only seem to use “tcp”; does anyone have this working with full SSL?)

Ports
6641 north
6642 south
6643 north raft
6644 south raft
6645 north ic
6646 south ic
6647 north ic raft
6648 south ic raft
All running, all SSL.
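
(A quick loop I use to convince myself each client-facing port really is serving OVSDB over SSL; purely illustrative, and it assumes ovsdb-client is installed:)

# list-dbs should report the database behind each port
# (OVN_Northbound, OVN_Southbound, OVN_IC_Northbound, OVN_IC_Southbound).
for target in ssl:192.168.2.1:6641 ssl:192.168.2.1:6642 \
              ssl:192.168.234.10:6645 ssl:192.168.234.10:6646; do
    echo "== ${target} =="
    ovsdb-client --private-key=/etc/ovn/key.pem \
                 --certificate=/etc/ovn/cert.pem \
                 --ca-cert=/etc/ovn/ca.pem \
                 list-dbs "${target}"
done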

Everything looks happy and in sync, there are no errors in any of the log files, and “peer create” doesn’t generate any logging …

                  Defined Status for Zone AZ_LOCAL (Northbound)                   
┏━━━━━━┳━━━━━━┳━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Node ┃ Id   ┃ Address     ┃ Port ┃ Status         ┃ Role     ┃ Connect ┃ B/Log ┃
┡━━━━━━╇━━━━━━╇━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ core │ 0063 │ 192.168.2.1 │ 6643 │ cluster member │ leader   │       2 │   0/0 │
│ grok │ 3d3a │ 192.168.2.4 │ 6643 │ cluster member │ follower │       2 │   0/0 │
│ p400 │ 47c4 │ 192.168.2.3 │ 6643 │ cluster member │ follower │       2 │   0/0 │
└──────┴──────┴─────────────┴──────┴────────────────┴──────────┴─────────┴───────┘

                  Defined Status for Zone AZ_LOCAL (Southbound)                   
┏━━━━━━┳━━━━━━┳━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Node ┃ Id   ┃ Address     ┃ Port ┃ Status         ┃ Role     ┃ Connect ┃ B/Log ┃
┡━━━━━━╇━━━━━━╇━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ core │ c152 │ 192.168.2.1 │ 6644 │ cluster member │ follower │       2 │   0/0 │
│ grok │ 5ab2 │ 192.168.2.4 │ 6644 │ cluster member │ leader   │       2 │   0/0 │
│ p400 │ 56b2 │ 192.168.2.3 │ 6644 │ cluster member │ follower │       2 │   0/0 │
└──────┴──────┴─────────────┴──────┴────────────────┴──────────┴─────────┴───────┘

                  Defined Status for Zone AZ_LOCAL (IC Northbound)                   
┏━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Node ┃ Id   ┃ Address        ┃ Port ┃ Status         ┃ Role     ┃ Connect ┃ B/Log ┃
┡━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ core │ e548 │ 192.168.234.10 │ 6647 │ cluster member │ follower │       2 │   0/0 │
│ grok │ 9d90 │ 192.168.234.14 │ 6647 │ cluster member │ leader   │       2 │   0/0 │
│ p400 │ b539 │ 192.168.234.12 │ 6647 │ cluster member │ follower │       2 │   0/0 │
└──────┴──────┴────────────────┴──────┴────────────────┴──────────┴─────────┴───────┘

                  Defined Status for Zone AZ_LOCAL (IC Southbound)                   
┏━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Node ┃ Id   ┃ Address        ┃ Port ┃ Status         ┃ Role     ┃ Connect ┃ B/Log ┃
┡━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ core │ 6fe1 │ 192.168.234.10 │ 6648 │ cluster member │ leader   │       2 │   0/0 │
│ grok │ 4286 │ 192.168.234.14 │ 6648 │ cluster member │ follower │       2 │   0/0 │
│ p400 │ 689e │ 192.168.234.12 │ 6648 │ cluster member │ follower │       2 │   0/0 │
└──────┴──────┴────────────────┴──────┴────────────────┴──────────┴─────────┴───────┘

Looking at the Incus source:

	// Get ICNB.
	icnb, err := networkOVN.NewICNB(integration.Config["ovn.northbound_connection"], integration.Config["ovn.ca_cert"], integration.Config["ovn.client_cert"], integration.Config["ovn.client_key"])
	if err != nil {
		return err
	}

And my config:

# incus config show|grep ".ovn."
  network.ovn.ca_cert: "Certificate:\n    Data:\n        Version: 3 (0x2)\n        Serial
  network.ovn.client_cert: "Certificate:\n    Data:\n        Version: 3 (0x2)\n        Serial
  network.ovn.client_key: |-
  network.ovn.northbound_connection: ssl:192.168.2.1:6641,ssl:192.168.2.3:6641,ssl:192.168.2.4:6641

It would only appear to generate that message if it can’t find “ca_cert”, “client_cert” or “client_key” in the config, but in the config they appear to be present … what am I missing?

ca_cert, client_cert and client_key should be X.509 PEM-encoded certificates and keys, so they should look roughly the same as ~/.config/incus/client.crt and ~/.config/incus/client.key.

What you’re showing above definitely isn’t the correct format.
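
A quick way to check (just an illustrative sketch, using the key names from your output):

# A correctly formatted value starts with a bare PEM header:
incus config get network.ovn.client_cert | head -n1
# expected: -----BEGIN CERTIFICATE-----

# And openssl should be able to parse the stored values directly
# (these exit non-zero if the content isn't valid PEM).
incus config get network.ovn.client_cert | openssl x509 -noout -subject
incus config get network.ovn.client_key | openssl pkey -noout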

Ok, many thanks, that sounds like an excellent lead. I was following the OVN PKI instructions and using the .pem files generated; I will go back and see where I went wrong … :slight_smile:

Ok, so on reflection, since it was apparently loading (and using) the keys in that unencoded PEM format, maybe it was happy with the format after all … however, I’ve now x509-encoded the CA and CERT files (the private key was Ok), reloaded and redeployed, and I’m getting the same error, but this is how the config looks now:

# incus config show|grep ".ovn."
  network.ovn.ca_cert: |-
  network.ovn.client_cert: |-
  network.ovn.client_key: |-
  network.ovn.northbound_connection: ssl:192.168.2.1:6641,ssl:192.168.2.3:6641,ssl:192.168.2.4:6641

Without the grep it shows the keys in the encoded format, as per the Incus keys you pointed me at.
Looking back at the code, it would appear the error occurs because it’s failing (in this instance) to load the certificate, or at least it’s coming back with an empty certificate … is there any way to get Incus to validate or confirm it’s happy with the certificates it’s been given, just to rule the formatting in or out?

To get from the PEM (output from the OVN PKI routines) to what I’m using, all I did was:

openssl x509 -in cacert.pem -o cacert.crt
incus config set network.ovn.ca_cert="$(cat cacert.crt)"

Does this look Ok?

I’m using certificates generated as per the documentation here:

https://docs.openvswitch.org/en/latest/howto/ssl/

Is there anything I need to do (other than “openssl x509”) to convert certificates generated in this way into the format that Incus wants?
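
(For context, this is the sort of conversion and sanity check I’m doing, using the paths from that howto; purely illustrative:)

# Strip the human-readable text dump so only the PEM block remains
# (the files generated by ovs-pki contain both).
openssl x509 -in /var/lib/openvswitch/pki/switchca/cacert.pem -out cacert.crt

# Double-check the cert and private key belong together by comparing
# their public keys.
openssl x509 -in /etc/openvswitch/sc-cert.pem -noout -pubkey | sha256sum
openssl pkey -in /etc/openvswitch/sc-privkey.pem -pubout | sha256sum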

Not sure what to make of this.

# incus network peer create public ic-live-public-core ic-live-public-core --type=remote
Error: Failed creating peer: OVN IC Northbound database is configured to use SSL but no client certificate was found

The certificates are correct. I now have only one set of certificates (CA, cert, key) used for everything, automatically deployed and programmatically double-verified, just to avoid any confusion. North and south, plus the IC north and south, are all running SSL, and the Rafts are happy.

Incus appears to have successfully used the keys it’s been given to configure the north and southbound databases, but seems unable to use them to configure the IC northbound.

Any ideas?

Ok, scratch that, found it. I hadn’t realised the keys needed to be set in the config “and” the network integration … on to the next issue … :slight_smile:
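
(For anyone hitting the same thing, the missing half for me was roughly this; the key names come from the Incus snippet quoted above and the integration name from the earlier list, so treat the exact paths and syntax as illustrative:)

# Server-wide settings, which cover the NB/SB connection ...
incus config set network.ovn.ca_cert="$(cat /etc/ovn/ca.pem)"
incus config set network.ovn.client_cert="$(cat /etc/ovn/cert.pem)"
incus config set network.ovn.client_key="$(cat /etc/ovn/key.pem)"

# ... plus the same three keys on the network integration, which is what
# the IC northbound connection actually reads.
incus network integration set ic-live-public-core ovn.ca_cert="$(cat /etc/ovn/ca.pem)"
incus network integration set ic-live-public-core ovn.client_cert="$(cat /etc/ovn/cert.pem)"
incus network integration set ic-live-public-core ovn.client_key="$(cat /etc/ovn/key.pem)"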