OVN Network - No IPs for Instances?

I’ve set up an OVN network based on these directions, and I also followed the YouTube video “OVN and a LXD cluster”.

I’m running this in a cluster of 3 machines (virtualized for testing). Each host has a br0 bridge interface. The directions are very descriptive and I followed them with seemingly no issues. I created the UPLINK network without any problems, and then created the “my-ovn” network, also with no errors.

The incus network show output for my-ovn looks fine to me:

config:
  bridge.mtu: "1442"
  ipv4.address: 10.127.3.1/24
  ipv4.nat: "true"
  network: UPLINK
  volatile.network.ipv4.address: 192.168.122.200
description: ""
name: my-ovn
type: ovn
used_by:
- /1.0/instances/u1
managed: true
status: Created
locations:
- incus-node2
- incus-node3
- incus-node1

The only deviation from the directions was the IPv6 settings. IPv6 is currently disabled on all 3 hosts, so I omitted those options when creating the UPLINK:

incus network create UPLINK --type=physical \
   ipv4.ovn.ranges=192.168.122.200-192.168.122.254 \
   ipv4.gateway=192.168.122.1/24 \
   dns.nameservers=192.168.122.1
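
(For context, since this is a cluster the directions also have a per-member step that points the physical network at its parent interface, which on my hosts is br0. Those steps looked roughly like this, run before the final create above; the node names are just my test hosts:)

incus network create UPLINK --type=physical parent=br0 --target=incus-node1
incus network create UPLINK --type=physical parent=br0 --target=incus-node2
incus network create UPLINK --type=physical parent=br0 --target=incus-node3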

I’m not seeing errors in the northbound or southbound OVN logs. When I create an instance attached to the “my-ovn” network, it never gets an IP address. Any ideas what I may be missing?
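
(For reference, the instance is created roughly like this; the image name is just what I happen to use for testing:)

incus launch images:debian/12 u1 --network my-ovn
incus list u1        # the IPV4 column never gets populated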

Thanks!

EDIT: One quick addition - if I tail ovn-controller.log on the first host, I see this when I create a new instance on the my-ovn network:

2024-02-09T18:25:09.168Z|00096|binding|INFO|Claiming lport incus-net8-instance-f0111668-b866-4a6b-b29c-ad158f755f7e-eth0 for this chassis.
2024-02-09T18:25:09.169Z|00097|binding|INFO|incus-net8-instance-f0111668-b866-4a6b-b29c-ad158f755f7e-eth0: Claiming 00:16:3e:20:33:97 dynamic
2024-02-09T18:25:09.171Z|00098|binding|INFO|Setting lport incus-net8-instance-f0111668-b866-4a6b-b29c-ad158f755f7e-eth0 ovn-installed in OVS
2024-02-09T18:25:09.180Z|00099|binding|INFO|Setting lport incus-net8-instance-f0111668-b866-4a6b-b29c-ad158f755f7e-eth0 up in Southbound

Checking ovn-northd.log, it even shows this when I create the new instance:

2024-02-09T18:25:09.003Z|00079|northd|INFO|Assigned dynamic IPv4 address '10.127.3.2' to port 'incus-net8-instance-f0111668-b866-4a6b-b29c-ad158f755f7e-eth0'
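
(Side note: the dynamic address should also be visible by querying the OVN northbound DB directly with the stock OVN client, something like the following - this isn’t from the Incus directions, and on a clustered setup ovn-nbctl may need the right --db= option to reach the NB database:)

ovn-nbctl list logical_switch_port | grep -E 'name|dynamic_addresses'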

Okay, so there are a few things to check:

  • ovs-vsctl show on each host should show two Geneve tunnels headed to the other two hosts, and those tunnels should be in a working state
  • Try pinging 192.168.122.200 to make sure that the virtual router at least behaves correctly (a quick sketch of both checks follows this list)
  • Make sure you don’t have Docker or anything else that may interfere with firewalling on your hosts
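
Something along these lines covers the first two checks (the tunnel port name is a placeholder, use whatever ovs-vsctl show reports on your hosts):

ovs-vsctl show                          # each host should list two ovn-* ports of type geneve
ovs-vsctl list interface <tunnel-port>  # bfd_status should show forwarding="true" and state=up
ping -c 3 192.168.122.200               # the volatile router address from incus network show my-ovn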

I’m seeing the 2 other machines’ IPs when running ovs-vsctl show on each host, but their state is listed as down.

Port ovn-538c72-0
    Interface ovn-538c72-0
        type: geneve
        options: {csum="true", key=flow, remote_ip="192.168.122.101"}
        bfd_status: {diagnostic="No Diagnostic", flap_count="0", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}

No luck. Apparently the virtual router is not behaving correctly!

Nope. The hosts are fairly minimal Debian 12 installs.

Right, so it’s showing state=down and remote_state=down, which suggests that your hosts are unable to communicate with each other to establish the Geneve tunnels.

I’d first make sure that you can ping the remote_ip listed in ovs-vsctl. If you can, then please look for firewalls on all affected machines, as you may have something in place that prevents the tunnel traffic from going out or coming in.
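
Something like this on each host should narrow it down (Geneve runs over UDP port 6081):

ping -c 3 192.168.122.101      # the remote_ip shown on the tunnel port
nft list ruleset               # look for anything that could drop UDP 6081
iptables -L -n -v              # same check if the host is still on iptables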

Yep, the hosts can all ping each other. This is a temporary VM environment that I’m testing with, where all hosts are running via QEMU. They’re running a basic Debian 12 install, and afaik that doesn’t ship with a firewall installed.

Could it be that I’m creating the UPLINK incorrectly? Based on the docs:

you must specify either an unmanaged bridge interface or an unused physical interface as the parent for the physical network that is used for OVN uplink. The instructions assume that you are using a manually created unmanaged bridge.

The network on the hosts is using a bridge interface br0 as follows:

2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
    link/ether 52:54:00:1f:a4:e1 brd ff:ff:ff:ff:ff:ff
5: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 96:54:c1:c3:91:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.101/24 brd 192.168.122.255 scope global br0
       valid_lft forever preferred_lft forever

Thanks for your assistance.

That looks perfectly fine as a configuration.

You’ll likely want to do some tcpdump runs to try and see why the servers aren’t establishing the Geneve tunnels correctly.
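
Something like this, run on two hosts at the same time, should tell you whether the Geneve traffic is leaving one side and arriving on the other (br0 here is just the interface carrying your host addresses, adjust as needed):

tcpdump -ni br0 udp port 6081
# or limit it to a single peer:
tcpdump -ni br0 host 192.168.122.101 and udp port 6081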

Will do. So it sounds like this isn’t an issue with Incus, but rather something related to OVN not creating the tunnels it needs to work properly. I’ll investigate it from that direction.

Thanks for the assistance.

Yeah, exactly. Those tunnels should be live regardless of what configuration has been done in Incus, so the fact that they’re not is quite suspicious and would explain much of the current behavior.