Can't make OVN network forward work in a cluster environment

Hi there,

I've set up a three-node cluster with LXD and OVN, mainly by following the video from @stgraber and the tutorial from @tomp (OVN high availability cluster tutorial).

This worked quite well, so LXD as well as OVN are clustered and I have network connectivity from the hosts to the outside (and between containers, of course).

However, I need to route incoming traffic to the containers in that network as well. I've tried to extract and adapt the steps from lxc-ci/test-lxd-network-ovn at master · lxc/lxc-ci · GitHub. In addition I've read [LXD] Floating IP addresses and Difference between network forward and proxy device, but I could not figure out how to make it work.

Please find some configuration info below. Thank you in advance for any hint!

OVN info

# First node
# ovs-vsctl show
2a32893c-0068-4055-a75c-1156bf19ccda
    Bridge lxdovn1
        Port lxdovn1b
            Interface lxdovn1b
        Port patch-lxd-net2-ls-ext-lsp-provider-to-br-int
            Interface patch-lxd-net2-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-lxd-net2-ls-ext-lsp-provider}
        Port lxdovn1
            Interface lxdovn1
                type: internal
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port vethcf9752c1
            Interface vethcf9752c1
        Port ovn-6325bc-0
            Interface ovn-6325bc-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="xxxx:yyyy:242:4851::2"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port br-int
            Interface br-int
                type: internal
        Port ovn-d07c66-0
            Interface ovn-d07c66-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="xxxx:yyyy:231:4b16::2"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port patch-br-int-to-lxd-net2-ls-ext-lsp-provider
            Interface patch-br-int-to-lxd-net2-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-lxd-net2-ls-ext-lsp-provider-to-br-int}
# ovn-nbctl show
switch bb62da15-ad75-4212-9024-94edd0e81ae3 (lxd-net2-ls-int)
    port lxd-net2-ls-int-lsp-router
        type: router
        router-port: lxd-net2-lr-lrp-int
    port lxd-net2-instance-c073595b-d04b-4e8c-83f3-5fa49bad4f3f-eth0
        addresses: ["00:16:3e:f6:dd:6b dynamic"]
    port lxd-net2-instance-3b68d1a9-b990-4dd5-9911-3dd9fef46279-eth0
        addresses: ["00:16:3e:54:0b:c7 dynamic"]
    port lxd-net2-instance-8206872b-e23f-4f66-8303-e5511f1347c6-eth0
        addresses: ["00:16:3e:bb:18:73 dynamic"]
switch 1ff1793b-9080-477a-b33a-b66bb7bd5bb3 (lxd-net2-ls-ext)
    port lxd-net2-ls-ext-lsp-router
        type: router
        router-port: lxd-net2-lr-lrp-ext
    port lxd-net2-ls-ext-lsp-provider
        type: localnet
        addresses: ["unknown"]
router c48adb25-2fab-4c41-9d94-32c14b0eda38 (lxd-net2-lr)
    port lxd-net2-lr-lrp-int
        mac: "00:16:3e:6f:18:21"
        networks: ["10.168.61.1/24", "fd42:2d80:5c48:7993::1/64"]
    port lxd-net2-lr-lrp-ext
        mac: "00:16:3e:6f:18:21"
        networks: ["172.17.2.100/24", "fd42:a17b:8317:bd8b:216:3eff:fe6f:1821/64"]
    nat 021a766f-e5f9-493a-ab96-6f59dc8f18b8
        external ip: "172.17.2.100"
        logical ip: "10.168.61.0/24"
        type: "snat"
    nat 4d9a6b2d-8793-42b5-be58-daa5e5415d51
        external ip: "fd42:a17b:8317:bd8b:216:3eff:fe6f:1821"
        logical ip: "fd42:2d80:5c48:7993::/64"
        type: "snat"
# ovn-sbctl show
Chassis "6325bcf0-d4b6-40ba-9e78-a9de40672d4d"
    hostname: srv0011.cloud.zzz.de
    Encap geneve
        ip: "xxx:yyyy:242:4851::2"
        options: {csum="true"}
    Port_Binding lxd-net2-instance-8206872b-e23f-4f66-8303-e5511f1347c6-eth0
Chassis "d07c664c-c939-4923-b5d2-3014973eec00"
    hostname: srv0012.cloud.zzz.de
    Encap geneve
        ip: "xxx:yyyy:231:4b16::2"
        options: {csum="true"}
    Port_Binding cr-lxd-net2-lr-lrp-ext
    Port_Binding lxd-net2-instance-3b68d1a9-b990-4dd5-9911-3dd9fef46279-eth0
Chassis "2a32893c-0068-4055-a75c-1156bf19ccda"
    hostname: srv0010.cloud.zzz.de
    Encap geneve
        ip: "xxx:yyyy:252:1a50::2"
        options: {csum="true"}
    Port_Binding lxd-net2-instance-c073595b-d04b-4e8c-83f3-5fa49bad4f3f-eth0
# ovn-nbctl list load_balancer
# I've tried different scenarios (after the failover IP did not work):
# - internal OVN IP
# - external host IP
# - external failover IP (<- this is what should work in the end)
_uuid               : 413d97b0-9f8e-473e-b579-0f453e048abb
external_ids        : {}
health_check        : []
ip_port_mappings    : {}
name                : lxd-net2-lb-172.17.2.1-tcp
options             : {}
protocol            : tcp
selection_fields    : []
vips                : {"172.17.2.1:80"="10.168.61.3:80"}

_uuid               : 17375113-421d-40c3-9355-5223ec9c995d
external_ids        : {}
health_check        : []
ip_port_mappings    : {}
name                : lxd-net2-lb-aaa.90.213.62-tcp
options             : {}
protocol            : tcp
selection_fields    : []
vips                : {"aaa.90.213.62:80"="10.168.61.3:80"}

_uuid               : 638cb75b-e411-44d0-ae90-30f636dce284
external_ids        : {}
health_check        : []
ip_port_mappings    : {}
name                : lxd-net2-lb-bbb.76.20.84-tcp
options             : {}
protocol            : tcp
selection_fields    : []
vips                : {"bbb.76.20.84:80"="10.168.61.3:80"}

LXD info

# lxc list
+------+---------+--------------------+-----------------------------------------------+-----------+-----------+-----------------------------+
| NAME |  STATE  |        IPV4        |                     IPV6                      |   TYPE    | SNAPSHOTS |          LOCATION           |
+------+---------+--------------------+-----------------------------------------------+-----------+-----------+-----------------------------+
| c10  | RUNNING | 10.168.61.2 (eth0) | fd42:2d80:5c48:7993:216:3eff:fef6:dd6b (eth0) | CONTAINER | 0         | srv0010.cloud.zzz.de |
+------+---------+--------------------+-----------------------------------------------+-----------+-----------+-----------------------------+
| c11  | RUNNING | 10.168.61.3 (eth0) | fd42:2d80:5c48:7993:216:3eff:febb:1873 (eth0) | CONTAINER | 0         | srv0011.cloud.zzz.de |
+------+---------+--------------------+-----------------------------------------------+-----------+-----------+-----------------------------+
| c12  | RUNNING | 10.168.61.4 (eth0) | fd42:2d80:5c48:7993:216:3eff:fe54:bc7 (eth0)  | CONTAINER | 0         | srv0012.cloud.zzz.de |
+------+---------+--------------------+-----------------------------------------------+-----------+-----------+-----------------------------+
# lxc network list
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
|   NAME    |   TYPE   | MANAGED |      IPV4      |           IPV6            | DESCRIPTION | USED BY |  STATE  |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| br-int    | bridge   | NO      |                |                           |             | 0       |         |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| dmz       | ovn      | YES     | 10.168.61.1/24 | fd42:2d80:5c48:7993::1/64 |             | 3       | CREATED |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| eth0      | physical | NO      |                |                           |             | 0       |         |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| eth0.4000 | vlan     | NO      |                |                           |             | 0       |         |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| lxdbr0    | bridge   | YES     | 172.17.2.1/24  | fd42:a17b:8317:bd8b::1/64 |             | 1       | CREATED |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+
| lxdovn1   | bridge   | NO      |                |                           |             | 0       |         |
+-----------+----------+---------+----------------+---------------------------+-------------+---------+---------+

General Info

# ip a (first node)
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether vv:ww:59:8c:bf:db brd ff:ff:ff:ff:ff:ff
    altname enp9s0
    inet aaa.90.213.62/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet bbb.76.20.84/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet aaa.90.213.62 peer aaa.90.213.1/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 xxxx:yyyy:252:1a50::2/128 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::aaa1:59ff:fe8c:bfdb/64 scope link 
       valid_lft forever preferred_lft forever
3: eth0.4000@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether vv:ww:59:8c:bf:db brd ff:ff:ff:ff:ff:ff
    inet6 fd1e:7e4b:e0b4:10:150f:db8b:f620:2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::aaa1:59ff:fe8c:bfdb/64 scope link 
       valid_lft forever preferred_lft forever
4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:5f:3a:69:d5:ef brd ff:ff:ff:ff:ff:ff
5: br-int: <BROADCAST,MULTICAST> mtu 1422 qdisc noop state DOWN group default qlen 1000
    link/ether c6:d1:22:13:69:dc brd ff:ff:ff:ff:ff:ff
6: wgl0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1320 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet6 fd1e:7e4b:e0b4:100:150f:db8b:f620:2/128 scope global 
       valid_lft forever preferred_lft forever
8: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 4e:20:10:ac:d4:9a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::cc63:63ff:fe17:65bc/64 scope link 
       valid_lft forever preferred_lft forever
9: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:2d:71:a7 brd ff:ff:ff:ff:ff:ff
    inet 172.17.2.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:a17b:8317:bd8b::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe2d:71a7/64 scope link 
       valid_lft forever preferred_lft forever
10: lxdovn1b@lxdovn1a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 72:76:ca:6b:36:8c brd ff:ff:ff:ff:ff:ff
11: lxdovn1a@lxdovn1b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lxdbr0 state UP group default qlen 1000
    link/ether 7a:fc:1e:9a:d6:e7 brd ff:ff:ff:ff:ff:ff
12: lxdovn1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:7b:a7:5c:eb:49 brd ff:ff:ff:ff:ff:ff
14: vethcf9752c1@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1422 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 46:d2:e5:f0:5c:e6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
# ip route
default via aaa.90.213.1 dev eth0 proto static 
bbb.76.20.84 dev lxdbr0 scope link 
aaa.90.213.1 dev eth0 proto kernel scope link src aaa.90.213.62
172.17.2.0/24 dev lxdbr0 proto kernel scope link src 172.17.2.1 
172.17.2.1 dev lxdbr0 proto static scope link 

Please can you show the output of lxc network show lxdbr0 and lxc network show dmz?

Also, it is important to understand that in an OVN cluster each of the LXD servers acts as a (potential) OVN gateway chassis, meaning that depending on the availability of the other servers, one of them will be chosen to be connected to the outside world via the uplink network.

Based on your earlier comments it sounds like you're using lxdbr0 as the uplink network, which is fine in small standalone deployments. But for clustered deployments, especially where you want to use inbound routing or network forwards, you need each of the LXD OVN chassis to be connected to the same uplink network so that they can fail over and keep the same MAC address on the same uplink network.
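For reference, a quick way to see which server currently holds the gateway is to look for the chassis that has the chassis-redirect (cr-) port bound in the southbound database, or to list the configured gateway-chassis priorities (a sketch; the port names are taken from the ovn-nbctl/ovn-sbctl output above and will differ for other networks):

# ovn-sbctl show | grep -B 6 cr-lxd-net2-lr-lrp-ext
# ovn-nbctl lrp-get-gateway-chassis lxd-net2-lr-lrp-ext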

Hi @tomp ,

thanks for your answer! I've understood that one of the servers is selected as the gateway to the outside world. This seems to work. Running curl ifconfig.me in all three containers shows the same external IP address, that of the currently selected node. Switching this node off results in another node being selected and another IP being returned by the above command. So this seems to work fine.

It's true that lxdbr0 is meant as the uplink network. I did not do any special configuration on that interface and learned from you that normal Linux kernel routing is responsible for getting the traffic coming from lxdbr0 to eth0 and to the outside world.

Regarding the uplink: we have a small cluster with currently only three nodes. They are all bare metal from the same hoster but not on the same switch, so they do not share a common uplink and each has a different gateway.
However, we have a failover IP (bbb.76.20.84) which can be assigned to any of the three nodes. Currently this IP is assigned to srv0010.cloud.zzz.de. Is there any chance to realise network forwards this way?

I used iptables rules in my old setup and was hoping that things might be easier now with OVN and the new forward feature.

Thank you for your effort!

Below is the output of the commands:

# lxc network show lxdbr0
config:
  ipv4.address: 172.17.2.1/24
  ipv4.dhcp.ranges: 172.17.2.10-172.17.2.99
  ipv4.nat: "true"
  ipv4.ovn.ranges: 172.17.2.100-172.17.2.200
  ipv4.routes: 172.17.2.1/32
  ipv6.address: fd42:a17b:8317:bd8b::1/64
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/networks/dmz
managed: true
status: Created
locations:
- srv0010.cloud.zzz.de
- srv0011.cloud.zzz.de
- srv0012.cloud.zzz.de

# lxc network show dmz
config:
  bridge.mtu: "1422"
  ipv4.address: 10.168.61.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:2d80:5c48:7993::1/64
  ipv6.nat: "true"
  network: lxdbr0
  volatile.network.ipv4.address: 172.17.2.100
  volatile.network.ipv6.address: fd42:a17b:8317:bd8b:216:3eff:fe6f:1821
description: ""
name: dmz
type: ovn
used_by:
- /1.0/instances/c10
- /1.0/instances/c11
- /1.0/instances/c12
managed: true
status: Created
locations:
- srv0011.cloud.zzz.de
- srv0012.cloud.zzz.de
- srv0010.cloud.zzz.de

If you can get the external IP routed/bound on the active LXD server and then add a static route on the host that will route the external IP to 172.17.2.100 (which is the IP of the OVN virtual router on the lxdbr0 network) then it should work.

You should be able to ascertain the active LXD OVN chassis server from lxc network info dmz.
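For example (a sketch, assuming bbb.76.20.84 is the failover IP, 172.17.2.100 is the OVN router's address on lxdbr0, and the commands are run on the server that currently holds the active chassis):

# lxc network info dmz
# ip route add bbb.76.20.84/32 via 172.17.2.100 dev lxdbr0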

Hmm, but how do I define that static route? Surely I can't change the default route, otherwise I would block the outgoing traffic, right?
Do I need an iptables forward rule?

In addition I’m not able to see the active chassis from lxc network info dmz:

Name: dmz
MAC address: 00:16:3e:6f:18:21
MTU: 1422
State: up

Ips:
  inet	10.168.61.1
  inet6	fd42:2d80:5c48:7993::1

Network usage:
  Bytes received: 0B
  Bytes sent: 0B
  Packets received: 0
  Packets sent: 0

And is there any way to see the OVN router IP (172.17.2.100) via an lxc command?

Last question: if I add additional networks like dmz, do they all share the same uplink and router IP (172.17.2.100)?

Ah yes it was added quite recently:

So when setting up an OVN network forward, it will create an ARP responder on the OVN virtual router's external port (the one that connects to the LXD host's lxdbr0 bridge).

But in order to do that OVN requires that its uplink network has been delegated “ownership” of that address via its ipv{n}.routes setting.

So in your case your LXD host’s lxdbr0 network has the subnet 172.17.2.0/24, and the OVN virtual router’s external port that connects to it has been given the IP 172.17.2.100. As such from the LXD host you should be able to ping 172.17.2.100.

However, in order to add a network forward to the OVN network you need to ensure that the IP(s) you want for the forwards are in lxdbr0's ipv4.routes setting.

So in your case, let's imagine your external IP is 192.0.2.1; you could delegate that to the lxdbr0 network using lxc network set lxdbr0 ipv4.routes=192.0.2.1/32.

This would then create a local route on each LXD server that sends traffic destined for 192.0.2.1 into the lxdbr0 network. At this stage nothing will respond to it.
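A minimal sketch of that delegation and the route it should create on every server (192.0.2.1 stands in for the real external IP):

# lxc network set lxdbr0 ipv4.routes=192.0.2.1/32
# ip route | grep 192.0.2.1
192.0.2.1 dev lxdbr0 proto static scope link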

Now you should be able to create a network forward on the OVN network, with a default target address of your choosing, e.g.:

lxc network forward create dmz 192.0.2.1 target_address=<an IP inside your ovn network>

Now you should be able to ping 192.0.2.1 from the LXD host that has the active chassis.

If you then add an ARP proxy responder on your external network interface (using ip neigh add proxy), or do whatever is needed by your ISP to get traffic for your external IP 192.0.2.1 to that LXD server, then the local routing table should take over and forward it into the OVN network.
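For example, a sketch of the proxy-ARP entry (assuming eth0 is the external interface; adapt to whatever your ISP requires):

# ip neigh add proxy 192.0.2.1 dev eth0
# ip neigh show proxy
192.0.2.1 dev eth0 proxy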

Unfortunately when you set lxc network set lxdbr0 ipv4.routes=192.0.2.1/32 this will apply on all of your LXD servers, and because only a single server can have the active chassis at any one time, it will mean that 192.0.2.1 will be unreachable locally on the other servers until they become active.

You may also consider using the proxy instance device. This would not be highly available (as it's only bound on the host where an instance lives), but you could set up multiple instances, each with a proxy device that listens on the local LXD server for when that external IP becomes active and then forwards port(s) on that IP into the local container.
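A sketch of such a proxy device, with the container name (c11), device name (web80) and port purely illustrative:

# lxc config device add c11 web80 proxy listen=tcp:192.0.2.1:80 connect=tcp:127.0.0.1:80

As described above, this only forwards traffic on the server where c11 lives and while the external IP is directed at that server.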

Hi @tomp ,

thanks again for your help. I needed some time to understand your suggestions and test them. Unfortunately I did not get it to work.

I think my first mistake was that I had assigned my external failover IP (bbb.76.20.84) to my NIC eth0. In that case, I think, no packet to this IP is ever routed to my bridge interface lxdbr0; it is handled directly by the kernel on that interface, regardless of any route.

Therefore I've removed the IP address from the interface eth0. In addition I've followed your steps above:

  • calling lxc network set lxdbr0 ipv4.routes=bbb.76.20.84/32. Afterwards ip route shows the route on each host: bbb.76.20.84 dev lxdbr0 proto static scope link
  • calling lxc network forward create dmz bbb.76.20.84 target_address=10.161.64.4. Afterwards I get
    # lxc network forward show dmz bbb.76.20.84
    description: ""
    config:
      target_address: 10.161.64.4
    ports: []
    listen_address: bbb.76.20.84
    location: ""
    

But for some reason I’m not able to ping the external address on the host of the active chassis (srv0012 in my case).

# ping bbb.76.20.84
PING bbb.76.20.84 (bbb.76.20.84) 56(84) bytes of data.
From 172.17.2.1 icmp_seq=1 Destination Host Unreachable
^C
--- bbb.76.20.84 ping statistics ---
4 packets transmitted, 0 received, +1 errors, 100% packet loss, time 3031ms

(172.17.2.1 is assigned to lxdbr0). If I understood you correctly this should be possible at this point.

Also, trying to reach any service inside my container c12 (10.161.64.4) fails. Of course, if I assign my external IP to lxdbr0, I'm able to ping it (that should be answered by the host's kernel, right?) but still not able to reach any service.

I think that routing the traffic from outside to the host works as expected, based on my tests regarding your ip neigh add proxy suggestion:
After reading some articles I thought I had understood its purpose. My expectation was that assigning the external IP to lxdbr0 without calling ip neigh ... would not be sufficient to reach any service on that host via that IP from outside. For some reason it worked even without the ip neigh ... call on my hosts. Maybe Hetzner is doing some ARP handling on their routers for those failover IPs in any case.

In the end, outside traffic still does not reach the OVN network or the container. I've ensured that the traffic to the external IP reaches the host with the active chassis. I've called ip neigh add proxy bbb.76.20.84 dev eth0 to be sure. I've tried to assign the external IP to lxdbr0. Nothing helped. Since pinging on the host itself does not work, I think there is some local problem (pinging 172.17.2.1 and 172.17.2.100 works).

Any suggestion would be helpful!

Did you miss doing this bit?

No, did that:

#ip neigh show proxy
bbb.76.20.84 dev eth0 proxy 

If I assign the IP to lxdbr0 and start a simple Python server on it (python -m http.server) I can reach it from outside. So traffic is reaching the host.

However, running it inside a container does not work (regardless of whether the IP is assigned to lxdbr0 or not).

Any hint why pinging bbb.76.20.84 on that host does not work if the IP is not assigned to lxdbr0? I would expect OVN to answer the ping.

Can I check if OVN is accepting the packets correctly?

Please show output of the following on the single host you’re trying to get working:

  • ip a
  • ip r
  • lxc network show lxdbr0
  • lxc network show <ovn network>
  • lxc network forward ls <ovn network>
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a8:a1:59:2f:ec:ec brd ff:ff:ff:ff:ff:ff
    altname enp35s0
    inet aaa.202.131.169/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet aaa.202.131.169 peer 116.202.131.129/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 yyyy:4f8:231:4b16::2/128 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::aaa1:59ff:fe2f:ecec/64 scope link 
       valid_lft forever preferred_lft forever
3: ovnbr0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 66:03:cf:2e:9f:2a brd ff:ff:ff:ff:ff:ff
4: eth0.4000@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether a8:a1:59:2f:ec:ec brd ff:ff:ff:ff:ff:ff
    inet6 fd1e:7e4b:e0b4:10:6432:67c:4f6f:2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::aaa1:59ff:fe2f:ecec/64 scope link 
       valid_lft forever preferred_lft forever
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6e:e1:05:fa:61:ca brd ff:ff:ff:ff:ff:ff
6: br-int: <BROADCAST,MULTICAST> mtu 1422 qdisc noop state DOWN group default qlen 1000
    link/ether e6:77:1f:05:f7:80 brd ff:ff:ff:ff:ff:ff
7: lxdovn10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:0b:c8:21:20:4f brd ff:ff:ff:ff:ff:ff
8: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 3e:c9:00:35:81:57 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ec90:a5ff:fea9:254/64 scope link 
       valid_lft forever preferred_lft forever
9: wgl0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1320 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet6 fd1e:7e4b:e0b4:100:6432:67c:4f6f:2/128 scope global 
       valid_lft forever preferred_lft forever
10: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:66:ac:93 brd ff:ff:ff:ff:ff:ff
    inet 172.17.2.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:8002:1cd8:1bb::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe66:ac93/64 scope link 
       valid_lft forever preferred_lft forever
11: lxdovn10b@lxdovn10a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 6a:16:ba:5b:21:a7 brd ff:ff:ff:ff:ff:ff
12: lxdovn10a@lxdovn10b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lxdbr0 state UP group default qlen 1000
    link/ether 76:5b:cc:9b:2c:d0 brd ff:ff:ff:ff:ff:ff
14: veth072c34ac@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1422 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether e2:2e:69:0a:44:c8 brd ff:ff:ff:ff:ff:ff link-netnsid 0

(eth0.4000 and wgl0 are VLAN and WireGuard interfaces and are not used for OVN or LXD)

# ip r
default via aaa.202.131.129 dev eth0 proto static 
aaa.202.131.129 dev eth0 proto kernel scope link src aaa.202.131.169 
bbb.76.20.84 dev lxdbr0 proto static scope link 
172.17.2.0/24 dev lxdbr0 proto kernel scope link src 172.17.2.1 
# lxc network show lxdbr0
config:
  ipv4.address: 172.17.2.1/24
  ipv4.dhcp.ranges: 172.17.2.10-172.17.2.99
  ipv4.nat: "true"
  ipv4.ovn.ranges: 172.17.2.100-172.17.2.199
  ipv4.routes: bbb.76.20.84/32
  ipv6.address: fd42:8002:1cd8:1bb::1/64
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/networks/dmz
managed: true
status: Created
locations:
- srv0010.cloud.zzz.de
- srv0011.cloud.zzz.de
- srv0012.cloud.zzz.de
# lxc network show dmz
config:
  bridge.mtu: "1422"
  ipv4.address: 10.161.64.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:4:3591:1695::1/64
  ipv6.nat: "true"
  network: lxdbr0
  volatile.network.ipv4.address: 172.17.2.100
  volatile.network.ipv6.address: fd42:8002:1cd8:1bb:216:3eff:fe3e:2519
description: ""
name: dmz
type: ovn
used_by:
- /1.0/instances/c10
- /1.0/instances/c11
- /1.0/instances/c12
managed: true
status: Created
locations:
- srv0010.cloud.zzz.de
- srv0011.cloud.zzz.de
- srv0012.cloud.zzz.de
# lxc network forward ls dmz
+----------------+-------------+------------------------+-------+----------+
| LISTEN ADDRESS | DESCRIPTION | DEFAULT TARGET ADDRESS | PORTS | LOCATION |
+----------------+-------------+------------------------+-------+----------+
| bbb.76.20.84   |             | 10.161.64.4            | 0     |          |
+----------------+-------------+------------------------+-------+----------+

Can you ping 172.17.2.100, the OVN logical router's IP on the uplink network? This will confirm the chassis is active as a gateway.

Yes that works:

# ping 172.17.2.100
PING 172.17.2.100 (172.17.2.100) 56(84) bytes of data.
64 bytes from 172.17.2.100: icmp_seq=1 ttl=254 time=0.249 ms
64 bytes from 172.17.2.100: icmp_seq=2 ttl=254 time=0.211 ms
^C
--- 172.17.2.100 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.211/0.230/0.249/0.019 ms

(it does not work on the other nodes)

In addition, lxc exec c12 curl ifconfig.me gives aaa.202.131.169, which is the external IP of srv0012.

Does it work if you (temporarily) replace the route that LXD adds with a route directly to the OVN router’s IP on the uplink network, e.g.

sudo ip r replace bbb.76.20.84/32 via 172.17.2.100 dev lxdbr0

In my tests the OVN logical router seems to only intermittently respond to ARP queries for the load balancer addresses.
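One way to check this from the host (a sketch; it assumes the arping tool from iputils is installed) is to send ARP requests for the forward address on the uplink bridge and then look at the neighbour table:

# arping -I lxdbr0 -c 3 bbb.76.20.84
# ip neigh show dev lxdbr0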

Hi @tomp ,

this works!!!

OK cool, I’m seeing if there is an option in OVN to make this more reliable.

How should it work in theory?

If I create a second network (e.g. lxc network create ep --type=ovn network=lxdbr0), this second network gets the IP 172.17.2.101.
So ipv4.ovn.ranges defines the range from which OVN takes addresses for the separate networks?
All of these networks are "linked" together via 172.17.2.1, so normally (without replacing the route) I should be able to forward traffic to different LXD networks if they are created as stated above, correct?

What I don't see is how ipv4.dhcp.ranges is used in this setup.

And now, for some reason, OVN does not correctly respond to the ARP queries, therefore the host kernel does not route the traffic to OVN (since it does not know that OVN is the correct destination for the failover IP).

Can you show the output of ovn-nbctl --version please?

I’m seeing the issue on my own host but only if using Ubuntu Jammy as the host OS (with its bundled OVN and OVS packages).

You can find the definition of config keys in the LXD documentation.

ipv4.ovn.ranges: Comma-separated list of IPv4 ranges to use for child OVN network routers (FIRST-LAST format)

So each OVN network’s virtual logical router will be assigned an IP from that range to use on the uplink network (lxdbr0 in this case). This is why you can ping the volatile.network.ipv4.address address.
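For instance, the router's address on the uplink can be read back directly (a sketch, using the network name from this thread):

# lxc network get dmz volatile.network.ipv4.address
172.17.2.100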

Then, when you create a network forward, the OVN logical router should also respond to ARP/NDP requests for that IP from the uplink network. And it does, at least on Ubuntu Focal. But it looks like it's stopped working on more recent versions.
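To see what OVN has actually programmed for a forward, the northbound database can be inspected (a sketch; LXD implements forwards as OVN load balancers, and the router name lxd-net2-lr is taken from the earlier ovn-nbctl show output and will differ per network):

# ovn-nbctl list load_balancer
# ovn-nbctl lr-lb-list lxd-net2-lr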

Sure:

# ovn-nbctl --version
ovn-nbctl 21.12.0
Open vSwitch Library 2.16.90
DB Schema 5.34.1