Can’t make OVN network forward work in cluster environment

No problem! Thanks for your great help @tomp! I was quite frustrated at times, because I still don’t understand enough about networking, and I am very happy that it works now.
At least I have now learned a few things again :smiley:

I’m going to log an issue at GitHub - ovn-org/ovn: Open Virtual Network now


Yes that is correct. Network forward rules work at the IP level so you can ‘float’ a forward between instances just by changing the instance’s IP.
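For example (a minimal sketch, assuming the OVN network is called ovn0, using the listen address that appears later in this thread, and a placeholder instance IP):

# Inspect the existing forward
lxc network forward show ovn0 44.28.196.49

# 'Float' it to another instance by changing the default target address
lxc network forward set ovn0 44.28.196.49 target_address=10.161.64.3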

Can you see if using this commit fixes it:

As this is what we want the OVN router to do: respond to ARP requests for load balancer IPs outside of the router’s own subnets.

This thread was about network forwarding in a setup described in OVN high availability cluster tutorial.

After following the tutorial above you have a clustered environment with working egress traffic to the outside world based on OVN.

My challenge was to get ingress traffic with forwarding to containers working as well. A precondition for this is a single IP address for all cluster members towards the outside world. In my case it’s a failover IP that I can switch between my individual hosts.

Let’s say this external IP is 44.28.196.49. Be sure that you haven’t configured this IP on any of your hosts’ interfaces.
Then you have to configure a route for your bridge via lxc network set lxdbr0 ipv4.routes=44.28.196.49/32. This way all traffic to the failover IP arriving on your host will be routed to your bridge interface.
Now you can create a network forward according to https://linuxcontainers.org/lxd/docs/master/network-forwards/: lxc network forward create ovn0 44.28.196.49 target_address=<container/vm ip>.
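Putting those steps together, a minimal sketch using the addresses from this post (the container IP 10.161.64.2 is a placeholder for the real instance address):

# Route the failover IP onto the uplink bridge of the OVN network
lxc network set lxdbr0 ipv4.routes=44.28.196.49/32

# Forward traffic arriving on the failover IP to a container inside the OVN network
lxc network forward create ovn0 44.28.196.49 target_address=10.161.64.2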

If you are going with a failover IP like me, be sure that the failover IP is routed to the host with the active OVN chassis. In the upcoming release you should be able to see the active chassis via lxc network info ovn0. Otherwise, check the result of running curl ifconfig.me in one of your containers to see which host is active. Then configure at your hosting provider that the failover IP is routed to that host.
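For example (assuming the OVN network is called ovn0 and a container named c1; the level of detail in lxc network info depends on your LXD version):

# On newer LXD releases this should show the active chassis for the network
lxc network info ovn0

# Fallback: see which host's public IP outbound traffic leaves from
lxc exec c1 -- curl ifconfig.me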

Important note: It seems that there is a bug in OVN 21.06.0 and later (see Load balancer ARP responder broken since 21.06 · Issue #124 · ovn-org/ovn · GitHub). Therefore you have to use version 21.03.0 or earlier. You can check the version via ovn-nbctl --version.


What exactly do you mean? The commit is contained in 21.09 and later, and these versions do not work.

Applying it to 21.03 gives merge conflicts.

Ah OK, thanks, so 21.09 onwards don’t work either, that’s good to know.

21.06 does not work and 21.12 does not work. I have not yet tested 21.09 but I can do that as well.

But I would expect that it will not work either.


It does not work with 21.09.1 either:

# ovn-sbctl list logical_flow | grep bbb.76.20.84 
match               : "ct.est && ip4 && reg0 == bbb.76.20.84 && ct_label.natted == 1 && is_chassis_resident(\"cr-lxd-net11-lr-lrp-ext\")"
actions             : "reg0 = bbb.76.20.84; ct_dnat;"
match               : "ip && ip4.dst == bbb.76.20.84"
match               : "ct.new && ip4 && reg0 == bbb.76.20.84 && is_chassis_resident(\"cr-lxd-net11-lr-lrp-ext\")"
actions             : "reg1 = bbb.76.20.84; ct_lb(backends=10.161.64.2);"
match               : "ct.new && ip4.dst == bbb.76.20.84"

Reproducer steps here:


I’ve tried the proposed patch and it is working for me.

It would be interesting to see how the option could be integrated into LXD. In the meantime: will LXD ever overwrite load balancer options?

We may add a setting in the future, but I suspect we will leave it disabled by default for performance reasons.

Right now lxd shouldn’t mess with that setting though.
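If you want to apply the option by hand in the meantime, something along these lines should work. This is only a sketch: the exact option key depends on the patch you applied (neighbor_responder is an assumption here), and you should look up the real load balancer record that LXD created for your forward.

# List the load balancers LXD created for the network forwards
ovn-nbctl list load_balancer

# Set the option on the relevant load balancer (by name or UUID); the option key is an assumption
ovn-nbctl set load_balancer <lb name or uuid> options:neighbor_responder=all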

Hi @tomp ,

I have to get back to you (or other people of course) since I’ve found one thing that is still not working. I thought adding it here might be helpful since the complete setup is described in this thread.

In short: a clustered environment with three hosts, an OVN network, a failover IP and LXD network forwards. Exactly one host is chosen by OVN as the chassis acting as gateway to the outside world.

I now have problems reaching one container from another container via the external IP. It should be solvable by hairpinning or split DNS. However, split DNS seems too complex for me since internal DNS/DHCP is done by OVN.

From googling around I’ve got the feeling that OVN should support hairpinning in some scenarios. I’ve found the issue Use OVN force SNAT for load balancers (and not forwards) · Issue #10654 · lxc/lxd · GitHub. I don’t understand it all, but even when I tried the force option for SNAT it did not work.
I’ve also tried a simple iptables rule on the chassis host, but that didn’t work either.
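For reference, this is the kind of hairpin NAT rule I mean (only a sketch with placeholder addresses: 44.28.196.49 is the forward's listen address, 10.161.64.2 the target container, 10.161.64.0/24 the OVN subnet):

# DNAT traffic from the OVN subnet that targets the external IP back to the container
iptables -t nat -A PREROUTING -s 10.161.64.0/24 -d 44.28.196.49 -j DNAT --to-destination 10.161.64.2
# Masquerade the hairpinned traffic so replies return along the same path
iptables -t nat -A POSTROUTING -s 10.161.64.0/24 -d 10.161.64.2 -j MASQUERADE

I suspect it has no effect because traffic between two instances on the same OVN network is switched inside OVS and presumably never traverses the host’s netfilter.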

I’m thinking this must be a common problem, but I can’t find any solutions or notes on what I’m doing wrong right now.

Thanks in advance for any hint!

So to check I understand the issue: a container that is inside the same OVN network as the target of a network forward is trying to connect to the listen address of that network forward, and it’s not working?

Please can you show the relevant network and forward configs, along with examples of client commands that work and do not work?

If I am using jammy and OVN installed via packages, what is the workaround?

ovn-nbctl --version
ovn-nbctl 22.03.0
Open vSwitch Library 2.17.0
DB Schema 6.1.0

cat /etc/issue
Ubuntu 22.04.2 LTS \n \l

Are you experiencing the same issue as @lepokle ? I didn’t get a reply to my last message to confirm my understanding of the problem.

I believe my issue will be resolved with the routing workaround I found in the nightly check scripts.

I found it by reading the threads on this issue and tracked down this resolution.

Basically I am not able to ping inbound or SSH inbound using a forward.
For this test I have a 3-node OVN cluster, and I have only set up a single LXD instance at this stage of testing to keep it simpler and not have to worry about whether the controller is on one of the other 2 nodes.

#!/bin/bash
OVN_IP1='10.2.0.2'
OVN_IP2='10.2.0.3'
OVN_IP3='10.2.0.4'
INTERFACE='ens192'
LXD_LEADER=lxd1

CURR_HOSTNAME=$(hostname -s)
CURR_IP=$(ip -4 a show up dev ${INTERFACE} | grep inet | awk '{print $2}' | awk -F'/' '{print $1}')

# Point LXD at the OVN northbound DB cluster (only needed once, on the leader)
if [ "${CURR_HOSTNAME}" = "${LXD_LEADER}" ]; then
    if ! lxc config get network.ovn.northbound_connection | grep -q tcp; then
        lxc config set network.ovn.northbound_connection tcp:${OVN_IP1}:6641,tcp:${OVN_IP2}:6641,tcp:${OVN_IP3}:6641
    fi
    lxc config get network.ovn.northbound_connection
    lxc network list
fi

# Start from a clean slate
lxc delete c1 -f
lxc delete c2 -f
lxc network delete ovn0
lxc network delete lxdbr0

# Create the uplink bridge: pending on the target member, then instantiated with config
lxc network create lxdbr0 --target=lxd1
lxc network create lxdbr0 \
    ipv4.address=10.179.176.1/24 \
    ipv4.nat=true \
    ipv4.dhcp.ranges=10.179.176.5-10.179.176.10 \
    ipv4.ovn.ranges=10.179.176.11-10.179.176.20 \
    ipv4.routes=64.X.X.69/32,64.X.X.70/32

lxc network create ovn0 --type=ovn network=lxdbr0

ovnIPv4="$(lxc network get ovn0 volatile.network.ipv4.address)"
ovnIPv6="$(lxc network get ovn0 volatile.network.ipv6.address)"

#lxc network create lxdbr0 --type=ovn network=UPLINK
lxc launch images:ubuntu/jammy/cloud c1 --network ovn0
lxc launch images:ubuntu/jammy/cloud c2 --network ovn0

lxc ls # get the target ip of the forward
lxc network forward create ovn0 64.X.X.69 target_address="10.18.129.2"

# Route the external IPs to the OVN router's address on the uplink bridge
ip r add 64.X.X.69/32 via "${ovnIPv4}" dev lxdbr0
ip r add 64.X.X.70/32 via "${ovnIPv4}" dev lxdbr0

lxc network show ovn0
lxc network show lxdbr0

I am not able to ping 64.X.X.69.
I am not able to SSH to 64.X.X.69 after logging into the container and installing openssh.
I am able to ping and SSH between c1 and c2 from inside lxc shell c1.

So I figured out my issues. First I had a stale route that was messing things up.
Then I didn’t set up my bridge correctly.
I had to specify the MAC address of the upstream hardware interface on jammy before pinging from the bridge would work:

/etc/netplan/00-installer-config.yaml
network:
  version: 2
  bridges:
    br0:
      interfaces: [ens256]
      macaddress: <mac of ens256 interface>
      addresses:
        - <your ip cidr here>
      nameservers:
        addresses:
          - 8.8.8.8
          - 1.1.1.1
      routes:
        - to: default
          via: <your gateway here>
      parameters:
        stp: true
        forward-delay: 4
  ethernets:
    ens256:
      dhcp4: no
lxc network create UPLINK --type=physical parent=br0 --target=lxd1
lxc network create UPLINK --type=physical parent=br0 --target=lxd2
lxc network create UPLINK --type=physical parent=br0 --target=lxd3
lxc network create UPLINK --type=physical \
    ipv4.ovn.ranges=<first ip in uplink cidr space that you want for ovn>-<last ip> \
    ipv4.gateway=<uplink gateway cidr> \
    dns.nameservers=<comma sep list of nameservers> \
    ipv4.routes=<uplink network ip cidr /32 or /24>

lxc network create ovn0 --type=ovn network=UPLINK \
	ipv4.address=10.10.10.1/24 \
	ipv4.nat=true

lxc init images:ubuntu/jammy/cloud c1 --network ovn0
lxc config device set c1 eth0 ipv4.address 10.10.10.10
lxc start c1
lxc exec c1 -- apt install -y openssh-server
lxc network forward create ovn0 <free ip in uplink network> target_address="10.10.10.10"

The free IP can be the floating volatile IP:
ovnIPv4="$(lxc network get ovn0 volatile.network.ipv4.address)"

Or it could be a totally free IP in the route list specified earlier.

You should be able to ping the forward IP or use SSH to connect via that forwarded IP.
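A quick check from a machine outside the cluster (placeholders for the forward's listen IP and whatever user exists inside c1):

ping -c 3 <forward ip>
ssh <user>@<forward ip>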
