Can't make OVN network forward work in cluster environment

Since LXD seems to reset the route after boot, I’ll go with version 21.03.0 for now.

One other question regarding forwards: since I can only configure container IP addresses as targets (not container names), the normal approach is to pin the target container to a fixed IP, correct?

Thanks for this, it’s immensely valuable. :slight_smile:

No problem! Thanks for your great help @tomp! I was quite frustrated along the way because I still don’t understand enough about networking, and I’m very happy that it works now.
At least I’ve learned a few things again :smiley:

I’m going to log an issue at GitHub - ovn-org/ovn: Open Virtual Network now.

Yes, that is correct. Network forward rules work at the IP level, so you can ‘float’ a forward between instances just by changing which instance holds the target IP.
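
For illustration, a minimal sketch of pinning and floating that IP (the container names c1 and c2, the NIC name eth0, and the address 10.161.64.2 are assumptions; substitute your own):

# pin the target IP to the active instance (assuming eth0 comes from a profile)
lxc config device override c1 eth0 ipv4.address=10.161.64.2
# to float the forward to another instance, release the IP and assign it there
lxc config device unset c1 eth0 ipv4.address
lxc config device override c2 eth0 ipv4.address=10.161.64.2
# a restart of the affected instances may be needed for the change to take effect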

Can you see if using this commit fixes it:

As this is what we want the OVN router to do: respond to ARP requests for load balancer IPs outside of the router’s own subnets.

This thread is about network forwarding in the setup described in the OVN high availability cluster tutorial.

After following the tutorial above, you have a clustered environment with working egress traffic to the outside world via OVN.

My challenge was to get ingress traffic, forwarded to containers, working as well. The precondition for this is a single external IP address for the whole cluster; in my case it’s a failover IP that I can switch between my individual hosts.

Let’s say this external IP is 44.28.196.49. Make sure you haven’t configured this IP on any of your hosts’ interfaces.
Then configure a route for your bridge via lxc network set lxdbr0 ipv4.routes=44.28.196.49/32. This way, all traffic to the failover IP arriving at your host will be routed to your bridge interface.
Now you can create a network forward according to https://linuxcontainers.org/lxd/docs/master/network-forwards/: lxc network forward create ovn0 44.28.196.49 target_address=<container/vm ip>.
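
Putting both steps together, a minimal sketch for the example failover IP (the target address 10.161.64.2 is an assumption; use your container's static IP):

# route all traffic for the failover IP to the uplink bridge
lxc network set lxdbr0 ipv4.routes=44.28.196.49/32
# forward traffic arriving on the failover IP to the container
lxc network forward create ovn0 44.28.196.49 target_address=10.161.64.2
# verify the forward exists
lxc network forward list ovn0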

If you are using a failover IP like me, make sure it is routed to the host running the active OVN chassis. In the upcoming release you can see the active chassis via lxc network info ovn0. Otherwise, run curl ifconfig.me in one of your containers to see which host is active. Then configure your hosting provider to route the failover IP to that host.
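
For example (the container name c1 is an assumption; ifconfig.me simply echoes the public IP a request leaves from):

# on recent LXD versions, shows the active chassis
lxc network info ovn0
# otherwise, check which host's public IP egress traffic uses
lxc exec c1 -- curl ifconfig.me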

Important note: there seems to be a bug in OVN 21.06.0 and later (see Load balancer ARP responder broken since 21.06 · Issue #124 · ovn-org/ovn · GitHub), so you have to use version 21.03.0 or earlier. You can check your version via ovn-nbctl --version.

What exactly do you mean? The commit is contained in 21.09 and later, and those versions do not work.

Applying it to 21.03 gives merge conflicts.

Ah OK, thanks. So 21.09 onwards doesn’t work either; that’s good to know.

21.06 does not work and 21.12 does not work. I have not yet tested 21.09, but I can do that as well.

But I would expect that it will not work either.

It does not work with 21.09.1 either:

# ovn-sbctl list logical_flow | grep bbb.76.20.84 
match               : "ct.est && ip4 && reg0 == bbb.76.20.84 && ct_label.natted == 1 && is_chassis_resident(\"cr-lxd-net11-lr-lrp-ext\")"
actions             : "reg0 = bbb.76.20.84; ct_dnat;"
match               : "ip && ip4.dst == bbb.76.20.84"
match               : "ct.new && ip4 && reg0 == bbb.76.20.84 && is_chassis_resident(\"cr-lxd-net11-lr-lrp-ext\")"
actions             : "reg1 = bbb.76.20.84; ct_lb(backends=10.161.64.2);"
match               : "ct.new && ip4.dst == bbb.76.20.84"

Reproducer steps here:

I’ve tried the proposed patch and it is working for me.

It would be interesting to see how this option could be integrated into LXD. In the meantime: will LXD ever overwrite load balancer options?

We may add a setting in the future, but I suspect we will leave it disabled by default for performance reasons.

Right now LXD shouldn’t mess with that setting, though.

Hi @tomp ,

I have to get back to you (or others, of course) since I’ve found one thing that is still not working. I thought adding it here might be helpful since the complete setup is described in this thread.

In short: a clustered environment with three hosts, an OVN network, a failover IP, and LXD network forwards. Exactly one host is chosen by OVN as the chassis that acts as gateway to the outside world.

I now have problems reaching one container from another container via the external IP. This should be solvable with hairpinning or split DNS. However, split DNS seems too complex for me since internal DNS/DHCP is handled by OVN.

From googling around I get the feeling that OVN should support hairpinning in some scenarios. I’ve found the issue Use OVN force SNAT for load balancers (and not forwards) · Issue #10654 · lxc/lxd · GitHub. I don’t understand it all, but even when I tried the force option for SNAT it did not work.
I’ve also tried a simple iptables rule on the chassis host, but that didn’t work either.
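
To make the failure concrete, a sketch of the two client tests (assuming a container named c1 and a service listening on port 80 behind the forward):

# works from a machine outside the cluster
curl http://44.28.196.49/
# hangs/fails when run from another container on the same OVN network
lxc exec c1 -- curl http://44.28.196.49/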

I’m thinking this must be a common problem, but so far I can’t find any solutions or hints as to what I’m doing wrong.

Thanks in advance for any hints!

So, to check I understand the issue: a container inside the same OVN network as the target of a network forward is trying to connect to the forward’s listen address, and it’s not working?

Please can you show the relevant network and forward configs, along with examples of client commands that work and do not work?
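
For example (assuming the names used earlier in this thread):

lxc network show ovn0
lxc network forward show ovn0 44.28.196.49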

If I am using Jammy with OVN installed via packages, what is the workaround?

ovn-nbctl --version
ovn-nbctl 22.03.0
Open vSwitch Library 2.17.0
DB Schema 6.1.0

cat /etc/issue
Ubuntu 22.04.2 LTS \n \l

Are you experiencing the same issue as @lepokle? I didn’t get a reply to my last message confirming my understanding of the problem.