OVN high availability cluster tutorial

I’m building infrastructure where every user gets their own isolated subnet. The number of users can grow indefinitely.
You guys are the best btw :heart:


OK, manually setting the subnet address worked!

lxc network create ovn0 --type=ovn network=lxdbr0 ipv4.address=10.0.0.1/24 ipv4.nat=true
lxc network create ovn1 --type=ovn network=lxdbr0 ipv4.address=10.0.0.1/24 ipv4.nat=true
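
As a quick usage check (a sketch; the image alias and instance names are just placeholders), one instance can be attached to each per-user network and they stay isolated from each other even though both networks use the same subnet:

# Each OVN network is its own L2/L3 domain behind its own virtual router,
# so both can reuse 10.0.0.1/24 without clashing.
lxc launch ubuntu:22.04 user0-c1 --network ovn0
lxc launch ubuntu:22.04 user1-c1 --network ovn1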


There is now also a YouTube tutorial that runs through this topic. It might be good to mention it at the top somewhere.

https://youtu.be/1M__Rm9iZb8

Thanks. Please note the OVN_CTL_OPTS in that video are slightly incorrect, which will result in multiple standalone clusters rather than the desired single OVN cluster. So be sure to follow the OVN_CTL_OPTS settings in this tutorial instead.

Alternatively, these steps are now also in our official documentation.
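
For reference, the clustered OVN_CTL_OPTS in /etc/default/ovn-central look roughly like this (a sketch following the documented pattern; <local> and <server_N> are placeholders for your members' addresses, and the two cluster-remote-addr lines are only set on members joining the first one):

# /etc/default/ovn-central on each cluster member
OVN_CTL_OPTS=" \
  --db-nb-addr=<local> \
  --db-nb-create-insecure-remote=yes \
  --db-sb-addr=<local> \
  --db-sb-create-insecure-remote=yes \
  --db-nb-cluster-remote-addr=<first member> \
  --db-sb-cluster-remote-addr=<first member> \
  --db-nb-cluster-local-addr=<local> \
  --db-sb-cluster-local-addr=<local> \
  --ovn-northd-nb-db=tcp:<server_1>:6641,tcp:<server_2>:6641,tcp:<server_3>:6641 \
  --ovn-northd-sb-db=tcp:<server_1>:6642,tcp:<server_2>:6642,tcp:<server_3>:6642"

The --db-*-cluster-remote-addr options are what make each member join the existing cluster rather than bootstrap its own standalone database, which is exactly the failure mode mentioned above.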

I have followed this guide to the letter and I am not able to get the container to ping outbound. I am able to ping between two containers that are set up across two different LXD cluster VM instances.

I am not able to ping the OVN router from the LXD host that contains the new v1, v2, v3 VMs. How can I troubleshoot this problem?
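
One way to start narrowing that down (a sketch, assuming the OVN network is called ovn0 with lxdbr0 as its uplink): find the address the OVN virtual router holds on the uplink and check whether host firewall rules are dropping forwarded traffic.

# Address the OVN router was allocated from ipv4.ovn.ranges on the uplink
lxc network get ovn0 volatile.network.ipv4.address

# Ping that address from the LXD host carrying lxdbr0
ping -c 3 <address from the previous command>

# Look for DROP/REJECT policies or rules on forwarded traffic
sudo iptables -S FORWARD
sudo nft list ruleset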

I figured out my issue. Docker and its firewall rules had been installed and were causing the problem. I had been working on Ceph and installing it via Docker.


Great! See also How to configure your firewall - LXD documentation
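
For anyone else hitting this: Docker sets the iptables FORWARD policy to DROP, which also drops traffic forwarded for LXD bridges. A rough fix along the lines of that documentation (assuming the uplink bridge is lxdbr0) is to allow the bridge explicitly in Docker's DOCKER-USER chain:

# Docker flips the FORWARD chain policy to DROP
sudo iptables -S FORWARD | head -n1

# Explicitly allow traffic from/to the LXD bridge
sudo iptables -I DOCKER-USER -i lxdbr0 -j ACCEPT
sudo iptables -I DOCKER-USER -o lxdbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT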

Hello,

running into the same problem as @rocket reported. I followed all the steps listed above and can ping between instances on the new OVN network. However, external ping fails.

My base system is RedHat 8 with no firewall, Docker, or similar installed, just LXD 5.3 and OVN 22.06.

I traced it down using tcpdump -i lxdbr0 -nn on the chassis host, where I can see the following:

05:40:02.491889 IP 10.179.176.11 > 8.8.8.8: ICMP echo request, id 137, seq 6, length 64
05:40:02.493614 IP 8.8.8.8 > 10.179.176.11: ICMP echo reply, id 137, seq 6, length 64
05:40:02.564081 ARP, Request who-has 10.179.176.11 tell 10.179.176.1, length 28
05:40:02.564379 ARP, Reply 10.179.176.11 is-at 00:16:3e:70:07:68, length 28
05:40:03.516146 IP 10.179.176.11 > 8.8.8.8: ICMP echo request, id 137, seq 7, length 64
05:40:03.517916 IP 8.8.8.8 > 10.179.176.11: ICMP echo reply, id 137, seq 7, length 64
05:40:04.539968 IP 10.179.176.11 > 8.8.8.8: ICMP echo request, id 137, seq 8, length 64
05:40:04.541690 IP 8.8.8.8 > 10.179.176.11: ICMP echo reply, id 137, seq 8, length 64

So it seems like packets are leaving the OVN network onto lxdbr0 and there is a response, but it doesn’t reach the container.
Performing the same with a container on the host where the chassis is located, it looks like this:

05:49:19.428839 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 13, length 64
05:49:20.452426 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 14, length 64
05:49:21.476426 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 15, length 64
05:49:22.500506 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 16, length 64
05:49:23.524437 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 17, length 64
05:49:24.548510 IP 10.41.109.4 > 8.8.8.8: ICMP echo request, id 133, seq 18, length 64

The OVN config is the following:

lxc network show ovn0
config:
  bridge.mtu: "1442"
  ipv4.address: 10.41.109.1/24
  ipv4.nat: "true"
  ipv6.address: none
  network: lxdbr0
  volatile.network.ipv4.address: 10.179.176.11
description: ""
name: ovn0
type: ovn
used_by:
- /1.0/instances/c1
- /1.0/instances/c2
- /1.0/instances/c3
managed: true
status: Created
locations:
- test-10
- test-11
- test-12

Uplink config:

lxc network show lxdbr0 --target=test-12
config:
  ipv4.address: 10.179.176.1/24
  ipv4.dhcp.ranges: 10.179.176.5-10.179.176.10
  ipv4.nat: "true"
  ipv4.ovn.ranges: 10.179.176.11-10.179.176.20
  ipv6.address: none
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/networks/ovn0
managed: true
status: Created
locations:
- test-10
- test-11
- test-12

Any instance created on lxdbr0 works as expected.

Would appreciate any pointers for further debugging.
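
A few OVN-side checks that usually help localise this kind of problem (a sketch; run on a cluster member that can reach the OVN databases):

# Which chassis currently hosts the router's gateway (chassisredirect) port
sudo ovn-sbctl show

# Logical switches and routers LXD created for ovn0
sudo ovn-nbctl show

# Confirm geneve tunnel ports exist towards the other chassis on br-int
sudo ovs-vsctl show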

Quick update:

It started to work after a system upgrade and full reboot. It may be related to a newer kernel or some other network libraries…

Sometimes it is just like this.


@tomp would it make any sense to have the LXD client warn if it detects Docker and require the user to ack the warning? The warning could link to the firewalling page and the possible networking issues an unsuspecting user might encounter.

@tomp

sudo ovs-vsctl set open_vswitch . \
    external_ids:ovn-encap-type=geneve \
    external_ids:ovn-remote="unix:/var/run/ovn/ovnsb_db.sock" \
    external_ids:ovn-encap-ip=$(ip r get 10.98.30.1 | grep -v cache | awk '{print $5}')

In this tutorial the unix socket for ovn-remote is specified.
However, in "How to set up OVN with LXD" the three tcp definitions on port 6642 are used.

Is there a difference?
What is best to use on my three cluster members?

You must use the remote tcp definitions when using a cluster.
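
So on a three-member cluster the ovs-vsctl call would look roughly like this (a sketch; the addresses are placeholders for your members' IPs):

sudo ovs-vsctl set open_vswitch . \
    external_ids:ovn-remote="tcp:<server_1>:6642,tcp:<server_2>:6642,tcp:<server_3>:6642"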


Good tutorial, thank you so much. I think everything is working as expected. My issue is just understanding what I can do with OVN. I read this: “OVN will put users in control over cloud network resources, by allowing users to connect groups of VMs or containers into private L2 and L3”.
Well, can anyone give a practical example of what is possible, in very simple words? I’ve been using Linux since 2009 and I’m still a perfect noob at understanding damned networks! :slight_smile: But since OVN switches and routes for me, it could be something really good!