How to Set Up an LXD Cluster with an LXC network bridge and tunnels

Essentially, I have two VMs. Both have identical lxc-managed bridges (created with lxc network create).
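
For reference, a bridge like the ones shown further down could be created along these lines. This is a sketch reconstructed from the `lxc network show` output for micro-vm-2, not necessarily the exact command that was used. `bridge_create_cmd` is a hypothetical helper that only prints the command so it can be reviewed first, and it covers a single tunnel.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an `lxc network create` command for a
# managed bridge with one VXLAN tunnel. The keys mirror the config shown in
# this post; the values must be adjusted per host.
bridge_create_cmd() {
  bridge=$1; cidr=$2; tid=$3; local_ip=$4; remote_ip=$5
  printf 'lxc network create %s ipv4.address=%s ipv4.nat=true' "$bridge" "$cidr"
  printf ' tunnel.%s.protocol=vxlan tunnel.%s.id=%s' "$tid" "$tid" "$tid"
  printf ' tunnel.%s.interface=eth0 tunnel.%s.local=%s tunnel.%s.remote=%s\n' \
    "$tid" "$tid" "$local_ip" "$tid" "$remote_ip"
}

# micro-vm-2's tunnel 112 towards 10.0.0.4; pipe the output to `sh` to run it.
bridge_create_cmd mcbr_bridge 192.168.0.7/16 112 10.0.0.6 10.0.0.4
```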

I successfully ran lxd init on one host, selecting the existing bridge, but when I try to join from the other host I keep getting "connection refused". I'll include as much information as I can.

On micro-vm-2:

microcloud@micro-vm-2:~$ snap list
Name    Version         Rev    Tracking       Publisher   Notes
core22  20240408        1380   latest/stable  canonical✓  base
lxd     5.21.1-d46c406  28460  5.21/stable    canonical✓  -
snapd   2.63            21759  latest/stable  canonical✓  snapd

microcloud@micro-vm-2:~$ lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=10.0.0.6]: 192.168.0.7
Are you joining an existing cluster? (yes/no) [default=no]: no
What member name should be used to identify this server in the cluster? [default=micro-vm-2]:
Do you want to configure a new local storage pool? (yes/no) [default=yes]:
Name of the storage backend to use (lvm, zfs, btrfs, dir) [default=zfs]:
Create a new ZFS pool? (yes/no) [default=yes]:
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]:
Size in GiB of the new loop device (1GiB minimum) [default=5GiB]:
Do you want to configure a new remote storage pool? (yes/no) [default=no]:
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: mcbr_bridge
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]:
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:

microcloud@micro-vm-2:~$ lxc network show mcbr_bridge # details of the managed bridge.
name: mcbr_bridge
description: ""
type: bridge
managed: true
status: Created
config:
  ipv4.address: 192.168.0.7/16
  ipv4.nat: "true"
  ipv6.address: fd42:b9f9:d423:14bb::1/64
  ipv6.nat: "true"
  tunnel.012.id: "012"
  tunnel.012.interface: eth0
  tunnel.012.local: 10.0.0.6
  tunnel.012.protocol: vxlan
  tunnel.012.remote: 10.0.0.5
  tunnel.112.id: "112"
  tunnel.112.interface: eth0
  tunnel.112.local: 10.0.0.6
  tunnel.112.protocol: vxlan
  tunnel.112.remote: 10.0.0.4
used_by:
- /1.0/profiles/default
locations:
- micro-vm-2

microcloud@micro-vm-2:~$ lxc cluster list
+------------+--------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
|    NAME    |           URL            |      ROLES      | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+------------+--------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| micro-vm-2 | https://192.168.0.7:8443 | database-leader | x86_64       | default        |             | ONLINE | Fully operational |
|            |                          | database        |              |                |             |        |                   |
+------------+--------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+

Firewall status:

microcloud@micro-vm-2:~$ systemctl status ufw
Unit ufw.service could not be found.
microcloud@micro-vm-2:~$ sudo iptables -L -n -v
sudo: iptables: command not found
microcloud@micro-vm-2:~$ sudo nft list ruleset
sudo: nft: command not found
microcloud@micro-vm-2:~$ lxc network set mcbr_bridge ipv6.firewall false
microcloud@micro-vm-2:~$ lxc network set mcbr_bridge ipv4.firewall false

# token to add micro-vm-1
microcloud@micro-vm-2:~$ lxc cluster add micro-vm-1
Member micro-vm-1 join token:
eyJzZXJ2ZXJfbmFtZSI6Im1pY3JvLXZtLTEixxxxxx
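
The join token looks like base64-encoded JSON, so decoding it is one way to see what the joining node is being told. The sketch below decodes only the unredacted prefix of the token above; `decode_token` is a hypothetical helper. That a full token also carries the cluster addresses is an assumption about the token layout, not something shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: decode a base64 string and print it with a newline.
decode_token() {
  printf '%s' "$1" | base64 -d
  echo
}

# Only the visible (unredacted) prefix of the token is decoded here.
decode_token 'eyJzZXJ2ZXJfbmFtZSI6Im1pY3JvLXZtLTEi'
# prints {"server_name":"micro-vm-1"
```

If a full token does list addresses, it is worth checking that they match the address given to lxd init on micro-vm-2 (192.168.0.7).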

For these VMs, I deployed a slimmed-down Ubuntu 24.04 (Noble Numbat) image on Azure to rule out routing and firewall guesswork. The result is exactly the same with 22.04.4 LTS (Jammy Jellyfish) on DigitalOcean, which had ufw, iptables and nft; disabling them or adding allow rules had no effect either.

On micro-vm-1:

microcloud@micro-vm-1:~$ snap list
Name    Version         Rev    Tracking       Publisher   Notes
core22  20240408        1380   latest/stable  canonical✓  base
lxd     5.21.1-d46c406  28460  5.21/stable    canonical✓  -
snapd   2.63            21759  latest/stable  canonical✓  snapd

microcloud@micro-vm-1:~$ # performed a curl to test connectivity
microcloud@micro-vm-1:~$ curl -X POST --insecure https://192.168.0.7:8443/internal/cluster/accept
{"type":"error","status":"","status_code":0,"operation":"","error_code":403,"error":"not authorized","metadata":null}

microcloud@micro-vm-1:~$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=10.0.0.4]: 192.168.0.6
Are you joining an existing cluster? (yes/no) [default=no]: yes
Do you have a join token? (yes/no/[token]) [default=no]: yes
Please provide join token: eyJzZXJ2ZXJfbmFtZSI6Im1pY3JvLXZtxxxxxxxx
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "size" property for storage pool "local":
Choose "source" property for storage pool "local":
Choose "zfs.pool_name" property for storage pool "local":
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
Error: Failed to join cluster: Failed request to add member: Post "https://192.168.0.7:8443/internal/cluster/accept": Unable to connect to: 192.168.0.7:8443 ([dial tcp 192.168.0.7:8443: connect: connection refused])
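
Since the error above is "connection refused" rather than a timeout, one thing that could be checked on micro-vm-2 is which local addresses are actually listening on port 8443 at the moment of the join. This is a diagnostic sketch, not a known fix; `listeners_on_port` is a hypothetical helper that filters `ss -tln` output.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -tln` output on stdin and print every local
# address that is listening on the given TCP port.
listeners_on_port() {
  awk -v suffix=":$1" '
    $1 == "LISTEN" {
      n = length($4) - length(suffix)
      # Keep rows whose Local Address:Port column ends in ":<port>".
      if (n >= 0 && substr($4, n + 1) == suffix) print $4
    }'
}

# Run on micro-vm-2. An empty result would suggest nothing is bound to the
# port; a result without 192.168.0.7 would suggest a different bind address.
if command -v ss >/dev/null 2>&1; then
  ss -tln | listeners_on_port 8443
fi
```

The configured value can be compared with `lxc config get core.https_address` and `lxc config get cluster.https_address` on the same host.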

microcloud@micro-vm-1:~$ lxc network show mcbr_bridge
If this is your first time running LXD on this machine, you should also run: lxd init
To start your first container, try: lxc launch ubuntu:22.04
Or for a virtual machine: lxc launch ubuntu:22.04 --vm

name: mcbr_bridge
description: ""
type: bridge
managed: true
status: Created
config:
  ipv4.address: 192.168.0.6/16
  ipv4.nat: "true"
  ipv6.address: fd42:d423:14bb:8d0c::1/64
  ipv6.nat: "true"
  tunnel.011.id: "011"
  tunnel.011.interface: eth0
  tunnel.011.local: 10.0.0.4
  tunnel.011.protocol: vxlan
  tunnel.011.remote: 10.0.0.5
  tunnel.112.id: "112"
  tunnel.112.interface: eth0
  tunnel.112.local: 10.0.0.4
  tunnel.112.protocol: vxlan
  tunnel.112.remote: 10.0.0.6
used_by: []
locations:
- none

Firewall status:

microcloud@micro-vm-1:~$ systemctl status ufw
Unit ufw.service could not be found.
microcloud@micro-vm-1:~$ sudo iptables -L -n -v
sudo: iptables: command not found
microcloud@micro-vm-1:~$ sudo nft list ruleset
sudo: nft: command not found

During the join attempt I can see identical encrypted traffic on both hosts when I run the commands below, but nothing tells me why the traffic was dropped.

tcpdump --immediate-mode -nn -i mcbr_bridge dst port 8443
# or
sudo tcpflow -c -e http dst port 8443 -i mcbr_bridge
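
The captures above filter on dst port 8443, so they only show packets going towards the API and would miss a reset coming back from it. A possible refinement is to match the port in both directions and keep only SYN and RST packets, which makes it easier to see which side refuses the connection. `handshake_filter` is a hypothetical helper; the filter expression uses standard pcap-filter syntax.

```shell
#!/bin/sh
# Hypothetical helper: build a pcap filter that matches TCP packets to or
# from the given port that have the SYN or RST flag set.
handshake_filter() {
  printf 'tcp port %s and (tcp[tcpflags] & (tcp-syn|tcp-rst) != 0)' "$1"
}

# Print the capture command for review; run the printed line on each host
# while retrying the join.
printf "sudo tcpdump --immediate-mode -nn -i mcbr_bridge '%s'\n" "$(handshake_filter 8443)"
```

Running the same capture with -i eth0 as well may show whether the packets ever leave through the VXLAN tunnel.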

FYI: the cluster join works fine when the bridge is created with brctl addbr, but I specifically need the bridge created by lxc because it seems to work well with mDNS discovery in MicroCloud.

I’d really appreciate suggestions and solutions.