Multi DC LXD cluster

How do you handle LXD clusters across multiple data centers?

We've found that running a single cluster across multiple DCs is unreliable because of the latency between sites.

Do you work around that in some way, or do you run one cluster per DC?

If so, how do you handle networking across the DCs?

We rely on containers in every datacenter being resolvable from all the other DCs (e.g. under the .lxd domain).

Do you run some kind of network tunnel between the DCs? How do you get the DNS to then work?

I’m running one cluster per site, but I’m also in the lucky situation of having my own public IP allocation and full control of routing (BGP) at the different sites. That makes it easy to allocate public IPs where I want them. The bulk of my internal traffic is also on public subnets (IPv6), with DC to DC traffic secured through WireGuard tunnels.
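
To make the per-site addressing idea concrete, here is a minimal sketch of carving one allocation into per-site prefixes with Python's standard `ipaddress` module. The prefix `2001:db8::/32` is the IPv6 documentation range standing in for a real allocation, and the site names, the /48 size and both helper functions are assumptions for illustration, not the setup described above.

```python
import ipaddress

# Hypothetical allocation: 2001:db8::/32 is the IPv6 documentation prefix,
# standing in for a real public allocation.
ALLOCATION = ipaddress.ip_network("2001:db8::/32")
SITES = ["dc1", "dc2", "dc3"]  # hypothetical site names


def plan_site_prefixes(allocation, sites, new_prefix=48):
    """Assign one /new_prefix subnet of the allocation to each site, in order."""
    subnets = allocation.subnets(new_prefix=new_prefix)
    return {site: next(subnets) for site in sites}


def site_for(address, plan):
    """Return the site whose prefix contains the address, or None."""
    addr = ipaddress.ip_address(address)
    for site, prefix in plan.items():
        if addr in prefix:
            return site
    return None


plan = plan_site_prefixes(ALLOCATION, SITES)
for site, prefix in plan.items():
    print(site, prefix)
```

A plan like this only documents which prefix belongs where; announcing the prefixes over BGP and securing the tunnels between sites still has to be done separately.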

So in my case each site runs a LXD cluster along with an OVN deployment. I allocate LXD networks on OVN as needed, and when I want something to be exposed outside of a LXD network, I assign it an externally reachable address.

For DNS, I started using the new network zone feature so that all instances get forward and reverse DNS records for their automatically allocated addresses. The external addresses still need manual DNS records, though.
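
As an illustration of what a forward and reverse record pair looks like for one instance address, here is a small sketch using Python's standard `ipaddress` module. The function name, the zone name `dc1.example.net` and the instance name are hypothetical, and this is not how LXD generates its records internally; it only shows the shape of the data.

```python
import ipaddress


def dns_records(instance, address, forward_zone):
    """Return (forward, reverse) records as zone-file style strings."""
    addr = ipaddress.ip_address(address)
    # AAAA for IPv6 addresses, A for IPv4 addresses.
    rtype = "AAAA" if addr.version == 6 else "A"
    fqdn = f"{instance}.{forward_zone}."
    forward = f"{fqdn} {rtype} {addr}"
    # reverse_pointer builds the in-addr.arpa / ip6.arpa name for the address.
    reverse = f"{addr.reverse_pointer}. PTR {fqdn}"
    return forward, reverse


# Hypothetical instance and zone names, documentation address range.
fwd, rev = dns_records("web1", "2001:db8:1::10", "dc1.example.net")
print(fwd)
print(rev)
```

The manual records for external addresses follow the same pattern, they just are not generated automatically.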

But that’s certainly not a trivial setup, as it requires you to do your own site to site routing and tunneling and to run DNS infrastructure yourself, if only to import the DNS zones from the various LXD clusters and provide resolution for everything else.
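
The "import the zones and resolve everything else" part boils down to a dispatch decision per query. Below is a rough sketch of that decision in plain Python, assuming one zone per cluster; the zone names, server addresses and the `pick_server` helper are all hypothetical, and a real deployment would express this in the DNS server's own configuration (zone transfers or forwarding) rather than in code like this.

```python
# Hypothetical zone names and DNS server addresses, one per LXD cluster.
CLUSTER_ZONES = {
    "dc1.example.net": "2001:db8::53",
    "dc2.example.net": "2001:db8:1::53",
}
UPSTREAM = "2001:db8:ffff::53"  # hypothetical resolver for everything else


def pick_server(name, zones=CLUSTER_ZONES, upstream=UPSTREAM):
    """Return the server for the longest matching zone suffix, else upstream."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then drop one leading label at a time,
    # so the longest (most specific) zone wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return zones[candidate]
    return upstream


print(pick_server("web1.dc1.example.net"))
print(pick_server("www.example.org"))
```

Matching on whole labels rather than raw string suffixes avoids sending a name like `xdc1.example.net` to the `dc1.example.net` cluster by mistake.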
