I’ve got a 5-node lab cluster, and node1’s IP was used as cluster.https_address. When I had to do a little maintenance (add an HDD), I ran incus admin os system poweroff --target node1. While it was off, I thought I would check some storage settings on the other nodes to cross-check… much to my surprise, I was not able to connect to the cluster (since that IP was now gone).
It seems like maybe that’s not the right thing to do. Can I run keepalived on all the nodes to share a VIP, or is that not an option since it’s IncusOS? I don’t have any external load balancers, but I do have round-robin DNS, so should I have my cluster’s cluster.https_address be a DNS record, and then add an A record for each node?
Ideally, for production environments, we recommend running behind a load balancer with backend connectivity checks, so traffic is only sent to servers that are responsive.
But when that’s not an option, DNS round-robin is another viable option. Not all DNS servers handle round-robin correctly; some will keep a fixed order, but that’s still usually good enough for this kind of thing.
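For anyone wanting to see what round-robin looks like concretely on a traditional DNS server, it’s just multiple A records for the same name. A minimal sketch of a BIND-style zone snippet, using the hypothetical my-cluster.lan name and 10.123.0.x addresses that come up later in this thread:

```
; Hypothetical BIND-style zone snippet: one A record per cluster member.
; Resolvers receive the full set and (ideally) rotate the order per query.
my-cluster.lan.  300  IN  A  10.123.0.11
my-cluster.lan.  300  IN  A  10.123.0.12
my-cluster.lan.  300  IN  A  10.123.0.13
my-cluster.lan.  300  IN  A  10.123.0.14
my-cluster.lan.  300  IN  A  10.123.0.15
```

A short TTL (300 seconds here) keeps clients from caching a powered-off node’s address for too long.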
Hello @stgraber!!! Big fan; thank you for your help, it worked a treat!
For anyone else: this was more than good enough for the lab.
Summary:
1. In my UDM Pro’s DNS records/policy, I added five new A records for my-cluster.lan, one pointing to the IP of each server.
2. On my client I ran incus remote add my-cluster.lan and then incus remote switch my-cluster.lan.
3. I then powered off node1, ran incus ls a few times, and got a response each time!
4. At first I was a little confused and wondered if I needed to do something like incus cluster set cluster.https_address, but then realized that each server has its own IP as that setting (see the output below):
```
21:36:21 ~ $ for i in {1..5}; do incus config get cluster.https_address --target node${i}; done
10.123.0.11:8443
10.123.0.12:8443
10.123.0.13:8443
10.123.0.14:8443
10.123.0.15:8443
```
In case anyone else is similarly confused.
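If you want to sanity-check the setup from the client side, here’s a minimal sketch (assuming dig is installed and the my-cluster.lan remote from the summary above is the active one):

```
# Confirm the name resolves to all five node addresses
# (the order may rotate from one query to the next).
dig +short my-cluster.lan

# With node1 powered off, repeated cluster queries should still succeed,
# since the client can fall back to the remaining addresses.
for i in {1..5}; do
    incus ls >/dev/null && echo "attempt ${i}: OK"
done
```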
And yes, the browser also works for https://my-cluster.lan:8443 if so inclined.