I set up LXD clustering on an existing LXD node today with lxc cluster enable <name>, and then added a fresh node to it. However, after rebooting all nodes, something strange seems to have happened, leaving LXD in a partial state:
tobias@artemis:~$ lxc cluster list
Error: Server is not clustered
tobias@artemis:~$ lxc cluster enable artemis
Error: This LXD server is already clustered
I’m not entirely sure what is going on here or how to get the cluster back up and running. Hopefully someone here does.
Environment
LXD 4.22
uname -a: Linux artemis 5.4.0-97-generic #110-Ubuntu SMP Thu Jan 13 18:22:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Hmm, can you try running systemctl reload snap.lxd.daemon to have LXD restart and see if that helps? If not, look at /var/snap/lxd/common/lxd/logs/lxd.log for some relevant errors.
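To make the log check above a bit quicker, here is a minimal sketch of a filter for the log file. The function name lxd_log_problems and the match pattern are my own assumptions, not something LXD provides; the pattern is only a guess at common level markers, so widen it if nothing shows up.

```shell
# Hypothetical helper (not part of LXD): print the last N lines of the log
# that look like errors or warnings.
# $1 = log file (defaults to the path mentioned above), $2 = max lines (default 20).
lxd_log_problems() {
    log_file="${1:-/var/snap/lxd/common/lxd/logs/lxd.log}"
    max_lines="${2:-20}"
    # Assumed pattern: matches "eror", "error", "warn" or "fail" in any case.
    grep -iE 'eror|error|warn|fail' "$log_file" | tail -n "$max_lines"
}
```

A typical run would be sudo systemctl reload snap.lxd.daemon followed by lxd_log_problems. Reading the log may require a root shell.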
Something that seems odd to me is the extra " appended at the end of the FQDN, though that might just be the logging. Running dig on the address returns the correct IP address too. I’m unsure how to change it to the IP itself (10.10.2.4) to see if that helps, as LXD tells me that changing the address is not supported.
After fiddling with systemd for too long trying to get the LXD daemon to stay quiet, I’ve been able to get it working. Changing artemis.array21.dev to 10.10.2.2 made LXD realize it is in a cluster again.
Maybe, but I’m not yet sure where the bug is. LXD was trying to resolve artemis.array21.dev, which seems correct, but that was failing for some reason.
It may be snap related. In that case, can you try nsenter --mount=/run/snapd/ns/lxd.mnt getent hosts artemis.array21.dev and see if that resolves properly too?
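If it helps, the comparison between the host and the snap's mount namespace can be sketched as below. The helper names first_address and compare_resolution are hypothetical, made up for this example; only getent and the nsenter invocation come from the suggestion above.

```shell
# Hypothetical helper: read `getent hosts` output on stdin and print the
# address (first field) of the first line.
first_address() {
    awk 'NR == 1 { print $1 }'
}

# Hypothetical helper: compare the address seen on the host ($1) with the
# one seen inside the snap mount namespace ($2).
compare_resolution() {
    host_ip="$1"
    snap_ip="$2"
    if [ -z "$snap_ip" ]; then
        echo "snap namespace: no result"
    elif [ "$host_ip" = "$snap_ip" ]; then
        echo "match: $host_ip"
    else
        echo "mismatch: host=$host_ip snap=$snap_ip"
    fi
}

# Example usage (needs root for nsenter, so it is left commented out):
# host_ip="$(getent hosts artemis.array21.dev | first_address)"
# snap_ip="$(sudo nsenter --mount=/run/snapd/ns/lxd.mnt getent hosts artemis.array21.dev | first_address)"
# compare_resolution "$host_ip" "$snap_ip"
```

A "no result" or "mismatch" line would suggest the name resolves differently inside the snap environment than on the host, which would fit the failure seen here.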