I am running a 6-node cluster. About a month ago, one of our nodes needed repairs and went offline. This was not an issue until today, when I noticed that my cluster was in a blocked state following the latest snap update to LXD 5.21: the offline node is the only one still on 5.20. After reading the docs a bit closer (How to manage a cluster - LXD documentation), I now see I should have removed that node from the cluster before it went away.
I am unsure how to go about recovering my cluster. Is it possible to somehow remove the offline node from the cluster without access to lxc cluster remove, given that all the lxc commands hang?
Here are the nodes in question (glf-science-5 is the offline one, and the only one with a lower schema and api_extensions value):
stoyelq@glf-science-1:~$ lxd sql global "SELECT id, name, schema, api_extensions, heartbeat, state, arch FROM nodes"
+----+---------------+--------+----------------+-------------------------------------+-------+------+
| id | name          | schema | api_extensions | heartbeat                           | state | arch |
+----+---------------+--------+----------------+-------------------------------------+-------+------+
| 1  | glf-science-1 | 73     | 382            | 2024-04-12T14:36:44.902403155-03:00 | 0     | 2    |
| 2  | glf-science-3 | 73     | 382            | 2024-04-12T14:36:48.828686825-03:00 | 0     | 2    |
| 4  | glf-science-2 | 73     | 382            | 2024-04-12T14:36:48.483196507-03:00 | 0     | 2    |
| 6  | glf-science-0 | 73     | 382            | 2024-04-12T14:36:47.26775565-03:00  | 0     | 2    |
| 7  | glf-science-4 | 73     | 382            | 2024-04-12T14:36:45.948198612-03:00 | 0     | 2    |
| 8  | glf-science-5 | 69     | 370            | 2024-03-01T10:30:03.816786445-04:00 | 0     | 2    |
+----+---------------+--------+----------------+-------------------------------------+-------+------+
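For anyone who wants to pick out the lagging member from this kind of output without reading it by eye, here is a minimal Python sketch. It is only an illustration and not part of LXD: the function names parse_nodes and lagging_nodes are hypothetical, and it assumes that a member is "behind" when its (schema, api_extensions) pair is lower than the highest pair in the table, which is what the rows above suggest.

```python
# Rows copied from the query output above (borders omitted for brevity).
SAMPLE = """
| id | name | schema | api_extensions | heartbeat | state | arch |
| 1 | glf-science-1 | 73 | 382 | 2024-04-12T14:36:44.902403155-03:00 | 0 | 2 |
| 2 | glf-science-3 | 73 | 382 | 2024-04-12T14:36:48.828686825-03:00 | 0 | 2 |
| 4 | glf-science-2 | 73 | 382 | 2024-04-12T14:36:48.483196507-03:00 | 0 | 2 |
| 6 | glf-science-0 | 73 | 382 | 2024-04-12T14:36:47.26775565-03:00 | 0 | 2 |
| 7 | glf-science-4 | 73 | 382 | 2024-04-12T14:36:45.948198612-03:00 | 0 | 2 |
| 8 | glf-science-5 | 69 | 370 | 2024-03-01T10:30:03.816786445-04:00 | 0 | 2 |
"""


def parse_nodes(table_text):
    """Turn the pipe-delimited table text into a list of dicts (hypothetical helper)."""
    header = None
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line is the column header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def lagging_nodes(rows):
    """Return names of members whose (schema, api_extensions) is below the maximum.

    Assumption: a lower pair means the member is on an older version.
    """
    def version(row):
        return (int(row["schema"]), int(row["api_extensions"]))

    newest = max(version(row) for row in rows)
    return [row["name"] for row in rows if version(row) < newest]


behind = lagging_nodes(parse_nodes(SAMPLE))
print(behind)
```

The sketch only reads text that has already been captured from the query; it does not talk to the daemon, so it works even while the lxc commands hang.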
So far, I have also tried downgrading all the nodes back to 5.20 to see if that would let me remove the node, but all lxc commands then failed with:
stoyelq@glf-science-3:~$ lxc ls
Error: Get "http://unix.socket/1.0": EOF
The daemon status also showed a stack of messages like this one: Error: Failed to initialize global database: failed to ensure schema: this node's version is behind, please upgrade