Failure in Joining First Member node to Incus Cluster

@stgraber

I have my bootstrap node up and running with:

incus cluster enable vmscloud-incus

I want to add an Incus member server to the cluster. I executed the command below on the new member server, providing that server's address:

incus config set cluster.https_address 172.16.1.65:8443

I also needed to clear the previously configured core.https_address, which was holding port 8443:

incus config set core.https_address=

Back on the bootstrap server:

scott@vmscloud-incus:~$ incus cluster add vmsfog-incus 
Member vmsfog-incus join token:
eyJzZXJ2ZXJfbmFtZSI6InZtc2ZvZy1pbmN1cyIsImZpbmdlcnByaW50IjoiN2E5MzU5YTQzYTQwZmI5MGFjNzU5MGQ4MGZiZWRjYTdkYzc5Y2QxMmY2Nzk5YzM1M2I4OWIwYWRlZWEwMDYwMyIsImFkZHJlc3NlcyI6WyIxNzIuMTYuMS4yMTk6ODQ0MyJdLCJzZWNyZXQiOiI5NmQxMzcwOGE2ZDU4NDBhNjczZDkwMTEzOWFjMThjYTJkOGExZDQ2ZjIzYzY4ZmYzOWI2ZWMzOTkyODJjMjgyIiwiZXhwaXJlc19hdCI6IjIwMjQtMDUtMDhUMjE6MDI6MDkuNTAyODUyODMxWiJ9
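As an aside, the join token is just base64-encoded JSON, so you can sanity-check it (server name, addresses, expiry) before pasting it into the joining member. `decode_join_token` is a made-up helper name, not an Incus command:

```shell
# Hypothetical helper: decode an Incus join token (base64-encoded JSON)
# so you can inspect the server_name, addresses, secret and expires_at.
decode_join_token() {
    echo "$1" | base64 -d
}
```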

Now over on the standalone server that I want to be a member server:

scott@vmsfog-incus:~$ sudo incus admin init
Would you like to use clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=172.16.1.65]: vmsfog-incus
Are you joining an existing cluster? (yes/no) [default=no]: yes
Please provide join token: eyJzZXJ2ZXJfbmFtZSI6InZtc2ZvZy1pbmN1cyIsImZpbmdlcnByaW50IjoiN2E5MzU5YTQzYTQwZmI5MGFjNzU5MGQ4MGZiZWRjYTdkYzc5Y2QxMmY2Nzk5YzM1M2I4OWIwYWRlZWEwMDYwMyIsImFkZHJlc3NlcyI6WyIxNzIuMTYuMS4yMTk6ODQ0MyJdLCJzZWNyZXQiOiIyZmRlNzJmMzk3ZDE1MjRjZWUyODUyZjU1MzYwYWJiYzZkNWZmYTM2M2NiOWRjYzY3NWMzZDgwNmQ5YzY4ZmI3IiwiZXhwaXJlc19hdCI6IjIwMjQtMDUtMDhUMjA6NDc6MzQuMjAxOTE0Njg5WiJ9
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "size" property for storage pool "default": 150GiB
Choose "source" property for storage pool "default": 
Choose "zfs.pool_name" property for storage pool "default": 
Would you like a YAML "init" preseed to be printed? (yes/no) [default=no]: 
Error: Failed to join cluster: Failed to initialize member: Failed to initialize storage pools and networks: Failed to update storage pool "default": Config key "size" is cluster member specific

It seems that no matter what I pick for the size & source values, I get this error. Any ideas?

FYI: All systems are running the same Incus version.

I discovered the answer to this. If your Incus server has EVER been a standalone server with containers, you need to stop all containers:

incus stop --all

You then need to delete all of these containers.
Sorry, there's no wildcard that I can find.
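Since there's no wildcard delete, the stop-and-delete steps can be scripted. A minimal sketch, assuming `incus list -c n -f csv` prints one instance name per line; `delete_all_instances` is a made-up helper name:

```shell
# Sketch: stop everything, then delete each instance by name, since
# there is no "incus delete --all". Assumes `incus list -c n -f csv`
# emits one instance name per line.
delete_all_instances() {
    incus stop --all
    for name in $(incus list -c n -f csv); do
        incus delete "$name"
    done
}
```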

One thing about joining a cluster as a member (after the bootstrap cluster node):

sudo incus admin init

Although the script warns that “all existing data is lost”, it doesn't actually delete that data for you. It should really say “stop here and delete all data”, because the join fails when the data hasn't been deleted.

Not only do you need to delete your containers, but you must:

incus profile device rm default root
incus profile device rm default eth0
incus network rm incusbr0
incus storage rm default
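The four removal steps above can be bundled into one helper so nothing is forgotten. This is just a sketch using the default device, network, and pool names from this thread; `reset_standalone_config` is a made-up name:

```shell
# Sketch: undo the standalone setup before joining a cluster. The device,
# network and pool names below are the usual "incus admin init" defaults;
# adjust them if your setup used different names.
reset_standalone_config() {
    incus profile device rm default root
    incus profile device rm default eth0
    incus network rm incusbr0
    incus storage rm default
}
```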

You must also:

incus config set core.https_address=

and that’s because port 8443 will be used by default for cluster communication.
After that, define your cluster address:

incus config set cluster.https_address a.b.c.d:8443

At this point, joining will actually work (note that joining a cluster requires root privileges, hence the sudo):

sudo incus admin init

I had some problems too with cluster members…
The member server I wanted to add was already a cluster itself (my bad), and I didn’t find a way to disable cluster mode on it.
I had to remove it:
sudo systemctl stop incus.service incus.socket
sudo rm -r /var/lib/incus

# and then initialise the server again as a new member:
sudo systemctl start incus.socket incus.service
# and then (from the doc: “To join a server to the cluster, run incus admin init on the cluster. Joining an existing cluster requires root privileges, so make sure to run the command as root or with sudo.”):
sudo incus admin init
etc.

I believe that’s the right way to do it, Alain :slight_smile:

Though, regarding storage volumes, depending on the storage backend, you may still have to delete them by hand :slight_smile:

Not doing so, and then trying to join a cluster with the exact same storage pool name that was used beforehand (like ‘default’ or ‘local’), could result in an error too.
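To make that concrete, here's a hedged sketch for the ZFS backend: before reusing a pool name, check whether a dataset from the old pool is still lying around. `leftover_dataset` is a made-up helper, and the dataset naming is an assumption — adjust it to your layout:

```shell
# Sketch, ZFS backend only: print the dataset name if it still exists,
# nothing otherwise. `zfs list -H -o name` lists one dataset per line;
# grep -x matches the whole line exactly.
leftover_dataset() {
    zfs list -H -o name 2>/dev/null | grep -x "$1"
}
```

If it prints anything, `sudo zfs destroy -r <dataset>` removes the leftover by hand.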

Just my two cents :slight_smile: