Failed to join cluster due to missing "config" part in "lxc profile show"

I set up the first node of a cluster a few weeks ago. Now I’m trying to add a new node. The storage backend is LVM. The lxd init on the new node ended with this message.

Error: Failed to join cluster: Failed request to add member: Mismatching config for storage pool local: different values for keys: lvm.thinpool_name

Despite the failure, the storage pool was created on the new node, and its config shows:

$ sudo lxc storage show local
config:
  lvm.thinpool_name: LXDThinPool
  lvm.vg_name: local
  source: local
  volatile.initial_source: local
description: ""
name: local
driver: lvm
used_by: []
status: Created
locations:
- none

However, on the first cluster node the config is empty.

$ sudo lxc storage show local
config: {}
description: ""
name: local
driver: lvm
used_by:
- /1.0/images/4e15a9bde9a8d5d5e96b722c32b047f78aa0bd686a2755b2d428bd665c6a37de
- /1.0/instances/artifac
- /1.0/instances/gitlab-ci-01
- /1.0/instances/gitlab-ci-03
- /1.0/instances/jenkins-master1
- /1.0/instances/jenkins-slave001
- /1.0/profiles/bigvol_pub
- /1.0/profiles/default
- /1.0/profiles/default_pub
status: Created
locations:
- ijssel

The name of the thin pool on the first node is LXDThinPool, exactly as expected. When I try to set it anyway, I get an error from the underlying lvrename.

$ sudo lxc storage set local lvm.thinpool_name=LXDThinPool
Error: Error renaming LVM thin pool from "LXDThinPool" to "LXDThinPool": Failed to run: lvrename local LXDThinPool LXDThinPool: Old and new logical volume names must differ
  Run `lvrename --help' for more information.

What version of LXD is that?

The first node runs snap lxd 4.3 (on Ubuntu 20.04).
The new node I installed fresh today with Ubuntu 20.04, and it was running lxd snap 4.0.1.
In the meantime I have upgraded it to lxd snap 4.3, but I haven’t retried yet. First I need to clear the failed attempt before I can run lxd init again.
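For the record, my plan for clearing the new node is simply to remove and reinstall the snap. This is just my own brute-force approach, and it should be safe only because the node holds no instances or data yet:

# wipe the failed LXD state on the new node (it has no data to keep), then reinstall
snap remove --purge lxd
snap install lxd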

Ok, you indeed need identical versions on both nodes, though I would have expected a clearer error if that’s the issue.
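A quick way to confirm the versions match is to check the installed snap on each node:

snap list lxd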

Isn’t it the problem that there is no “config” on the first node?

Try lxc storage show local --target ijssel as those config keys are node-specific.

On my new node (luts) that gives

$ sudo lxc storage show local --target ijssel
Error: No cluster member called 'ijssel'

That is understandable, because the lxd init failed.
On ijssel that command gives

lxc storage show local --target ijssel
config:
  lvm.vg_name: local
  source: /var/snap/lxd/common/lxd/disks/local.img
description: ""
name: local
driver: lvm
used_by:
- /1.0/images/4e15a9bde9a8d5d5e96b722c32b047f78aa0bd686a2755b2d428bd665c6a37de
- /1.0/instances/artifac
- /1.0/instances/gitlab-ci-01
- /1.0/instances/gitlab-ci-03
- /1.0/instances/jenkins-master1
- /1.0/instances/jenkins-slave001
- /1.0/profiles/bigvol_pub
- /1.0/profiles/default
- /1.0/profiles/default_pub
status: Created
locations:
- ijssel

Retried lxd init with snap lxd 4.3 on both nodes. This gives the same error.

Error: Failed to join cluster: Failed request to add member: Mismatching config for storage pool local: different values for keys: lvm.thinpool_name

@tomp can you take a look at that one?

@stgraber certainly.

So the first thing to check is whether LXD 4.3 allows a fresh LVM-based cluster to be spun up. For this I’m using LXD’s own VM support to run a cluster of 3 Ubuntu Focal machines, with each node having access to a 5GB block device from the host, which will be the basis for the LVM storage pool.

First, let’s create the sparse block files on the LXD host:

truncate -s 5G /home/user/cluster-v1.img
truncate -s 5G /home/user/cluster-v2.img
truncate -s 5G /home/user/cluster-v3.img

Now let’s create the VMs for our cluster and add the extra disks to them:

lxc init images:ubuntu/focal cluster-v1 --vm
lxc init images:ubuntu/focal cluster-v2 --vm
lxc init images:ubuntu/focal cluster-v3 --vm
lxc config device add cluster-v1 lvm disk source=/home/user/cluster-v1.img
lxc config device add cluster-v2 lvm disk source=/home/user/cluster-v2.img
lxc config device add cluster-v3 lvm disk source=/home/user/cluster-v3.img

Now let’s start the VMs:

lxc start cluster-v1 cluster-v2 cluster-v3

Wait for them to boot:

lxc ls
+------------+---------+------------------------+------------------------------------------------+-----------------+-----------+
|    NAME    |  STATE  |          IPV4          |                      IPV6                      |      TYPE       | SNAPSHOTS |
+------------+---------+------------------------+------------------------------------------------+-----------------+-----------+
| cluster-v1 | RUNNING | 10.109.89.59 (enp5s0)  | fd42:d37c:f0f2:a5f:216:3eff:fe4e:652b (enp5s0) | VIRTUAL-MACHINE | 0         |
+------------+---------+------------------------+------------------------------------------------+-----------------+-----------+
| cluster-v2 | RUNNING | 10.109.89.234 (enp5s0) | fd42:d37c:f0f2:a5f:216:3eff:fe64:7eff (enp5s0) | VIRTUAL-MACHINE | 0         |
+------------+---------+------------------------+------------------------------------------------+-----------------+-----------+
| cluster-v3 | RUNNING | 10.109.89.119 (enp5s0) | fd42:d37c:f0f2:a5f:216:3eff:fe95:9664 (enp5s0) | VIRTUAL-MACHINE | 0         |
+------------+---------+------------------------+------------------------------------------------+-----------------+-----------+

Now let’s install LXD 4.3 on each node:

lxc exec cluster-v1 -- apt install snapd lvm2 -y
lxc exec cluster-v1 -- snap install lxd
lxc exec cluster-v2 -- apt install snapd lvm2 -y
lxc exec cluster-v2 -- snap install lxd
lxc exec cluster-v3 -- apt install snapd lvm2 -y
lxc exec cluster-v3 -- snap install lxd

Let’s create the initial cluster node on cluster-v1, specifying the /dev/sdb device as the source of the new LVM storage pool:

lxc shell cluster-v1
lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=cluster-v1]: 
What IP address or DNS name should be used to reach this node? [default=10.109.89.59]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
Setup password authentication on the cluster? (yes/no) [default=yes]: 
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (btrfs, dir, lvm, zfs) [default=zfs]: lvm
Create a new LVM pool? (yes/no) [default=yes]: 
Would you like to use an existing empty disk or partition? (yes/no) [default=no]: yes
Path to the existing block device: /dev/sdb
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: 
Would you like to create a new Fan overlay network? (yes/no) [default=yes]: 
What subnet should be used as the Fan underlay? [default=auto]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 

Let’s check that the storage pool has been created:

lvs
  LV          VG    Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool local twi-a-tz-- <3.00g             0.00   1.57                
lxc storage show local
config:
  lvm.thinpool_name: LXDThinPool
description: ""
name: local
driver: lvm
used_by:
- /1.0/profiles/default
status: Created
locations:
- cluster-v1

OK, now we join cluster-v2 and cluster-v3 to cluster-v1, specifying the local device for the LVM pool as /dev/sdb:

lxc shell cluster-v2
lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=cluster-v2]: 
What IP address or DNS name should be used to reach this node? [default=10.109.89.234]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 10.109.89.59
Cluster fingerprint: 38d0d144013f895413372fcce550a1d4aa99a7d518274a91e1a9eae0b205216e
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "lvm.vg_name" property for storage pool "local": 
Choose "source" property for storage pool "local": /dev/sdb
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 

Again, let’s check each node’s LVM config:

lvs
  LV          VG    Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool local twi-a-tz-- <3.00g             0.00   1.57    
lxc storage show local --target=cluster-v2
config:
  lvm.thinpool_name: LXDThinPool
  lvm.vg_name: local
  source: local
  volatile.initial_source: /dev/sdb
description: ""
name: local
driver: lvm
used_by:
- /1.0/profiles/default
status: Created
locations:
- cluster-v1
- cluster-v2
- cluster-v3

And to make sure it’s working, let’s launch a container:

lxc shell cluster-v1
lxc launch images:alpine/3.12 c1
lvs
  LV                                                                      VG    Attr       LSize  Pool        Origin                                                                  Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool                                                             local twi-aotz-- <3.00g                                                                                     7.45   1.59                            
  containers_c1                                                           local Vwi-aotz-k <9.32g LXDThinPool images_641c9cb5e352408e2bfb3005f7f830dabe86e8d8b6abbad308fdcfb4cf8242f8 2.38                                   
  images_641c9cb5e352408e2bfb3005f7f830dabe86e8d8b6abbad308fdcfb4cf8242f8 local Vwi---tz-k <9.32g LXDThinPool                               

So it appears that the basic LVM cluster functionality is working OK. We now need to figure out what is different in your configuration.

Please can you show me the output of:

lxd sql global 'select * from storage_pools_config'

On my example cluster it shows:

root@cluster-v1:~# lxd sql global 'select * from storage_pools_config'
+----+-----------------+---------+-------------------------+-------------+
| id | storage_pool_id | node_id |           key           |    value    |
+----+-----------------+---------+-------------------------+-------------+
| 7  | 2               | 1       | volatile.initial_source | /dev/sdb    |
| 8  | 2               | 1       | lvm.vg_name             | local       |
| 9  | 2               | <nil>   | lvm.thinpool_name       | LXDThinPool |
| 10 | 2               | 1       | source                  | local       |
| 11 | 2               | 2       | source                  | local       |
| 12 | 2               | 2       | volatile.initial_source | /dev/sdb    |
| 13 | 2               | 2       | lvm.vg_name             | local       |
| 14 | 2               | 3       | lvm.vg_name             | local       |
| 15 | 2               | 3       | source                  | local       |
| 16 | 2               | 3       | volatile.initial_source | /dev/sdb    |
+----+-----------------+---------+-------------------------+-------------+

Also, please can you show the full answers you gave to lxd init on the 2nd node.

My first node is called ijssel

root@ijssel:~# lxd sql global 'select * from storage_pools_config'
+----+-----------------+---------+-------------+------------------------------------------+
| id | storage_pool_id | node_id |     key     |                  value                   |
+----+-----------------+---------+-------------+------------------------------------------+
| 3  | 1               | 1       | source      | /var/snap/lxd/common/lxd/disks/local.img |
| 4  | 1               | 1       | lvm.vg_name | local                                    |
+----+-----------------+---------+-------------+------------------------------------------+

Full answers of lxd init on the second node (called luts); the first node is called ijssel.

root@luts:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=luts]: 
What IP address or DNS name should be used to reach this node? [default=172.16.16.45]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: ijssel.ghs.nl
Cluster fingerprint: f3a7079038205003c4806208104f643ade069877304ac647019c3455320d92a6
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "lvm.vg_name" property for storage pool "local": 
Choose "source" property for storage pool "local": /dev/md1
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
config: {}
networks: []
storage_pools: []
profiles: []
cluster:
  server_name: luts
  enabled: true
  member_config:
  - entity: storage-pool
    name: local
    key: lvm.vg_name
    value: ""
    description: '"lvm.vg_name" property for storage pool "local"'
  - entity: storage-pool
    name: local
    key: source
    value: /dev/md1
    description: '"source" property for storage pool "local"'
  cluster_address: ijssel.ghs.nl:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    MIIC...
    -----END CERTIFICATE-----
  server_address: 172.16.16.45:8443
  cluster_password: ...

Error: Failed to join cluster: Failed request to add member: Mismatching config for storage pool local: different values for keys: lvm.thinpool_name

So it looks like your first node is missing the lvm.thinpool_name setting.

I’ve just tried creating a fresh cluster using loopback images, and it created the lvm.thinpool_name setting as expected:

cluster-v1

root@cluster-v1:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=cluster-v1]: 
What IP address or DNS name should be used to reach this node? [default=10.109.89.20]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
Setup password authentication on the cluster? (yes/no) [default=yes]: 
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (lvm, zfs, btrfs, dir) [default=zfs]: lvm
Create a new LVM pool? (yes/no) [default=yes]: 
Would you like to use an existing empty disk or partition? (yes/no) [default=no]: 
Size in GB of the new loop device (1GB minimum) [default=5GB]: 
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: 
Would you like to create a new Fan overlay network? (yes/no) [default=yes]: 
What subnet should be used as the Fan underlay? [default=auto]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@cluster-v1:~# lvs
  LV          VG    Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool local twi-a-tz-- 2.65g             0.00   1.57      

Then on cluster-v2:

root@cluster-v2:~# lxd init 
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=cluster-v2]: 
What IP address or DNS name should be used to reach this node? [default=10.109.89.225]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 10.109.89.20
Cluster fingerprint: 4f7cefc7b40d0d525d11cc6b05a30bcbb24ff3cd0564944fb270582fdaeffaae
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "lvm.vg_name" property for storage pool "local": 
Choose "size" property for storage pool "local": 
Choose "source" property for storage pool "local": 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@cluster-v2:~# lvs
  LV          VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool local twi-a-tz-- <11.97g             0.00   1.58                            

And the resulting DB entries:

lxd sql global 'select * from storage_pools_config'
+----+-----------------+---------+-------------------+------------------------------------------+
| id | storage_pool_id | node_id |        key        |                  value                   |
+----+-----------------+---------+-------------------+------------------------------------------+
| 3  | 1               | 1       | size              | 5GB                                      |
| 4  | 1               | <nil>   | lvm.thinpool_name | LXDThinPool                              |
| 5  | 1               | 1       | source            | /var/snap/lxd/common/lxd/disks/local.img |
| 6  | 1               | 1       | lvm.vg_name       | local                                    |
| 7  | 1               | 2       | size              | 15GB                                     |
| 8  | 1               | 2       | source            | /var/snap/lxd/common/lxd/disks/local.img |
| 9  | 1               | 2       | lvm.vg_name       | local                                    |
+----+-----------------+---------+-------------------+------------------------------------------+

This shows lvm.thinpool_name set as a non-node-specific key.

Please can you show the output of lvs on your first node.

For your information, and hopefully not to confuse things: when I was installing ijssel I already had a server which I intended to use as the first node of the cluster. However, that one is running Ubuntu 18.04 with an older LXD version. Joining ijssel to that cluster failed because of a schema version mismatch (cluster has 7). ijssel is running Ubuntu 20.04 with LXD from snap. I gave up and decided to start a new cluster. Maybe I messed up in the process, because I had to redo the installation a couple of times.

I’m also a bit confused as to why your 2nd node is using a dedicated block device (/dev/md1) while your first node is using a local loopback image.

It’s not wrong per se, but loopback images are really only suitable for development, and the fact that the 2nd node is using /dev/md1 suggests this isn’t a development setup. So I just want to flag that the first node is not using the same type of block device and won’t be as performant as the 2nd node.

lvs output on the first node

root@ijssel:~# lvs
  LV                                                                      VG    Attr       LSize   Pool        Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool                                                             local twi-aotz--  <3,41t                    7,89   4,74                            
  containers_artifac                                                      local Vwi---tz-k  <9,32g LXDThinPool                                               
  containers_gitlab--ci--01                                               local Vwi-aotz-k <51,23g LXDThinPool        35,69                                  
  containers_gitlab--ci--03                                               local Vwi-aotz-k <18,63g LXDThinPool        37,62                                  
  containers_jenkins--master1                                             local Vwi-aotz-k <18,63g LXDThinPool        20,11                                  
  containers_jenkins--slave001                                            local Vwi-aotz-k <18,63g LXDThinPool        12,32                                  
  images_4e15a9bde9a8d5d5e96b722c32b047f78aa0bd686a2755b2d428bd665c6a37de local Vwi---tz-k  <9,32g LXDThinPool                                               
  home                                                                    vg0   -wi-ao----  50,00g                                                           
  root                                                                    vg0   -wi-ao---- 100,00g                                                           

OK let me try and craft a DB query to fix this.
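A rough sketch of what such an insert might look like, based on the columns and values in the query outputs above (untested, so please take a database backup first). The storage_pool_id of 1 comes from your earlier query output, and the NULL node_id mirrors how the key is stored on my test cluster:

# add the missing non-node-specific thin pool key to the first node's pool config
lxd sql global "INSERT INTO storage_pools_config (storage_pool_id, node_id, key, value) VALUES (1, NULL, 'lvm.thinpool_name', 'LXDThinPool')"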

I saw that too. I am not sure how or why that happened. For sure it was not intentional.