Setting up an Incus LINSTOR pool with separate storage satellites: failed to notify peer ERR 404

Hello,

I am trying to set up a LINSTOR pool for my 3-node Incus cluster.

I followed the general information found on the LINSTOR Incus page, adapting the guide to my situation.

I have an Incus cluster composed of 3 nodes, each of which is also set up as a LINSTOR controller-satellite (COMBINED mode). In addition, I have configured three other LINSTOR satellites to provide the storage for an Incus shared pool.

Here is my LINSTOR node list:

root@s31:~# linstor node list
╭────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType  ┊ Addresses                  ┊ State  ┊
╞════════════════════════════════════════════════════════╡
┊ s31  ┊ COMBINED  ┊ 192.168.1.231:3366 (PLAIN) ┊ Online ┊
┊ s32  ┊ COMBINED  ┊ 192.168.1.232:3366 (PLAIN) ┊ Online ┊
┊ s33  ┊ COMBINED  ┊ 192.168.1.233:3366 (PLAIN) ┊ Online ┊
┊ s34  ┊ SATELLITE ┊ 192.168.1.234:3366 (PLAIN) ┊ Online ┊
┊ s35  ┊ SATELLITE ┊ 192.168.1.235:3366 (PLAIN) ┊ Online ┊
┊ s36  ┊ SATELLITE ┊ 192.168.1.236:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────╯

and their extended information:

root@s31:~$ linstor node info
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Diskless ┊ LVM ┊ LVMThin ┊ ZFS/Thin ┊ File/Thin ┊ SPDK ┊ Remote SPDK ┊ Storage Spaces ┊ Storage Spaces/Thin ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ s31  ┊ +        ┊ +   ┊ +       ┊ -        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ s32  ┊ +        ┊ +   ┊ +       ┊ -        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ s33  ┊ +        ┊ +   ┊ +       ┊ -        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ s34  ┊ +        ┊ +   ┊ +       ┊ +        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ s35  ┊ +        ┊ +   ┊ +       ┊ +        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ s36  ┊ +        ┊ +   ┊ +       ┊ +        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭───────────────────────────────────────────────────────────────────╮
┊ Node ┊ DRBD ┊ LUKS ┊ NVMe ┊ Cache ┊ BCache ┊ WriteCache ┊ Storage ┊
╞═══════════════════════════════════════════════════════════════════╡
┊ s31  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
┊ s32  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
┊ s33  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
┊ s34  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
┊ s35  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
┊ s36  ┊ +    ┊ +    ┊ +    ┊ +     ┊ +      ┊ +          ┊ +       ┊
╰───────────────────────────────────────────────────────────────────╯

Then I created the storage pools on each satellite node that will contribute storage to the cluster, using LINSTOR's physical-storage command to automate the pool creation:

root@s31:~$ linstor physical-storage create-device-pool --storage-pool incuspool --pool-name tank zfs s34 /dev/disk/by-vdev/p440ar_d1 /dev/disk/by-vdev/p440ar_d2
SUCCESS:
    (s34) ZPool 'tank' on device(s) [/dev/disk/by-vdev/p440ar_d1, /dev/disk/by-vdev/p440ar_d2] created.
SUCCESS:
    Successfully set property key(s): StorDriver/StorPoolName
SUCCESS:
Description:
    New storage pool 'incuspool' on node 's34' registered.
Details:
    Storage pool 'incuspool' on node 's34' UUID is: 4842fcd3-55f6-429d-9010-49e770421c3d
SUCCESS:
    (s34) Changes applied to storage pool 'incuspool' of node 's34'
SUCCESS:
    Storage pool updated on 's34'
root@s31:~$ linstor physical-storage create-device-pool --storage-pool incuspool --pool-name tank zfs s35 /dev/disk/by-vdev/p440ar_d1 /dev/disk/by-vdev/p440ar_d2
SUCCESS:
    (s35) ZPool 'tank' on device(s) [/dev/disk/by-vdev/p440ar_d1, /dev/disk/by-vdev/p440ar_d2] created.
SUCCESS:
    Successfully set property key(s): StorDriver/StorPoolName
SUCCESS:
Description:
    New storage pool 'incuspool' on node 's35' registered.
Details:
    Storage pool 'incuspool' on node 's35' UUID is: 99845419-daae-4af3-b234-7b9378db3bf6
SUCCESS:
    (s35) Changes applied to storage pool 'incuspool' of node 's35'
SUCCESS:
    Storage pool updated on 's35'
root@s31:~$ linstor physical-storage create-device-pool --storage-pool incuspool --pool-name tank zfs s36 /dev/disk/by-vdev/p440ar_d1 /dev/disk/by-vdev/p440ar_d2
SUCCESS:
    (s36) ZPool 'tank' on device(s) [/dev/disk/by-vdev/p440ar_d1, /dev/disk/by-vdev/p440ar_d2] created.
SUCCESS:
    Successfully set property key(s): StorDriver/StorPoolName
SUCCESS:
Description:
    New storage pool 'incuspool' on node 's36' registered.
Details:
    Storage pool 'incuspool' on node 's36' UUID is: 41119fc0-4b2b-4ea2-a287-99d4167010b2
SUCCESS:
    (s36) Changes applied to storage pool 'incuspool' of node 's36'
SUCCESS:
    Storage pool updated on 's36'
root@s31:~$ 

I verified that all storage pools are created and report the expected size:

root@s31:~$ linstor storage-pool list
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName               ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ s31  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s31;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s32  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s32;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s33  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s33;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s34  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s34;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s35  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s35;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s36  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s36;DfltDisklessStorPool ┊
┊ incuspool            ┊ s34  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s34;incuspool            ┊
┊ incuspool            ┊ s35  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s35;incuspool            ┊
┊ incuspool            ┊ s36  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s36;incuspool            ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Next, I configured the Incus nodes (s31, s32, s33) to communicate with their respective controller. Since each Incus node is itself a controller, I used localhost as the address for the connection between Incus and the controller:

root@s31:~$ incus config show --target s31
config:
  core.https_address: 0.0.0.0:8443
  storage.linstor.controller_connection: http://localhost:3370
root@s31:~$ incus config show --target s32
config:
  core.https_address: 0.0.0.0:8443
  storage.linstor.controller_connection: http://localhost:3370
root@s31:~$ incus config show --target s33
config:
  core.https_address: 0.0.0.0:8443
  storage.linstor.controller_connection: http://localhost:3370
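
For reference, the key was set on each member with a plain incus config set, along these lines (a sketch of what I ran; adjust the address if your controller is remote):

incus config set storage.linstor.controller_connection http://localhost:3370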

I then created the storage pool on Incus, specifying the linstor.resource_group.storage_pool option to ensure that LINSTOR uses the incuspool storage pool for the volumes:

binda@s31:~$ incus storage create sharedpool linstor --target s31
Storage pool sharedpool pending on member s31
binda@s31:~$ incus storage create sharedpool linstor --target s32
Storage pool sharedpool pending on member s32
binda@s31:~$ incus storage create sharedpool linstor --target s33
Storage pool sharedpool pending on member s33
binda@s31:~$ incus storage create sharedpool linstor linstor.resource_group.storage_pool=incuspool
Error: failed to notify peer 192.168.1.233:8443: 404 Not Found
binda@s31:~$ 

The final call failed, and the storage pool reports an Errored status:

root@s31:~$ incus storage show sharedpool
config:
  drbd.auto_add_quorum_tiebreaker: "true"
  drbd.on_no_quorum: suspend-io
  linstor.resource_group.name: sharedpool
  linstor.resource_group.place_count: "2"
  linstor.resource_group.storage_pool: incuspool
  linstor.volume.prefix: incus-volume-
  volatile.pool.pristine: "true"
description: ""
name: sharedpool
driver: linstor
used_by: []
status: Errored
locations:
- s31
- s33
- s32 

Incus has correctly created the resource group:

root@s31:~$ linstor resource-group list
╭──────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter              ┊ VlmNrs ┊ Description                     ┊
╞══════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2             ┊        ┊                                 ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ sharedpool    ┊ PlaceCount: 2             ┊        ┊ Resource group managed by Incus ┊
┊               ┊ StoragePool(s): incuspool ┊        ┊                                 ┊
╰──────────────────────────────────────────────────────────────────────────────────────╯
root@s31:~$ 

I was not able to find any flaw in my configuration, and sadly, this is as far as my knowledge goes.

I would like to know if any of you knows better …

Thanks,
Giovanni

Hi!

Yeah, that’s where your mistake is :slight_smile:
Only one controller can be active at a time. So you’re essentially configuring your 3-node cluster in a way that only one Incus node is able to talk to the actual controller. See The LINSTOR User Guide §1.3.1.

You need either a single controller, or to configure them in HA. The latter is quite a painful process, but I can guide you through it if you have trouble following the guide.
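
If you go the single-controller route, a minimal sketch is to make sure only one controller service stays up and the other two nodes keep running only satellites (unit names here assume the standard LINSTOR packages):

# on s32 and s33
systemctl disable --now linstor-controller
systemctl enable --now linstor-satellite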

Hi Benjamin,

thanks for your attention.

I see, so I have reconfigured my LINSTOR node list and, for good measure, restarted all the LINSTOR nodes. The LINSTOR node list now looks like:

root@s31:~$ linstor node list
╭──────────────────────────────────────────────────────╮
┊ Node ┊ NodeType  ┊ Addresses                ┊ State  ┊
╞══════════════════════════════════════════════════════╡
┊ s31  ┊ COMBINED  ┊ 10.99.1.231:3366 (PLAIN) ┊ Online ┊
┊ s34  ┊ SATELLITE ┊ 10.99.1.234:3366 (PLAIN) ┊ Online ┊
┊ s35  ┊ SATELLITE ┊ 10.99.1.235:3366 (PLAIN) ┊ Online ┊
┊ s36  ┊ SATELLITE ┊ 10.99.1.236:3366 (PLAIN) ┊ Online ┊
╰──────────────────────────────────────────────────────╯
root@s31:~$ 

Re-created the storage pools on each satellite node:

root@s31:~$ linstor sp list
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName               ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ s31  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s31;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s34  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s34;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s35  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s35;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ s36  ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ s36;DfltDisklessStorPool ┊
┊ incuspool            ┊ s34  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s34;incuspool            ┊
┊ incuspool            ┊ s35  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s35;incuspool            ┊
┊ incuspool            ┊ s36  ┊ ZFS      ┊ tank     ┊     7.12 TiB ┊      7.25 TiB ┊ True         ┊ Ok    ┊ s36;incuspool            ┊
root@s31:~$ 

I have reconfigured (and restarted) the Incus nodes:

root@s31:~$ incus config show --target s31
config:
  cluster.https_address: 10.99.1.231:8443
  storage.linstor.controller_connection: http://10.99.1.231:3370
root@s31:~$ incus config show --target s32
config:
  cluster.https_address: 10.99.1.232:8443
  storage.linstor.controller_connection: http://10.99.1.231:3370
root@s31:~$ incus config show --target s33
config:
  cluster.https_address: 10.99.1.233:8443
  storage.linstor.controller_connection: http://10.99.1.231:3370
root@s31:~$

Sadly, recreating the Incus storage pool fails the same way, so it looks like this is not a LINSTOR issue but definitely an Incus one.

root@s31:~$ incus storage create sharedpool linstor --target s31
Storage pool sharedpool pending on member s31
root@s31:~$ incus storage create sharedpool linstor --target s32
Storage pool sharedpool pending on member s32
root@s31:~$ incus storage create sharedpool linstor --target s33
Storage pool sharedpool pending on member s33
root@s31:~$ incus storage create sharedpool linstor linstor.resource_group.storage_pool=incuspool
Error: failed to notify peer 10.99.1.233:8443: 404 Not Found
root@s31:~$ incus storage show sharedpool
config:
  drbd.auto_add_quorum_tiebreaker: "true"
  drbd.on_no_quorum: suspend-io
  linstor.resource_group.name: sharedpool
  linstor.resource_group.place_count: "2"
  linstor.resource_group.storage_pool: incuspool
  linstor.volume.prefix: incus-volume-
  volatile.pool.pristine: "true"
description: ""
name: sharedpool
driver: linstor
used_by: []
status: Errored
locations:
- s31
- s33
- s32
root@s31:~$ linstor resource-group list
╭──────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter              ┊ VlmNrs ┊ Description                     ┊
╞══════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2             ┊        ┊                                 ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ sharedpool    ┊ PlaceCount: 2             ┊        ┊ Resource group managed by Incus ┊
┊               ┊ StoragePool(s): incuspool ┊        ┊                                 ┊
╰──────────────────────────────────────────────────────────────────────────────────────╯
root@s31:~$ 

I guess that the problem is inherent to my setup, in particular:

  1. the satellites handling the storage not being part of the Incus cluster
  2. and/or the controller/satellite running on a different machine than the storage satellites

My next step is to confirm this hypothesis by recreating the exact setup described in the documentation.

Thanks anyway,

Giovanni

You need one local satellite per Incus node. The satellites can be diskless, that's fine, but your s32 and s33 nodes need to have a satellite. Satellites are what actually create the virtual DRBD devices, which the LINSTOR driver then uses to mount your disks onto your nodes.
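
As a sketch (addresses taken from your listing, exact flags may differ), that means running the linstor-satellite service on s32 and s33 and registering them with the controller, without giving them any storage pool:

# on s32 and s33
systemctl enable --now linstor-satellite

# on the controller (s31)
linstor node create s32 10.99.1.232 --node-type satellite
linstor node create s33 10.99.1.233 --node-type satellite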

Right, I forgot about that!

So I added back the 2 nodes:

root@s31:~$ linstor node list
╭──────────────────────────────────────────────────────╮
┊ Node ┊ NodeType  ┊ Addresses                ┊ State  ┊
╞══════════════════════════════════════════════════════╡
┊ s31  ┊ COMBINED  ┊ 10.99.1.231:3366 (PLAIN) ┊ Online ┊
┊ s32  ┊ SATELLITE ┊ 10.99.1.232:3366 (PLAIN) ┊ Online ┊
┊ s33  ┊ SATELLITE ┊ 10.99.1.233:3366 (PLAIN) ┊ Online ┊
┊ s34  ┊ SATELLITE ┊ 10.99.1.234:3366 (PLAIN) ┊ Online ┊
┊ s35  ┊ SATELLITE ┊ 10.99.1.235:3366 (PLAIN) ┊ Online ┊
┊ s36  ┊ SATELLITE ┊ 10.99.1.236:3366 (PLAIN) ┊ Online ┊
╰──────────────────────────────────────────────────────╯

Finally, I was able to use the LINSTOR-backed Incus pool:

root@s31:~$ incus storage show sharedpool 
config:
  drbd.auto_add_quorum_tiebreaker: "true"
  drbd.on_no_quorum: suspend-io
  linstor.resource_group.name: sharedpool
  linstor.resource_group.place_count: "2"
  linstor.resource_group.storage_pool: incuspool
  linstor.volume.prefix: incus-volume-
  volatile.pool.pristine: "true"
description: ""
name: sharedpool
driver: linstor
used_by:
- /1.0/images/68984567edd6af0ce4e08f4d22ec5fe37f4673bf860be3387c7a1d139e5767e4
- /1.0/instances/c1
- /1.0/storage-pools/sharedpool/volumes/custom/fsvol
status: Created
locations:
- s31
- s33
- s32
root@s31:~$ 
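
For reference, the instance and custom volume listed under used_by were created with ordinary commands along these lines (the image alias is just an example; the image gets cached on the pool as part of the launch):

incus storage volume create sharedpool fsvol
incus launch images:debian/12 c1 --storage sharedpool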

Concerning the HA setup, I would prefer to use an external database such as PostgreSQL or etcd to keep our LINSTOR database distributed and highly available.

Do you have any experience with etcd? What alternative would you advise?

Thanks again for your help, much appreciated!

Giovanni

For HA, I just followed the official guide, which stores the LINSTOR database on LINSTOR itself. I can't help you with your specific request (and I don't really know if LINSTOR supports that, TBH).
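
Roughly, that guide has you put the controller's /var/lib/linstor on a dedicated DRBD resource and let drbd-reactor's promoter plugin start the controller on whichever node holds that resource, with a snippet along these lines (a sketch from memory, see the guide for the exact steps):

# /etc/drbd-reactor.d/linstor_db.toml
[[promoter]]
[promoter.resources.linstor_db]
start = ["var-lib-linstor.mount", "linstor-controller.service"]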

Hello, @giobim!

I’ve had some experience with LINSTOR HA setups. We basically had an active/standby setup for the linstor-controller service using Pacemaker, a VIP and an external PostgreSQL database.

I’ve tried etcd as the external database and didn’t have a good result. The control plane performance was very poor (for reference, the cluster had a few dozen nodes and a few hundred volumes). Besides that, etcd support will be removed in LINSTOR 1.34.0. Given that, I’d recommend using either MySQL or PostgreSQL as the database. I’ve used Postgres and found no issues in my setup.
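
For what it's worth, pointing the controller at PostgreSQL only takes a [db] section in /etc/linstor/linstor.toml, roughly like this (host and credentials are placeholders):

[db]
  user = "linstor"
  password = "changeme"
  connection_url = "jdbc:postgresql://db.example.com/linstor"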

Hi Luis,

I had seen that etcd was going to be removed from LINSTOR; knowing that saved me some time, so I will jump right to the PostgreSQL route. It would also have been the preferred database in production anyway.

Nice to hear from someone with LINSTOR and etcd experience.

Thanks, Luis.