Incus cluster storage options using DRBD

In my homelab I am attempting to set up a 2-node cluster (I am aware this is not a supported configuration; it is just a first step until I get more hardware). I created a DRBD resource across both servers, and drbdadm status r0 shows it is fully replicated. The intention was to use an LVM thin pool over DRBD. I have LVM configured with the thin pool enabled and the dm_thin_pool kernel module loaded.
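For reference, the DRBD resource definition is along these lines (the peer hostname and backing partitions are placeholders; only curly and the two addresses match the init transcripts below):

```
# /etc/drbd.d/r0.res -- illustrative sketch; the peer hostname ("larry")
# and the backing disks are placeholders, not my exact config
resource r0 {
  device    /dev/drbd0;
  disk      /dev/sdb1;
  meta-disk internal;

  on curly {
    address 192.168.2.11:7789;
  }
  on larry {
    address 192.168.2.12:7789;
  }
}
```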

On promoting the bootstrap node to be the cluster leader, I chose the option to configure a new local storage pool. When I attempt to join the secondary node to the cluster, it fails with the following:

Error: Failed to join cluster: Failed to initialize member: Failed to initialize storage pools and networks: Failed to create storage pool "local": Failed to run: pvcreate /dev/loop0: exit status 5 (Cannot use /dev/loop0: device is rejected by filter config)

The mention of the loop device has me thinking I’m not understanding something - I expected that the secondary node would use the same /dev/drbd0 resource as the primary on joining the cluster.

I discovered the LVM filter was the issue there, so I added an accept rule for /dev/loop.* devices. After that, the error is:

Error: Failed to join cluster: Failed to initialize member: Failed to initialize storage pools and networks: Failed to create storage pool "local": Failed to run: pvcreate /dev/loop0: exit status 5 (Error writing device /dev/loop0 at 0 length 2048.
/dev/loop0 not wiped: aborting.)
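For reference, the accept rule I added to lvm.conf was along these lines (the exact regexes are illustrative; the trailing reject keeps everything else filtered out):

```
# /etc/lvm/lvm.conf, devices section -- illustrative filter:
# accept loop and DRBD devices, reject everything else
devices {
  filter = [ "a|^/dev/loop.*|", "a|^/dev/drbd.*|", "r|.*|" ]
}
```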

A couple of questions:

  1. I have seen mention of lvmlockd and sanlock possibly being needed for shared LVM PVs. Unfortunately, these are not currently available on my OS (NixOS). Would their absence explain the error output I’m seeing?
  2. Is there any reasonable alternative to LVM for enabling replicated storage over a DRBD device for an Incus cluster? Linstor is not yet available on NixOS either, and I understand ZFS would perform very poorly here, and likewise Ceph.

In case it’s relevant, the below output is from promoting the cluster primary node.

# incus admin init
Would you like to use clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=192.168.2.11]:
Are you joining an existing cluster? (yes/no) [default=no]:
What member name should be used to identify this server in the cluster? [default=curly]:
Do you want to configure a new local storage pool? (yes/no) [default=yes]: yes
Name of the storage backend to use (btrfs, dir, lvm, zfs) [default=zfs]: lvm
Create a new LVM pool? (yes/no) [default=yes]: yes
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]: yes
Path to the existing block device: /dev/drbd0
Do you want to configure a new remote storage pool? (yes/no) [default=no]: no
Would you like to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: br0
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]:
Would you like a YAML "init" preseed to be printed? (yes/no) [default=no]: yes
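The printed preseed is omitted here; reconstructed by hand from those answers it would look roughly like this (not the verbatim output, and the profile section in particular is my assumption):

```yaml
# Approximate preseed for the answers above -- hand-reconstructed,
# not the verbatim output of incus admin init
config:
  core.https_address: 192.168.2.11:8443
cluster:
  server_name: curly
  enabled: true
storage_pools:
- name: local
  driver: lvm
  config:
    source: /dev/drbd0
profiles:
- name: default
  devices:
    root:
      path: /
      pool: local
      type: disk
    eth0:
      name: eth0
      nictype: bridged
      parent: br0
      type: nic
```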

And this is from attempting to promote the secondary:

# incus admin init
Would you like to use clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=192.168.2.12]:
Are you joining an existing cluster? (yes/no) [default=no]: yes
Please provide join token: eyJzZX<snip>
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "lvm.thinpool_name" property for storage pool "local":
Choose "lvm.vg_name" property for storage pool "local":
Choose "source" property for storage pool "local":

Is it correct that the "local" storage pool is unique per node and shouldn't be replicated? I had taken "local" to mean local to the cluster, but if it is intended to hold node-specific state then it makes sense that it should not live on a replicated block device.

Secondly, is a raw DRBD device (or for that matter, any type of replicated block device operating below/transparently to Incus) even a supported option?

Local storage pools are local to each server.
Only Ceph, Linstor, TrueNAS and LVM cluster are treated as remote and available cluster-wide.

DRBD isn’t something that’s supported and using it as backing for local pools will most likely lead to corruption.

Linstor is a management layer for DRBD, and it is fully supported by Incus.

It does DRBD-over-LVM rather than LVM-over-DRBD. That is: for each resource you create, it creates a new logical volume on each of the target nodes, and configures a new DRBD device to replicate between them. That way, each individual resource can move its primary and secondary around, without affecting any other resource.

Note: you do need the DRBD9 kernel module, but there are free repositories for Ubuntu and Proxmox kernels. DRBD9 also gives you the advantage of allowing more than 2 replicas (up to 32).
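To illustrate the DRBD-over-LVM model, a typical LINSTOR workflow looks roughly like this (node names, the volume group/thin pool, and the size are placeholders):

```
# Register an LVM-thin backed storage pool on each node
# (node names and vg0/thinpool are placeholders)
linstor storage-pool create lvmthin node-a incus-pool vg0/thinpool
linstor storage-pool create lvmthin node-b incus-pool vg0/thinpool

# Define a resource; LINSTOR carves a new LV on each selected node
# and configures a dedicated DRBD device replicating between them
linstor resource-definition create demo
linstor volume-definition create demo 10G
linstor resource create demo --auto-place 2
```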

Except here, there’s quite an explicit requirement on having that run on NixOS.
And DRBD9 and LINSTOR are pretty hard-to-package beasts 🙂

Yes, I am fairly new to Nix packaging, but I started looking into packaging Linstor. Someone has already raised a PR for the Python client, but that is all so far. Given the added complexity of linstor-server having a Gradle dependency on linstor-api-java, I decided to explore other avenues.

NixOS's LVM support is also very bare-bones: it can only generate a very minimal lvm.conf, with no built-in support for lvmlockd or even LVM filters. There is also no package for sanlock, although that looks like it would be easier to address.
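As a stopgap, a hand-written lvm.conf can be shipped with environment.etc, bypassing the module's generated one (illustrative; it may need lib.mkForce if another module also defines that file):

```nix
# configuration.nix -- illustrative workaround: ship a hand-written
# lvm.conf rather than relying on the generated minimal one
environment.etc."lvm/lvm.conf".text = ''
  devices {
    filter = [ "a|^/dev/loop.*|", "a|^/dev/drbd.*|", "r|.*|" ]
  }
'';
```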