Shared Storage from NAS - Small Cluster

chromag · February 6, 2024, 2:33am

I currently have a small setup with XCP-NG. I have 3 nodes that have fairly small boot drives. Two of the nodes are small micro form factor machines.

I have a TrueNAS box that has an iSCSI volume that is available to the XCP-NG cluster nodes as a shared storage location. Most of my VMs use the shared storage location for their VM images. This allows me to migrate them as needed between nodes.

Is there anything like this that would work for Incus?

Sorry for the dumb question! I shifted VMs around to two nodes and installed Incus on the third box and I’m really interested.

simos · February 6, 2024, 9:14am

Welcome to Incus!

Here is the documentation for the storage pools in Incus.

Specifically,

The simplest form should be to use the dir storage pool and point it to the shared storage location.
The next option would be to use btrfs, zfs, or lvm on a loopback file on the shared storage location.
The third option is to provide Incus with a block device (a partition) on the shared storage (again, one of btrfs or zfs or lvm) so that you can take advantage of their full features.

Use the following commands to manage storage pools. You can move containers and VMs between storage pools.

incus storage list
incus storage volume list mypoolname
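
As a rough illustration of moving an instance between pools, the sketch below assumes an instance named web1 and a target pool named nas-pool (both names are hypothetical). It only prints the commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: move an instance's storage to another pool.
# "web1" and "nas-pool" are hypothetical names used for illustration.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

move_to_pool() {
    instance="$1"
    pool="$2"
    run incus stop "$instance"                    # stop the instance before moving it
    run incus move "$instance" --storage "$pool"  # relocate its storage to the target pool
    run incus start "$instance"
}

move_to_pool web1 nas-pool
```

Afterwards, incus storage volume list nas-pool should list the instance's volume in the new pool.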

chromag · February 6, 2024, 2:52pm

Thanks for the quick reply! I’ll try this out in a small virtual env to see if I can get it working. This was very helpful and appreciated. I’ll update once I test it out.

Thanks again!

chromag · February 6, 2024, 6:59pm

The simplest form should be to use the dir storage pool and point it to the shared storage location.

I tested this out, and I’m not sure it will work with shared storage, unless I’m misunderstanding or doing something wrong. If I mount a shared storage location for a dir storage pool and create the pool on each member of the cluster, it throws an error after the pool is created on the first host node, because files already exist at that location (the first node creates them, so the second node fails).

The next option would be to use btrfs, zfs, or lvm on a loopback file on the shared storage location.

Haven’t tried this one yet!

The third option is to provide Incus with a block device (a partition) on the shared storage (again, one of btrfs or zfs or lvm) so that you can take advantage of their full features.

Would this work across multiple nodes in a cluster? I tried this and I think I’m running into a similar issue as with the dir option. I created an iSCSI volume, ran the discovery process, and added it to each test node. They both show the /dev/sda1 partition. When trying to create a pool across nodes I get the following error:

ERROR: /dev/sda1 appears to contain an existing filesystem (btrfs)

I’m guessing the first node formats and initializes /dev/sda1, so the second fails?

Would I run into problems trying to access the same iSCSI volume from multiple machines?

XCP-NG can use iSCSI volumes across a cluster (any host instance can create VMs that utilize the same iSCSI vol) but I have no idea how it actually implements that.

simos · February 6, 2024, 7:26pm

I am not sure how XCP-NG works and how the 3-node cluster is set up. Consider the following as directions to look into.

If all nodes have access to the same block device (i.e. /dev/sda1), then you cannot use that block device for all three Incus installations, because a regular filesystem such as btrfs or zfs cannot safely be formatted and mounted by several hosts at once.

If you can assign different block devices to each node, then you can create an Incus cluster.

When you run incus admin init and you want to use a block device with btrfs, zfs, etc., the block device should be unformatted so that Incus does not complain that there is already a filesystem there. Incus wants to see an empty block device.

If you were to use the dir option as a last resort, you would mount /dev/sda1 to /mnt/BIGDISK, then create folders /mnt/BIGDISK/NODE1, /mnt/BIGDISK/NODE2 and /mnt/BIGDISK/NODE3, one for each node.
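
A minimal sketch of that per-node directory layout, for illustration only. The pool name shared-dir and the member names node1 to node3 are hypothetical, and it assumes /mnt/BIGDISK is storage that every node can safely mount at the same time (for example a network file share); whether the dir driver behaves well on such a mount is not confirmed here. It prints the commands by default; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch: one sub-directory per cluster member, used as the source of a "dir" pool.
# "shared-dir" and node1..node3 are hypothetical names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_dir_pool() {
    base="$1"
    pool="$2"
    shift 2
    i=1
    for member in "$@"; do
        node_dir="$base/NODE$i"
        run mkdir -p "$node_dir"   # separate, empty directory for this member
        # Per-member step: record the member-specific source path
        run incus storage create "$pool" dir source="$node_dir" --target "$member"
        i=$((i + 1))
    done
    # Final step without --target creates the pool across the cluster
    run incus storage create "$pool" dir
}

create_dir_pool /mnt/BIGDISK shared-dir node1 node2 node3
```

Because each member gets its own directory, no member finds files left behind by another one.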

chromag · February 6, 2024, 7:43pm

Gotcha! I’m testing in a small virtual env to see how this will all work together. I have two Incus nodes in cluster mode and a small TrueNAS VM acting as a test NAS.

So I should create an iSCSI volume per Incus host node so they all get their own block device. They should all be available via /dev/sda and I can create an empty partition. At that point I can create the storage pool and it should work just fine.

How would this work in a failure situation? Let’s say in my prod setup I have 3 Incus nodes and they all have their own iSCSI volumes mounted from the NAS. I’ve created a mix of system containers and VMs in the cluster.

Let’s say I have instance1, instance2 and instance3 and they’re spread across host1, host2 and host3. I have a hardware failure and host3 goes down which means instance3 is not available.

What would the process be of migrating instance3 over to one of the remaining hosts host1 or host2?

Since host3 is now down, would its block device be available to the cluster so the instance can be migrated to a running host?

Thanks for following up! Appreciate the help.

EDIT: It seems the excellent documentation answers part of my question?

By default, Incus replicates images on as many cluster members as there are database members. This typically means up to three copies within the cluster.

chromag · February 6, 2024, 10:11pm

It looks like I got this working, sort of.

I created 3 volumes (iSCSI) on the NAS for each Incus host in the cluster (each mounted as /dev/sda)
I created an image-pool storage (with btrfs) on all three hosts
I created 3 test instances (instance1, instance2, and instance3), one on each host (so instance1 is on host1, etc.). I included the --storage image-pool option to ensure the instances were created on /dev/sda1
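
The steps above correspond roughly to the following commands. This is a sketch under assumptions: the image name images:debian/12 is a placeholder, not necessarily what was used, and the pool is created with the usual per-member --target step followed by a final cluster-wide create. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create a storage pool on every cluster member, then launch one instance per host.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_cluster_pool() {
    pool="$1"
    driver="$2"
    src="$3"
    shift 3
    for member in "$@"; do
        # Per-member step: each host uses its own block device as the source
        run incus storage create "$pool" "$driver" source="$src" --target "$member"
    done
    # Final step without --target instantiates the pool on all members
    run incus storage create "$pool" "$driver"
}

create_cluster_pool image-pool btrfs /dev/sda1 host1 host2 host3

# One test instance per host; the image name is an assumption.
for n in 1 2 3; do
    run incus launch images:debian/12 "instance$n" --storage image-pool --target "host$n"
done
```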

They seemed to work fine. I decided to test out a failure so I forced an immediate power off of host3 and as expected instance3 showed as a state of ERROR and in the cluster list host3 showed the state of OFFLINE.

Reading this section of the documentation it looks like I should evacuate the failed host:

To do so, use the incus cluster evacuate command. This command migrates all instances on the given server, moving them to other cluster members. The evacuated cluster member is then transitioned to an “evacuated” state, which prevents the creation of any instances on it.
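
For planned maintenance of a member that is still reachable, the flow described in that excerpt would look roughly like the sketch below. The member name host3 comes from this test setup; the wrapper functions are only illustrative. Commands are printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: evacuate a reachable cluster member before maintenance, restore it afterwards.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_maintenance() {
    member="$1"
    run incus cluster list                # check that the member is currently ONLINE
    run incus cluster evacuate "$member"  # move its instances to other members
}

end_maintenance() {
    member="$1"
    run incus cluster restore "$member"   # bring the member and its instances back
}

start_maintenance host3
# ... perform the maintenance on host3 here ...
end_maintenance host3
```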

Unfortunately, running incus cluster evacuate against the offline host3 results in the following error:

Error: Failed to update cluster member state: Missing event connection with target cluster member

I’m so close to getting this working! Any ideas? Is an evacuation something that’s supposed to be done while the host is online when there is a scheduled maintenance?

If so, how should I recover from an unscheduled failure of a host? The docs seem to say that Incus replicates images, so I assumed I could bring the instances online on another host.

EDIT: Well apparently replication is only for ceph. Found here in the documentation.

For most storage drivers (all except for Ceph-based storage drivers), storage volumes are not replicated across the cluster and exist only on the member for which they were created.

So is there no way to recover from a failure when using shared storage?

Thanks again for the assistance!

victoitor · October 10, 2024, 7:54pm

You can probably use iSCSI to attach the same volume as a block device on each node and then use the clustered LVM solution from Incus to share that storage between them.

The documentation for clustered LVM might be outdated, as a simpler solution was just released in Incus 6.6.
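
A very rough sketch of what that could look like, to be checked against the current documentation before use. It assumes the same iSCSI LUN appears as /dev/sda on every node, that lvmlockd and sanlock are already configured and running on each node, and that the lvmcluster driver takes the name of an existing shared volume group as its source. The names vg0 and shared-pool are hypothetical. Commands are printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: shared volume group on a common iSCSI device, used by an lvmcluster pool.
# "vg0" and "shared-pool" are hypothetical; lvmlockd/sanlock setup is assumed done.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_lvmcluster_pool() {
    vg="$1"
    device="$2"
    pool="$3"
    shift 3
    # Run on ONE node only: create the shared volume group on the common device
    run vgcreate --shared "$vg" "$device"
    for member in "$@"; do
        # Per-member step: point each member at the same shared volume group
        run incus storage create "$pool" lvmcluster source="$vg" --target "$member"
    done
    # Final step without --target creates the pool across the cluster
    run incus storage create "$pool" lvmcluster
}

setup_lvmcluster_pool vg0 /dev/sda shared-pool host1 host2 host3
```

Unlike the per-node btrfs pools earlier in the thread, every member here points at the same volume group, which is what makes the storage shared.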