IncusOS - how to add a new drive to a storage pool?

I have completed the installation of IncusOS on dedicated hardware. Following the installation, I added a drive which I intend to use as a storage pool for Incus. How can I initialize the drive with ZFS, since there is no shell access? Is there an incus client command that I can use to initialize the drive for storage? Is there a way to list the physical drives on the machine?

There is an API for that at /os/1.0/system/storage. We still need to build a CLI around all of the OS APIs so it’s quite the manual process for now.

You can do incus query /os/1.0/system/storage to see the data and use incus query /os/1.0/system/storage -X PUT -d "$(cat new-value.json)" to push back a configuration.
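
The manual round trip would then look something like this (a sketch only, using the IncusOS: remote name that appears later in this thread):

$ incus query IncusOS:/os/1.0/system/storage > storage.json
# edit storage.json: fill in the "config" section with the desired pool layout
$ incus query IncusOS:/os/1.0/system/storage -X PUT -d "$(cat storage.json)"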

Okay, took me a minute - here are the current results:

First, here are the results of running the query:

"pools": [
	{
		"devices": [
			"/dev/nvme0n1p11"
		],
		"name": "local",
		"state": "ONLINE",
		"type": "zfs-raid0"
	}
]

I added the block device for the other drive to the existing pool. Here is the updated JSON:

"pools": [
	{
		"devices": [
			"/dev/nvme0n1p11",
			"/dev/sda"
		],
		"name": "local",
		"state": "ONLINE",
		"type": "zfs-raid0"
	}
]

A couple of questions:

  • I didn’t do anything to format the drive. I’m reusing this drive from the server’s previous configuration, so there may be existing partitions. Does Incus apply partitions or wipe drives?

  • I didn’t specify any partition, just the whole drive. Is this correct?

I get the following error when attempting to add the configuration to the server.

$ incus query IncusOS:/os/1.0/system/storage -X PUT -d "$(cat incus-storage.json)"
Error: no ZFS pool configuration provided

Am I missing something obvious here?

Okay, looking at the code - it appears that the error is returned if an empty body is provided in a PUT.

Oh - it appears here as well:

This is the more likely code path - some problem in parsing the input.

At the moment IncusOS assumes that any drives being added to a storage pool are unformatted. ZFS can be given an option to force-add a drive with existing partitions, but we don’t currently use that to prevent unintentional data loss if the wrong drive is specified. You can easily wipe a drive with blkdiscard prior to adding it to the IncusOS machine. (It may be worthwhile for us to add a dedicated API that will wipe a drive that’s not part of a storage pool; I’ll make an issue for that on GitHub.)
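
For example, with the drive attached to another machine or booted into a live environment (destructive, and /dev/sda is just a placeholder for the target drive):

$ blkdiscard /dev/sda   # discard everything on an SSD/NVMe drive
$ wipefs -a /dev/sda    # or just clear filesystem/partition/RAID signatures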

ZFS does like to use the whole device, so yes, it’s correct to use the whole drive.

You need to send back the relevant pool configuration in the config part of the API struct:

{
	"config": {
		"pools": [
			{
				"devices": [
					"/dev/nvme0n1p11",
					"/dev/sda"
				],
				"name": "local",
			}
		]
	}
}
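
Saved as, say, incus-storage.json, this can then be pushed back with the same PUT command as before:

$ incus query IncusOS:/os/1.0/system/storage -X PUT -d "$(cat incus-storage.json)"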

However, the “local” ZFS pool is special and you can’t update or delete it. This is because it’s a bit different from all other storage pools, as it utilizes a partition that fills the rest of the disk space on the install device. It’s really scratch space, and the assumption is that additional local storage pools will utilize dedicated drives, likely using raidz or mirroring for data safety.

Very helpful, thanks! I will take your suggestions to get this working and will update the thread.

Okay - I had to mess around a little bit to get the drive working, and then my BIOS “helpfully” refused to boot, telling me I needed to turn VMD back on (which I ignored).

Finally, booted back up with the SSD present and enabled but empty. I then created a new JSON file with the following contents:

{
	"config": {
		"pools": [
			{
				"devices": [
					"/dev/sda"
				],
				"name": "incus-pool",
				"type": "zfs-raid0"
			}
		]
	}
}

When I executed $ incus query IncusOS:/os/1.0/system/storage -X PUT -d "$(cat incus-storage.json)", nothing was returned, but when I re-ran the storage query, I saw the pool.

$ incus query IncusOS:/os/1.0/system/storage
{
	"config": {},
	"state": {
		"drives": [
			{
				"boot": false,
				"bus": "sat",
				"capacity_in_bytes": 2000398934016,
				"id": "/dev/sda",
				"member_pool": "incus-pool",
				"model_family": "Samsung based SSDs",
				"model_name": "Samsung SSD 870 QVO 2TB",
				"remote": false,
				"removable": false,
				"serial_number": "S5VWNJ0R925978W",
				"smart": {
					"enabled": true,
					"passed": true
				},
				"wwn": "0x5002538f3191b08c"
			},
			{
				"boot": true,
				"bus": "nvme",
				"capacity_in_bytes": 240057409536,
				"id": "/dev/nvme0n1",
				"model_family": "",
				"model_name": "Force MP510",
				"remote": false,
				"removable": false,
				"serial_number": "214382480001291730B0",
				"smart": {
					"enabled": true,
					"passed": true
				}
			}
		],
		"pools": [
			{
				"devices": [
					"/dev/nvme0n1p11"
				],
				"name": "local",
				"state": "ONLINE",
				"type": "zfs-raid0"
			},
			{
				"devices": [
					"/dev/sda"
				],
				"name": "incus-pool",
				"state": "ONLINE",
				"type": "zfs-raid0"
			}
		]
	}
}

Thanks for the very helpful pointers!

One thing that would have been very helpful for me, since the machine is ultimately going in my basement, would have been a remote reboot command. I don’t know if there is a plan to implement API wrappers around other system commands, but an /os/1.0/system/command/reboot endpoint or something similar would be great! This may already be on the roadmap.

incus query IncusOS:/os/1.0/system -X PUT -d '{"action": "reboot"}'

:slight_smile:

PUT /os/1.0/system is a bit of a hack I’ve put in place for my own use so far so expect it to change a bit down the line for something cleaner :wink:

But yeah, you can use that endpoint with:

  • reboot
  • update
  • shutdown
  • poweroff (same as shutdown)
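
Presumably the other actions follow the same pattern as the reboot example above, e.g.:

incus query IncusOS:/os/1.0/system -X PUT -d '{"action": "poweroff"}'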

Any updates to allow forcing the creation of a drive in a local pool?

I get that we can’t use “local”, but it’s a single drive that had a zpool on it.

Background: I’m using some Lenovo M700 mini computers with a SATA M.2 for the OS and a SATA 3.5” drive for the local data. I was using them with Debian 13 and thought I would give IncusOS a try; so far it’s all friction.

I wiped most of the data drives with wipefs prior to the install but maybe missed one, so when adding cluster node3 it said “exit status 1 (cannot create ‘local’: pool already exists)”.
I tried incus admin os system storage wipe-drive -d '{"id":"/dev/disk/by-id/wwn-..."}' but got impatient (slow spinning drive) and canceled (read: power cycled the node).

Now I get

... exit status 1 (invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/wwn-... contains a corrupt primary EFI label.)

which is super fun :wink:

This is just a home lab for testing, but it feels like my options are:

  • boot into some other OS with wipefs or blkdiscard and clear out the disk
  • let the wipe-drive operation run for a billion hours (500GB @ 5400rpm)
  • submit a pull request to allow adding the -f flag to the zpool command
  • submit a pull request for some other half-baked disk-killing feature you guys don’t want :wink:

When exactly did you get the error about the “local” pool existing? (During install, first boot, after first boot but prior to adding the node to your cluster, etc?) The installer does perform a complete wipe of the install target, but creation of the “local” zpool doesn’t occur until first boot. If another drive(s) happened to already contain a zpool named “local” we could get that error. (I’m not sure right off hand how we might properly differentiate between an IncusOS-created “local” pool and a pre-existing one without setting some sort of custom attribute, and I don’t know if we want to do that.)

IncusOS uses the same code as Incus when wiping a device. We attempt efficient methods first (blkdiscard), but for spinning disks we may have no choice but to manually zero out the entire storage device. When instructed to wipe a device, we choose security over speed.

So, to get back to creating the pool you want, you could pull the drive and run a sgdisk -Z /dev/sdX on another machine to wipe the GPT tables which should be “close enough” for IncusOS to be happy, or re-trigger the wipe via the IncusOS API and wait for it to complete.
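
For example (both destructive; /dev/sdX is a placeholder, and the wipe-drive call is just the one used earlier in this thread):

$ sgdisk -Z /dev/sdX   # on another machine: zap the GPT/MBR structures
$ incus admin os system storage wipe-drive -d '{"id":"/dev/disk/by-id/wwn-..."}'   # or re-trigger the full wipe via IncusOS and let it finish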

We definitely don’t want to be throwing around the -f flag to zfs commands, as that makes it far too easy to unexpectedly test your backup strategies. :wink:

Perfectly said and sorry for light details.

I got those messages when attempting to add the secondary/tertiary nodes to the cluster.

Currently I have moved forward with wiping it all and starting from scratch:

  • Debian 13 Live USB to wipefs both the OS and 2nd drives
  • in BIOS, disabled the 2nd drives
  • installed with default config on node1
  • installed without default config on nodes 2/3
  • re-enabled the 2nd drives
  • went to my client (macOS)

Using the instructions from Creating an Incus cluster - IncusOS documentation, I was able to get them all added as remotes and enable clustering on node1. Then I made the grave error:

When the installer asks Choose "zfs.pool_name" property for storage pool "local": I thought I should type local (or even leave it blank to use a default or something), which I think was the issue. It adds zpool stuff to the drive and then dies, since you can’t add to the “special” local pool, as you’ve said in this thread.

21:21:58 ~ $ incus cluster join m700s: node2:
What IP address or DNS name should be used to reach this server? [default=10.123.0.12]:
What member name should be used to identify this server in the cluster? [default=0a3cd58c-a6ff-11e6-8813-9dd731041900]: node2
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": /dev/disk/by-id/wwn-0x5000c5009dea73ec
Choose "zfs.pool_name" property for storage pool "local": local
Error: Failed to join cluster: Failed to initialize member: Failed to initialize storage pools and networks: Failed to create storage pool "local": Failed to run: zpool create -m none -O compression=on local /dev/disk/by-id/wwn-0x5000c5009dea73ec: exit status 1 (cannot create 'local': pool already exists)


21:24:09 ~ $ incus cluster join m700s: node2:
What IP address or DNS name should be used to reach this server? [default=10.123.0.12]:
What member name should be used to identify this server in the cluster? [default=0a3cd58c-a6ff-11e6-8813-9dd731041900]: node2
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": /dev/disk/by-id/wwn-0x5000c5009dea73ec
Choose "zfs.pool_name" property for storage pool "local": int
Error: Failed to join cluster: Failed to initialize member: Failed to initialize storage pools and networks: Failed to create storage pool "local": Failed to run: zpool create -m none -O compression=on int /dev/disk/by-id/wwn-0x5000c5009dea73ec: exit status 1 (invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/wwn-0x5000c5009dea73ec-part1 is part of potentially active pool 'local')

The subsequent try was too little, too late, and now I’m in a bad spot. Using wipe-drive (or whatever it was) takes way longer than:

  • reboot into bios to disable Secure Boot
  • boot debian13 live usb
  • run wipefs -a on whichever the 2nd drive is
  • reboot into bios to enable Secure Boot
  • boot into IncusOS

Back to the client, choosing a name like “int”, and success!

21:31:26 ~ $ incus cluster join m700s: node2:
What IP address or DNS name should be used to reach this server? [default=10.123.0.12]:
What member name should be used to identify this server in the cluster? [default=0a3cd58c-a6ff-11e6-8813-9dd731041900]: node2
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": /dev/disk/by-id/wwn-0x5000c5009dea73ec
Choose "zfs.pool_name" property for storage pool "local": int

Thank you for the prompt response and also the great product!