Reading through the Incus documentation and source code, I’ve come to reflect a bit on the interplay of ZFS encryption with migration and also delegation. The dataset of the storage volume carrying a system container instance can contain further datasets, which can now bring their own encryptionroot or keylocation. This poses difficulties for the automatic replication of instances/datasets across storage pools and Incus nodes, which have already surfaced here and there in the past.
This document is mainly a collection of resources on the subject, an investigation into how they play together and a sketch of possible resolution vectors that come to mind.
- Current situation
- Edge cases
- Resolution vectors
I hope that by following this reflection we end up with a better understanding of the different parts at play.
Current situation
The Incus ZFS driver is a wrapper with custom logic around the ZFS binaries zfs and zpool, and it inherits all of their side effects. It contains special handlers for volumes and migrations.
Limitations of the ZFS driver
It has a few limitations with regard to restoring from older snapshots, observing I/O quotas and feature support across different ZFS versions.
Encryption of a storage pool
The ZFS encryption of a storage pool is currently transparent to Incus, as long as the pool is unlocked beforehand.
We will take this presumption at face value later.
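As a minimal sketch, assuming a hypothetical pool named mpool, unlocking beforehand could look like this:
$ zfs load-key -r mpool        # load the keys for all encryption roots below mpool
$ zfs mount -a                 # mount the now unlockable datasets
$ systemctl start incus.service incus.socket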
Replication of an instance to another host
In daily operation, an Incus server administrator deals with preparation for disaster recovery scenarios. This includes maintaining several instance copies of various forms. We are primarily looking at using
copy, which gives us the means to act on “instances within or in between servers”. The --refresh* and --stateless options control the flow of the operation. It would be nice if we could schedule replication the way we can schedule snapshots.
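As a brief sketch, with a hypothetical instance c1 and remote backup-host, a repeated replication run could look like this:
$ incus copy c1 backup-host:c1                         # initial full copy to the remote server
$ incus copy c1 backup-host:c1 --refresh               # later runs only transfer the changes
$ incus copy c1 backup-host:c1 --refresh --stateless   # additionally skip the runtime state
The scheduling of the refresh runs is left to an external timer today.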
Other ways to move data between Incus hosts
Instead of building solely on snapshotting, a full export/import cycle may be preferable for a complete backup. Instances and storage volumes (and storage buckets, not as relevant here) can be imported and exported.
- Instances: import, export
- Volumes: storage volume import, storage volume export
- (Buckets: storage bucket import, storage bucket export)
Encrypting your backups separately is highly recommended. Here we want to focus on the replication case above, as it involves the delicate situation of native encryption handling in a distributed setting.
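A hedged sketch of such a cycle, with made-up instance, pool and volume names:
$ incus export c1 c1-backup.tar.gz                        # full instance backup, including snapshots
$ incus import c1-backup.tar.gz                           # restore on this or another server
$ incus storage volume export default vol1 vol1-backup.tar.gz
$ incus storage volume import default vol1-backup.tar.gz
# encrypt the resulting tarballs separately before shipping them off-site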
Migration of an instance to another host
Concerning the migration of an instance, we can focus on the case of moving existing Incus instances between servers.
move also operates on “instances within or in between servers”. The --instance-only and --stateless options control the flow here.
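Again a short sketch with hypothetical names:
$ incus move c1 other-host:c1                    # move the instance including its snapshots
$ incus move c1 other-host:c1 --instance-only    # leave the snapshots behind
$ incus move c1 other-host:c1 --stateless        # do not transfer the runtime state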
Snapshots of an instance and volumes
snapshot helps to maintain the lifecycle of instance snapshots. This becomes especially useful around and together with a snapshot restore.
Individual snapshots of the instance and storage volumes can be handled independently with the storage volume snapshot command, e.g. as a lightweight way to back up custom storage volumes.
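A short sketch, assuming the instance c1 and the pool default with a custom volume vol1 from before; note that snapshots.schedule offers the scheduling that copy/move currently lack:
$ incus snapshot create c1 pre-upgrade
$ incus snapshot restore c1 pre-upgrade
$ incus config set c1 snapshots.schedule "@daily"               # automatic instance snapshots
$ incus storage volume snapshot create default vol1 pre-upgrade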
Replication and migration of volumes to another host
Using ZFS dataset snapshot send/receive for the mechanics, Incus also knows how to move or copy storage volumes between Incus servers.
Similar to instances above, the transfer of volumes can be conducted with the storage volume copy and storage volume move commands.
The storage volume copy command knows the same --refresh* parameters as above, which similarly qualifies it for automatic scheduling.
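A sketch with the same hypothetical names:
$ incus storage volume copy default/vol1 backup-host:default/vol1            # initial copy
$ incus storage volume copy default/vol1 backup-host:default/vol1 --refresh  # incremental refresh
$ incus storage volume move default/vol1 other-host:default/vol1             # move instead of copy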
Internals
The Incus migration API covers the clients and the server. The server also contains a lower-level ZFS driver, to which we will return at the end. The high-level API is really concise.
High-level definitions and invocations of the migration API
- incus/client/incus_instances.go at 5d701f72540897b947d75c45a814f0379d5f3f4c · lxc/incus · GitHub
- incus/cmd/incusd/instances_post.go at 5d701f72540897b947d75c45a814f0379d5f3f4c · lxc/incus · GitHub
- incus/cmd/incus/move.go at 5d701f72540897b947d75c45a814f0379d5f3f4c · lxc/incus · GitHub
The tests show us how it is being used.
Watch out for an easter egg in the first file, migration.sh.
- incus/test/suites/migration.sh at main · lxc/incus · GitHub
- incus/test/suites/clustering_move.sh at main · lxc/incus · GitHub
- incus/test/suites/container_move.sh at main · lxc/incus · GitHub
The tests do not distinguish between separate cases for un-/encrypted source and/or target volumes, with or without recursive datasets. Where encryption is present, we additionally distinguish whether the encryption keys are loaded or not.
- Volume: source, target
- Recursive: yes, no
- Encryption: not present, key not available, key available
This equates to 2 ✕ 2 ✕ 3 = 12 possible cases. Please correct me if this is wrong.
In the case of recursive datasets, e.g. when using delegation, creative combinations of encryptionroot and keystatus on host and guest systems can be assumed. This point will come up again.
Edge cases
Both Incus and ZFS bring edge cases with their implementations, whose side effects are not strictly isolated from each other and cannot be. This is due to the tight coupling of the Incus storage pool and volume mechanics with the pool and dataset mechanics of the underlying file system.
While the high-level Incus API streamlines operations by making educated, opinionated choices, it also carries a weight of presumptions which may not hold in all potential use cases, especially those employing ZFS encryption. Incus does not offer an API to manipulate the encryption properties of a storage pool.
One problematic aspect of this is the perceived general instability of the ZFS encryption implementation, which can only be called rather incomplete from an operational point of view. The main hindrances are that the Initialisation Vector (IV) of a dataset is tied to its lifetime and that it is not possible to provide alternative key slots. It is also often not known that inheriting a key does not just tie the secret for opening the key slot to the descendants, in the sense that it is merely referenced and not copied, but that descendant datasets become tied to the Initialisation Vector only present in their encryptionroot dataset higher up (see the small illustration after the issue list below).
- Multiple encryption keys/key methods · Issue #6824 · openzfs/zfs · GitHub
- Encryption keys/roots management tools needed · Issue #12649 · openzfs/zfs · GitHub
- Tool for Emergency Master-Key Recovery · Issue #15952 · openzfs/zfs · GitHub
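As a minimal illustration of that inheritance behaviour, with made-up pool and dataset names:
$ zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt tank/enc
$ zfs create tank/enc/child                        # inherits encryption from the parent
$ zfs get -Ho value encryptionroot tank/enc/child
tank/enc
# zfs change-key tank/enc/child would turn the child into its own encryptionroot,
# but there is no tooling to add a second key slot or to back up the wrapping material.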
These two factors often work against each other when an Incus migration meets encrypted pools.
Incus constraints
When the ZFS version is sufficiently new, all transfers will be considered raw (-w).
This poses the first challenge for encrypted workloads.
Any send of a dataset with sub-datasets will be provided as a replication stream package (-R); we find this pattern in other places of the code base as well.
This poses the second challenge in (partially) encrypted environments.
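Put together, the transfer the driver performs is conceptually close to the following, with hypothetical dataset and host names (the actual invocation in the driver differs):
$ zfs snapshot -r mpool/USERDATA/incus/containers/instance@migration
$ zfs send -w -R mpool/USERDATA/incus/containers/instance@migration \
    | ssh target-host zfs receive -F npool/USERDATA/incus/containers/instance
# -w keeps the blocks encrypted in transit, -R pulls in all descendant datasets and snapshots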
ZFS constraints
ZFS is not a distributed file system per se (see the former OpenEBS cStor) and all snapshot and replication handling has to be done by external systems. ZFS itself will not manage the life cycle of snapshots, sends and receives for you, but it will carry them out.
We know that encryption handling for raw replication streams comes with dangers and pitfalls that have in the past led to loss of user data. Prominent examples were given above.
The details of these currently open issues are worrisome, but insightful:
- zfs receive -F cannot be used to destroy an encrypted filesystem · Issue #6793 · openzfs/zfs · GitHub
- Replicating encrypted child dataset + change-key + incremental receive overwrites master key of replica, causes permission denied on remount · Issue #12614 · openzfs/zfs · GitHub
- Use recv_fix_encryption_hierarchy for non-recursive send/receive so that encryptionroot is inherited on received side · Issue #15687 · openzfs/zfs · GitHub
- Enable zfs send|receive for encrypted root dataset · Issue #17724 · openzfs/zfs · GitHub
What is especially worrisome is that there are so many ongoing problems with ZFS encryption, here and above, with no resources currently being dedicated to them. The encryption code can just as well be considered unmaintained, as no one is directly appointed to it.
In other places it was said that this part of the Incus code is fairly new. There is also the existence of issues like Repair encryption hierarchy of 'send -Rw | recv -d' datasets that do not bring their encryption root · Issue #12000 · openzfs/zfs · GitHub, in which receiving with -d (sister of -e) for a while led to raw sends of replication packages arriving without their encryptionroot, and with that without access to their IV. Tricky, but fixed in the meantime and not in use by Incus. Still, it is a cautionary example of just how recent the fixes and ongoing concerns with the implementation of ZFS encryption are, despite it performing well within its bounds.
Both Incus and ZFS are not stable around handling replication and encryption together, each one amplifying the downsides of the other. Careful deliberation might help to identify some cases in which we can work around or beyond these limitations. We have seen many modifiable constraints above. Which of them can we perhaps loosen to bring movement into the situation?
Forward
With a bit of forum exegesis and against the background of former and known regressions, we might be able to boil down the actual error condition seen on replicated instances, where the target dataset does not decrypt despite available key material, to a possible loss of the original IV.
Side-effects and perceived regressions
Let’s consider this source layout:
mpool # per convention unencrypted pool, encryptionroot -
mpool/USERDATA # second-level encryptionroot, per convention, holds IV
mpool/USERDATA/incus # Incus storage pool root, encryptionroot mpool/USERDATA
mpool/USERDATA/incus/containers # encryptionroot mpool/USERDATA
mpool/USERDATA/incus/containers/instance # encryptionroot mpool/USERDATA
And this target layout:
npool # per convention unencrypted pool, encryptionroot -
npool/USERDATA # encryptionroot, holds _different_ IV or unencrypted
npool/USERDATA/incus # encryptionroot npool/USERDATA or unencrypted
npool/USERDATA/incus/containers # encryptionroot npool/USERDATA or unencrypted
npool/USERDATA/incus/containers/instance # replicated instance, raw
The replicated instance npool/USERDATA/incus/containers/instance needs the mpool/USERDATA IV to decrypt, which isn’t available. Is it possible that during copy and move operations we keep the key material for unlocking the IV, and the IV itself, intact, but detach the dataset from its encryption hierarchy during the raw send?
Would that hypothesis hold? Can we find counter-examples?
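One way to probe it, sketched with the hypothetical layout from above, is to compare the encryption properties on both ends after a replication:
$ zfs get -r -o name,property,value encryptionroot,keystatus,keylocation \
    mpool/USERDATA/incus/containers/instance          # on the source host
$ zfs get -r -o name,property,value encryptionroot,keystatus,keylocation \
    npool/USERDATA/incus/containers/instance          # on the target host
$ zfs load-key npool/USERDATA/incus/containers/instance   # expected to fail if the hypothesis holds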
Returning to the introductory quote, can we still assume that:
I hope this post contributes to shining light at some edge cases that we invite for when using raw sends with encrypted datasets.
Here we are ultimately blocked by the lack of key material tooling in ZFS. Every LUKS admin is used to keeping a backup of their encryption headers. Key exchange, possibly through a KMS like OpenBao, would otherwise seem a viable option.
Could the availability of delegation also mean that, during the initial creation of an instance’s dataset, the storage driver provisions its own key material and places the encryptionroot right at the level of that dataset, so that the IV never leaves it? Would that count as a confidential/trusted compute hardening of the Incus cloud platform?
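A hedged sketch of what that could look like if done by hand today, with a made-up key path (this is not something the driver currently does):
$ head -c 32 /dev/urandom > /var/lib/incus/keys/instance.key      # per-instance wrapping key
$ zfs create -o encryption=on -o keyformat=raw \
    -o keylocation=file:///var/lib/incus/keys/instance.key \
    mpool/USERDATA/incus/containers/instance
$ zfs get -Ho value encryptionroot mpool/USERDATA/incus/containers/instance
mpool/USERDATA/incus/containers/instance
# the dataset is its own encryptionroot, so a raw send carries everything needed to unlock it
# on the target, provided the key file is exchanged over a separate channel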
Implicitly, Incus does now allow users to bring their own keys, which complicates the situation even further. The key will naturally be available to the host system, but a guest cannot access other users’ keys. This is now a valid course of action:
$ incus storage show default | yq '@json' | jq '{config, driver}' | yq -P
config:
  source: rpool/ROOT/ubuntu_d4psvq/var/incus
  volatile.initial_source: rpool/ROOT/ubuntu_d4psvq/var/incus
  volume.zfs.delegate: "true"
  zfs.pool_name: rpool/ROOT/ubuntu_d4psvq/var/incus
driver: zfs
$ incus launch images:ubuntu/noble u1 -c security.nesting=true -c security.syscalls.intercept.mknod=true -c security.syscalls.intercept.setxattr=true
$ incus exec u1 -- bash
Continuing inside:
root@u1:~# apt update
root@u1:~# apt upgrade -y
root@u1:~# apt install -y curl wget
root@u1:~# curl -fsSL https://pkgs.zabbly.com/key.asc -o /etc/apt/keyrings/zabbly.asc
root@u1:~# sh -c 'cat <<EOF > /etc/apt/sources.list.d/zabbly-incus-stable.sources
Enabled: yes
Types: deb
URIs: https://pkgs.zabbly.com/incus/stable
Suites: $(. /etc/os-release && echo ${VERSION_CODENAME})
Components: main
Architectures: $(dpkg --print-architecture)
Signed-By: /etc/apt/keyrings/zabbly.asc
EOF'
root@u1:~# apt update
root@u1:~# apt install -y incus zfsutils-linux jq yq
root@u1:~# zfs list -Ho name /
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1
root@u1:~# zfs create -o canmount=off -o encryption=on -o keylocation=prompt -o keyformat=passphrase rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA
Enter new passphrase:
Re-enter new passphrase:
root@u1:~# zfs get -Ho value keystatus rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA
available
Back on the host system, we see the same, which is always to be kept in mind:
$ zfs get -Ho value keystatus rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA
available
Continuing inside:
root@u1:~# systemctl stop incus.service incus.socket # not yet initialised; just in case to start fresh
root@u1:~# rm -rf /var/lib/incus
root@u1:~# zfs create -o mountpoint=/var/lib/incus rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus
root@u1:~# zfs create rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools
root@u1:~# zfs create rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1
root@u1:~# systemctl start incus.service incus.socket
root@u1:~# incus admin init --preseed <<< '
config: {}
networks:
- name: incusbr0
  type: bridge
  config:
    ipv4.address: auto
    ipv6.address: auto
storage_pools:
- config:
    source: rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1
  description: ""
  name: default
  driver: zfs
profiles:
- devices:
    eth0:
      name: eth0
      network: incusbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
cluster: null
'
root@u1:~# incus storage show default | yq '@json' | jq 'fromjson | {config, driver}' | yq -y
config:
  source: rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1
  volatile.initial_source: rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1
  zfs.pool_name: rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1
driver: zfs
# We cannot delegate ZFS to it another level down, so we skip it.
root@u1:~# incus launch images:ubuntu/noble u2 -c security.nesting=true -c security.syscalls.intercept.mknod=true -c security.syscalls.intercept.setxattr=true
Launching u2
root@u1:~# zfs get -Ho value encryptionroot rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA/incus/storage-pools/s1/containers/u2
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/USERDATA
We find that the encryptionroot, and with it the IV, of the u2 container’s dataset on the u1 Incus host lives outside of the root of its storage pool. When the Incus host in the u1 container generates and sends a replication stream package from the u2 dataset, the IV at the encryptionroot is lost, and the dataset of a replicated Incus instance that originated from an encrypted dataset cannot be unlocked anymore. There is no tooling to work around this.
This kind of nested setup becomes useful when someone wants to offer their users the same level of logical separation and flexibility that they use themselves.
root@u1:~# incus launch images:fedora/43 f1 -c security.nesting=true -c security.syscalls.intercept.mknod=true -c security.syscalls.intercept.setxattr=true
root@u1:~# incus exec f1 sh -- -c 'dnf install --assumeyes podman'
root@u1:~# incus exec f1 sh -- -c 'podman run hello-world | head -n 1'
!... Hello Podman World ...!
One way forward that meets the constraints posed by Incus, see the following quote, is to have it deal with encryption itself by provisioning (separate) key material for each creation of an instance’s dataset. A new IV is then also generated and replicated together with the rest of the dataset within the replication stream package. The assumption is that the replicated dataset of an encrypted instance can later successfully be unlocked.
Another, cruder way could be to create the encryptionroot in the dataset for the Incus storage pool. Given careful planning, this dataset could be raw-replicated across Incus hosts before they initialise their storage pools on it. The pool datasets would then all carry the same IV and thus allow their child datasets to use it for decryption as their interchangeable encryptionroot, which in theory would also be available to the datasets of migrated Incus instances.
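A hedged sketch of that idea, with made-up pool and host names and before any incus admin init has run:
$ zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt mpool/incus-pool
$ zfs snapshot mpool/incus-pool@seed
$ zfs send -w mpool/incus-pool@seed | ssh host-b zfs receive npool/incus-pool
# then, on host-b:
$ zfs load-key npool/incus-pool                   # same passphrase, and per the hypothesis the same IV
# afterwards each host points its Incus storage pool source at its local <pool>/incus-pool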
Assuming the original hypothesis stands uncorrected and this is the behaviour that is at work here.
Commenting on the first premise, maybe there actually are situations where trust exists? The scenarios described here often came from people in possession of valid credentials. Between my own hosts (in a cluster) there is already a trust relationship, and I am reinforcing it by providing additional proof with the supplied key material. Possibly some of the guarantees that Incus tries to achieve here can be shifted towards adding encryption into the equation.
If the zero-trust requirement can be pushed one layer down, to ZFS, it would loosen the restrictions and constraints of the higher-level implementation, Incus. That could, for some cases, free us from the assumption that we always want to conduct raw sends. Regular incremental sends, decrypting in transit and re-encrypting the dataset at rest when keys are loaded on both ends, work just fine, but need external scheduling and especially recursive enumeration of snapshots on source and target with their respective incremental ranges.
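A sketch of such a non-raw incremental transfer, with hypothetical dataset and host names and keys loaded on both ends:
$ zfs snapshot mpool/USERDATA/incus/containers/instance@sync2
$ zfs send -I @sync1 mpool/USERDATA/incus/containers/instance@sync2 \
    | ssh host-b zfs receive npool/USERDATA/incus/containers/instance
# the data travels decrypted and is re-encrypted on arrival under the target's own
# encryptionroot and IV; in-transit protection is left to ssh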
Maybe the assumption that load-key would be sufficient for a raw encrypted dataset on a remote machine does not hold up? “Without too much success” and “pretty tricky” suggest that the mental model used when developing against this surface did not often match up with reality. Was IV placement considered?
A physical revert, if I understand it correctly, happens with rolling back a dataset. This worked:
root@u1:/srv# zfs mount rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT
cannot mount 'rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT': encryption key not loaded
root@u1:/srv# zfs list -t snap rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT
NAME USED AVAIL REFER MOUNTPOINT
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@0 144K - 192K -
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@1 112K - 228K -
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@2 0B - 264K -
root@u1:/srv# zfs rollback -r rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@1
root@u1:/srv# zfs list -t snap rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT
NAME USED AVAIL REFER MOUNTPOINT
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@0 144K - 192K -
rpool/ROOT/ubuntu_d4psvq/var/incus/containers/u1/ROOT@1 0B - 228K -
Did I get something wrong? Maybe encrypted --refresh syncs can happen fine as well?
Is this observation possibly another occurrence of mixing up the key slot of a dataset with its IV, which will be different on both servers even when set up with the same key material?
Potential resolution vectors
The methods concerned with sending and receiving in the ZFS driver, and the places they are called from, would be the natural place to start. Could making them encryption-aware possibly help with working around some of the deficiencies of using Incus together with encrypted storage pools, wherever IV and encryptionroot may live?
That’s what I was wondering. Many thanks for your interest and keep up the good spirit.