For the “local” pool, I’m confused why the WD Black device is listed without -part11 in your pool state. I’ve just re-run through the tutorial referenced above and both my pool devices have the expected -part11 ending. The “local” pool is a bit special, and if you’ve managed to somehow get to your current state there’s probably some edge case that IncusOS needs to properly handle so the old, degraded device is automatically cleaned up.
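As a quick way to check this on any system, here is a rough sketch of a shell helper that reads `zpool status` output and prints every pool member whose name does not end in -part11. The function name `list_non_part11_members` is made up for illustration and is not part of IncusOS or ZFS. It assumes the usual `zpool status` layout, where the device table starts with a `NAME STATE ...` header row and is followed by an `errors:` line.

```shell
# Hypothetical helper (not part of IncusOS or ZFS): reads `zpool status`
# text on stdin and prints member names that do not end in -part11.
list_non_part11_members() {
  awk '
    # The device table starts right after the NAME/STATE header row.
    $1 == "NAME" && $2 == "STATE" { in_cfg = 1; seen_pool = 0; next }
    # The "errors:" line marks the end of the device table.
    /^errors:/ { in_cfg = 0 }
    !in_cfg || NF == 0 { next }
    # The first row of the table is the pool itself, not a device.
    !seen_pool { seen_pool = 1; next }
    # Skip grouping rows such as mirror-0 or raidz1-0 and section headings.
    $1 ~ /^(mirror|raidz[0-9]*|spare|replacing)-[0-9]+$/ { next }
    $1 ~ /^(logs|cache|spares|special|dedup)$/ { next }
    # Anything left is a leaf device; report it if the suffix is missing.
    $1 !~ /-part11$/ { print $1 }
  '
}
```

It could be used as `zpool status local | list_non_part11_members`. No output would mean every member carries the -part11 suffix. A physically missing device that ZFS shows only by its numeric GUID would also be printed, since that row has no suffix either.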
The failed drive wasn’t the main drive. Back in February, I expanded the pool into a RAID1 mirror by adding the now-degraded KINGSTON_SNVS1000G_50026B7685313913. No reinstall is needed, as the main drive (the other, working Kingston drive) still works!
I used the linked tutorial back when I installed this host. For the replacement I just ran this single edit command with a completely new WD Black. After that the scrub started and finished successfully, as seen in the state.
I could try to reproduce this on a virtual IncusOS tomorrow and write down all the steps I took.
Got it figured out: IncusOS wasn’t properly handling degraded devices that were physically missing, and was attempting to extend the existing storage pool rather than replace the degraded device. I’ve got a fix here: Handle missing degraded storage devices by gibmat · Pull Request #1025 · lxc/incus-os · GitHub. Once that lands in an IncusOS release, this particular issue shouldn’t happen again.
To fix up your existing IncusOS system, the easiest way is to boot a live Linux environment with ZFS support; you’ll likely need to disable SecureBoot temporarily first, but don’t wipe the SecureBoot keys. Then remove the second drive from the local zpool by hand. Because the zpool is already a RAID1 mirror, it should tolerate the removal without data loss.
zpool offline local nvme-WD_BLACK_SN8100_1000GB_25413E800328
zpool detach local nvme-WD_BLACK_SN8100_1000GB_25413E800328
zpool status local
After confirming ZFS isn’t warning about data loss, wipe the replacement drive.
Re-enable SecureBoot and boot back into IncusOS. At this point you should essentially be back where you started, ready to replace the failed drive with the new one. Wait for an IncusOS release that includes the fix, apply the update, reboot, and then replace your failed drive. Hopefully it will all work as expected.