After creating a second zfs pool and re-creating my containers on the new pool, everything seemed to go smoothly. I then exported/re-imported my images and deleted the old pool. So far so good.
Now, when I try to create a new container from the imported image, the operation fails due to a zfs error, as per below:
t=2019-02-11T16:25:23+0000 lvl=info msg="Creating container" ephemeral=false name=new_container
t=2019-02-11T16:25:24+0000 lvl=info msg="Created container" ephemeral=false name=new_container
t=2019-02-11T16:25:57+0000 lvl=eror msg="zfs rename failed: umount: /var/lib/lxd/storage-pools/pool00/containers/random_container: target is busy\n (In some cases useful info about processes that\n use the device is found by lsof(8) or fuser(1).)\ncannot unmount '/var/lib/lxd/storage-pools/pool00/containers/random_container': umount failed\n"
t=2019-02-11T16:25:57+0000 lvl=info msg="Deleting container" created=2019-02-11T16:25:23+0000 ephemeral=false name=new_container used=1970-01-01T00:00:00+0000
t=2019-02-11T16:25:57+0000 lvl=info msg="Deleted container" created=2019-02-11T16:25:23+0000 ephemeral=false name=new_container used=1970-01-01T00:00:00+0000
If I shut down the referenced “random_container”, another one is reported.
I am baffled as to why zfs would be trying to rename an existing dataset, let alone why it fails.
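Alongside the lsof(8)/fuser(1) hint in the log, one way to narrow down a "target is busy" unmount is to check which processes still list the path in their mount table. This is only a rough sketch: mount_holders and the PROC_ROOT override are made-up names for illustration, and it assumes the usual /proc/<pid>/mountinfo layout where the mount point is a space-separated field.

```shell
# Print the PIDs whose mount namespace still contains the given mount point.
# PROC_ROOT is a hypothetical override (defaults to /proc), handy for dry runs.
mount_holders() {
    local path="$1" f pid
    for f in "${PROC_ROOT:-/proc}"/[0-9]*/mountinfo; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$f" ] || continue
        # Match the path as a whole field (surrounded by spaces).
        if grep -qF -- " $path " "$f"; then
            pid="${f%/mountinfo}"
            echo "${pid##*/}"
        fi
    done
}

# Example (run as root so every process is readable):
#   mount_holders /var/lib/lxd/storage-pools/pool00/containers/random_container
```

If the PIDs that come back belong to other containers, that would be consistent with the mount being held inside their mount namespaces rather than by a process on the host.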
Ok, you’d indeed need the 4.15 kernel from bionic (hwe kernel) to have the kernel change that helps with this.
Can you show zfs list -t all?
Your symptoms sound like you have the right image in deleted/images and that LXD is attempting to move it back to images, hitting that annoying ZFS behavior…
If that’s the case, you pretty much have two solutions:
Stop all containers, launch, then restart all containers; that should take care of that image for good.
Upgrade to the HWE kernel (4.15) and reboot, which should then help ZFS deal with that weird behavior.
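The first option could be scripted roughly as follows. This is a sketch, not a tested procedure: relaunch_with_all_stopped is a made-up helper name, the image and container names are placeholders you pass in, and it assumes the lxc client supports csv output for lxc list.

```shell
# Stop every running container, launch from the image, then restart
# only the containers that were running before.
relaunch_with_all_stopped() {
    local image="$1" new_name="$2"
    local running c status

    # Remember which containers are currently running.
    running="$(lxc list --format csv -c ns | awk -F, '$2 == "RUNNING" {print $1}')"

    # Stop them so nothing keeps the ZFS mounts busy.
    for c in $running; do
        lxc stop "$c"
    done

    # Launch while the pool is quiet; keep the exit status for the caller.
    lxc launch "$image" "$new_name"
    status=$?

    # Bring the previously running containers back up.
    for c in $running; do
        lxc start "$c"
    done
    return "$status"
}

# Example (placeholder names):
#   relaunch_with_all_stopped my-image new_container
```

Recording the running set first avoids starting containers that were deliberately stopped.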
So, bad news. Restarted on kernel 4.15.0-45 and still the same behavior.
If I try to destroy the dataset after deleting the image, I get:
zfs destroy pool00/deleted/images/db29f6f1afde60a2d86ff554712baa0a92fbc4bed2f2ff2910213e4e8bde4bde
cannot destroy 'pool00/deleted/images/db29f6f1afde60a2d86ff554712baa0a92fbc4bed2f2ff2910213e4e8bde4bde': filesystem has children
use '-r' to destroy the following datasets:
pool00/deleted/images/db29f6f1afde60a2d86ff554712baa0a92fbc4bed2f2ff2910213e4e8bde4bde@readonly
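Before reaching for -r, it may be worth checking whether anything was cloned from that @readonly snapshot. The sketch below reads the snapshot's clones property; image_clones is a made-up helper name, and the idea that containers created from the image show up as clones of this snapshot is an assumption about how LXD lays out its ZFS datasets.

```shell
# List the datasets cloned from an image's @readonly snapshot, one per line.
image_clones() {
    local dataset="$1"
    # -H drops the header, -o value prints only the property value;
    # "clones" is a comma-separated list on snapshots.
    zfs get -H -o value clones "${dataset}@readonly" | tr ',' '\n'
}

# Example, using the dataset from the output above:
#   image_clones pool00/deleted/images/db29f6f1afde60a2d86ff554712baa0a92fbc4bed2f2ff2910213e4e8bde4bde
```

If this prints container datasets, those containers still depend on the image, and the snapshot cannot go away until they are destroyed or promoted.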