Lxd-to-incus migration failed "No root device could be found"

I was trying to run the lxd-to-incus tool (with the LXD snap on Debian 11) and it failed while migrating the data:

root@winterfell:/etc/ssh# lxd-to-incus
=> Looking for source server
==> Detected: snap package
=> Looking for target server
==> Detected: systemd
=> Connecting to source server
=> Connecting to the target server
=> Checking server versions
==> Source version: 5.19
==> Target version: 0.4
=> Validating version compatibility
=> Checking that the source server isn't empty
=> Checking that the target server is empty
=> Validating source server configuration

The migration is now ready to proceed.
At this point, the source server and all its instances will be stopped.
Instances will come back online once the migration is complete.

Proceed with the migration? [default=no]: yes
=> Stopping the source server
=> Stopping the target server
=> Wiping the target server
=> Migrating the data
Error: Failed to move "/var/snap/lxd/common/lxd/" to "/var/lib/incus/": Failed to run: mv /var/snap/lxd/common/lxd/ /var/lib/incus/: exit status 1 (mv: cannot remove '/var/snap/lxd/common/lxd/storage-pools/default/images/d79756b3f0262191c5296b068d260c9d6375797777ba3bda06a00846e42fadbd': Device or resource busy
mv: cannot remove '/var/snap/lxd/common/lxd/storage-pools/default/images/4b6fa882f40cd22541d37617f21e6f2a196b1bfdd6bec5b723ac00cbe0f87c81': Device or resource busy
mv: cannot remove '/var/snap/lxd/common/lxd/storage-pools/default/images/bbd00b4cf7784f7081ec704269c4da5908f80ce4e9cbc3deac32f9cae5ff0bb0': Device or resource busy)

I have never seen this error before. It looks like a ZFS problem related to the images dataset or snapshots…

After that I inspected /var/lib/incus, and most of the content seemed to have been copied.

So I tried restarting incus.service and incus.socket. After solving the LZ4 error (found in the thread "Lxd-to-incus migration did not succeed, and I appear to have lost my instances"), Incus started.

incus list shows all the instances, but they cannot start:

root@winterfell:/var/lib/incus# incus list
+----------------+---------+------+------+-----------+-----------+
|      NAME      |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS |
+----------------+---------+------+------+-----------+-----------+
| calibre-server | STOPPED |      |      | CONTAINER | 3         |
+----------------+---------+------+------+-----------+-----------+
| calibre-web    | STOPPED |      |      | CONTAINER | 0         |


root@winterfell:/var/lib/incus# incus start calibre-web
Error: Unable to resolve container rootfs: lstat /var/snap/lxd/common/lxd/storage-pools/default/containers: no such file or directory
Try `incus info --show-log calibre-web` for more info

When I check /var/lib/incus/containers, the symlinks still point to the old LXD path:

root@winterfell:/var/lib/incus#  ls -lh /var/lib/incus/containers | grep calibre
lrwxrwxrwx 1 root root 72 Jan 27  2023 calibre-server -> /var/snap/lxd/common/lxd/storage-pools/default/containers/calibre-server
lrwxrwxrwx 1 root root 69 Jan 31  2023 calibre-web -> /var/snap/lxd/common/lxd/storage-pools/default/containers/calibre-web

root@winterfell:/var/lib/incus# incus storage list
+---------+--------+----------+-------------+---------+---------+
|  NAME   | DRIVER |  SOURCE  | DESCRIPTION | USED BY |  STATE  |
+---------+--------+----------+-------------+---------+---------+
| default | zfs    | cnt/lxd  |             | 29      | CREATED |
+---------+--------+----------+-------------+---------+---------+
| dock    | btrfs  | /dev/zd0 |             | 1       | CREATED |
+---------+--------+----------+-------------+---------+---------+

Any ideas on how to fix this would be much appreciated… thanks a lot.

Hmm, that’s an odd failure mode. I’ll look into those errors more to see if there’s something we can do to handle that better.

Fixing your system shouldn’t be too hard though.
The main thing you’ll need to do is go through:

  • /var/lib/incus/containers/
  • /var/lib/incus/containers-snapshots/
  • /var/lib/incus/virtual-machines/
  • /var/lib/incus/virtual-machines-snapshots/

Then you’ll need to correct the symlinks to point to a valid target.

In your example above, that means that calibre-server needs to point to /var/lib/incus/storage-pools/default/containers/calibre-server
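The symlink correction described above could be scripted roughly as follows. This is only a sketch under assumptions: the function name fix_links and the three variables are made up for illustration, the old and new prefixes are the paths quoted in this thread, and it assumes every stale link starts with the old snap path. Stop Incus and check the printed targets before relying on it.

```shell
#!/bin/sh
# Sketch: repoint instance symlinks that still reference the LXD snap path.
# OLD_PREFIX / NEW_PREFIX are taken from the paths shown in this thread;
# INCUS_DIR is a variable so the loop can be tried on a scratch copy first.
OLD_PREFIX="${OLD_PREFIX:-/var/snap/lxd/common/lxd}"
NEW_PREFIX="${NEW_PREFIX:-/var/lib/incus}"
INCUS_DIR="${INCUS_DIR:-/var/lib/incus}"

fix_links() {
    for dir in containers containers-snapshots virtual-machines virtual-machines-snapshots; do
        [ -d "$INCUS_DIR/$dir" ] || continue
        for link in "$INCUS_DIR/$dir"/*; do
            # Skip anything that is not a symlink (including an unmatched glob).
            [ -L "$link" ] || continue
            target=$(readlink "$link")
            case "$target" in
                "$OLD_PREFIX"/*)
                    # Swap the old prefix for the new one, keep the rest of the path.
                    new_target="$NEW_PREFIX${target#"$OLD_PREFIX"}"
                    echo "$link -> $new_target"
                    ln -sfn "$new_target" "$link"
                    ;;
            esac
        done
    done
}
```

Calling fix_links as root with the defaults would turn the calibre-web link from the listing above into /var/lib/incus/storage-pools/default/containers/calibre-web. Links that already point elsewhere are left untouched.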

Thanks for the answer, I changed the symlinks and the instances are working!

I looked into the “Device or resource busy” error, and it seems to be connected to a ZFS issue.

Checking the ZFS pool (the cnt/lxd dataset is configured as the default storage pool), the images that refused to move had mountpoints under the old LXD path instead of legacy:

root@winterfell:/etc/ssh# sudo zfs list cnt -r -d 3
NAME                                                                              USED  AVAIL     REFER  MOUNTPOINT
cnt                                                                              80.6G   144G       96K  /cnt
cnt/docker                                                                       32.2G   167G     8.84G  -
cnt/lxd                                                                          48.3G   144G      192K  legacy
cnt/lxd/buckets                                                                   192K   144G      192K  legacy
cnt/lxd/containers                                                               42.7G   144G      192K  legacy
cnt/lxd/containers/bazarr4k                                                       632M   144G      790M  legacy
cnt/lxd/containers/calibre-server                                                1.09G   144G     1.01G  legacy
cnt/lxd/containers/calibre-web                                                    822M   144G      842M  legacy
cnt/lxd/custom                                                                    168K   144G      168K  legacy
cnt/lxd/custom-snapshots                                                          168K   144G      168K  none
cnt/lxd/deleted                                                                  4.79G   144G      168K  legacy
cnt/lxd/deleted/buckets                                                           192K   144G      192K  legacy
cnt/lxd/deleted/containers                                                        168K   144G      168K  legacy
cnt/lxd/deleted/custom                                                            168K   144G      168K  legacy
cnt/lxd/deleted/images                                                           4.79G   144G      168K  legacy
cnt/lxd/deleted/virtual-machines                                                  168K   144G      168K  legacy
cnt/lxd/images                                                                    875M   144G      168K  legacy
cnt/lxd/images/3c138ae7eead4af5b5c7f3c5ce29758587eb6216849ad936981e7b0a6e791b64   217M   144G      217M  legacy
cnt/lxd/images/4b6fa882f40cd22541d37617f21e6f2a196b1bfdd6bec5b723ac00cbe0f87c81   187M   144G      192K  /var/snap/lxd/common/lxd/storage-pools/default/images/4b6fa882f40cd22541d37617f21e6f2a196b1bfdd6bec5b723ac00cbe0f87c81
cnt/lxd/images/bbd00b4cf7784f7081ec704269c4da5908f80ce4e9cbc3deac32f9cae5ff0bb0   206M   144G      192K  /var/snap/lxd/common/lxd/storage-pools/default/images/bbd00b4cf7784f7081ec704269c4da5908f80ce4e9cbc3deac32f9cae5ff0bb0
cnt/lxd/images/d79756b3f0262191c5296b068d260c9d6375797777ba3bda06a00846e42fadbd   264M   144G      192K  /var/snap/lxd/common/lxd/storage-pools/default/images/d79756b3f0262191c5296b068d260c9d6375797777ba3bda06a00846e42fadbd
cnt/lxd/snapshots                                                                 360K   144G      192K  none
cnt/lxd/snapshots/piwigo                                                          168K   144G      168K  none
cnt/lxd/virtual-machines                                                          192K   144G      192K  legacy

It took a restart and then unmounting the datasets via zfs:

sudo zfs unmount /var/snap/lxd/common/lxd/storage-pools/default/images/ .... 

and then I was finally able to purge the LXD snap and uninstall snapd.

So maybe that mountpoint situation is what prevented lxd-to-incus from working correctly.
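For anyone wanting to check for the same situation before migrating, a rough sketch of spotting image datasets with a path mountpoint is below. The helper name filter_path_mounts is made up for illustration, and cnt/lxd/images is the dataset from this system, so adjust it for yours. Whether such mountpoints are really what blocked the move is only my guess.

```shell
#!/bin/sh
# Sketch: read tab-separated "name<TAB>mountpoint" lines (the format produced
# by `zfs list -H -o name,mountpoint`) and print the datasets whose mountpoint
# is an absolute path rather than legacy, none or "-".
filter_path_mounts() {
    awk -F '\t' '$2 ~ /^\// { print $1 }'
}

# Example use on this system (left commented out so nothing runs by accident):
# zfs list -H -o name,mountpoint -r cnt/lxd/images | filter_path_mounts
# Each dataset printed could then be unmounted with: zfs unmount <dataset>
```

On the listing above this would print the three image datasets that mv complained about.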

Also, as a side note, /var/snap was a mounted ZFS dataset, rpool/var/snap (but no child datasets or snapshots existed at the time of migration), in case that makes any difference for the lxd-to-incus tool…

After I purged snapd and LXD, I cannot start instances. Maybe the reason is the missing shmounts?

root@winterfell:/var/lib/incus# incus start calibre-web
Error: Daemon failed to setup shared mounts base. Does security.nesting need to be turned on?
root@winterfell:/var/lib/incus# ls -lah shmounts -R
shmounts:
total 34K
drwxr-xr-x  2 root root  2 Jan  4 12:01 .
drwx--x--x 17 root root 21 Jan  4 12:01 ..
root@winterfell:/var/lib/incus#

There was a symlink to /var/snap/lxd/common/lxd instead of a shmounts directory, so I removed it and created a shmounts directory, but I have no idea how to recreate its contents…

Thanks for the help :slight_smile:

Yeah, that’s a pretty weird one as the images should normally never be mounted by LXD.
They get unpacked after download, then a @readonly snapshot is made. After that, the content of the image should never be accessed, so the image should never be mounted (it is just used as a clone source).

Can you show ls -lh /var/lib/incus/?

For the shmounts specifically, you can just delete it and restart incus, that will have it re-created.

lxd-to-incus would actually have deleted:

  • devices
  • devlxd
  • security
  • shmounts

So you may want to do the same, then restart incus and see if things behave.
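As a rough sketch of that cleanup, assuming Incus is stopped first and no instances are running: the function name clean_runtime_dirs and the INCUS_DIR variable are made up for illustration, and the four names are the ones listed above.

```shell
#!/bin/sh
# Sketch: remove the runtime directories that lxd-to-incus would have deleted,
# so that Incus re-creates them on its next start.
INCUS_DIR="${INCUS_DIR:-/var/lib/incus}"

clean_runtime_dirs() {
    for name in devices devlxd security shmounts; do
        path="$INCUS_DIR/$name"
        if [ -L "$path" ]; then
            # A leftover symlink into the old snap path: remove only the link.
            rm -f -- "$path"
        elif [ -e "$path" ]; then
            rm -rf -- "$path"
        fi
    done
}
```

A possible sequence would be systemctl stop incus.service incus.socket, then clean_runtime_dirs, then systemctl start incus.socket incus.service.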

Many thanks, it works now after deleting the directories you advised and restarting.

Thanks a lot for all the work! Now it remains to get my new VLAN network config working, but that’s for another topic :slight_smile: