Unable to migrate LXD 5.21 with microceph to Incus 6.0

I’m planning to migrate my LXD cluster with microceph storage to Incus stable (6.0).
My cluster is built on top of 7x RPi 4 SBCs with Ubuntu 23.10 as the host OS.
Before migrating this cluster, I tried to evaluate the migration process in a virtual lab built from 3 Ubuntu 23.10 VMs, which partially emulates my production cluster. However, I was not able to complete the migration. The lxd-to-incus process completes successfully on the secondary nodes but fails on the primary node with the following error:

Error: Failed to restore "vm-01": Failed to start instance "dns-01": Failed to run: rbd --id admin --cluster ceph --pool lxd map container_infra_dns-01: exit status 1 (rbd: warning: can't get image map information: (13) Permission denied
rbd: failed to add secret 'client.admin' to kernel
rbd: map failed: (1) Operation not permitted)

All nodes in the Incus cluster remain in the “EVACUATED” state after that.
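
The failing command can also be tried by hand on the host to confirm the problem is not Incus-specific (the image name below is taken from the error above):

sudo rbd --id admin --cluster ceph --pool lxd map container_infra_dns-01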

Test environment configuration:

  • 3x Hyper-V VMs with 4 GB RAM; each VM has a dedicated virtual disk for the Ceph cluster.
  • Software: Ubuntu 23.10; Snaps: LXD 5.21.1-43998c6; MicroCeph 18.2.0+snap450240f5dd (reef/stable)
  • Ceph cluster: 3x OSD (3x 8 GB); each node runs the OSD, monitor, MGR, and MDS services
  • The LXD cluster consists of two storage pools: ceph and cephfs (used for shared volumes), three projects, and eight (8) containers.
  • Incus: incus/jammy,now 6.0-202404040304-ubuntu22.04 amd64

Below are the lxd-to-incus outputs from each node:

1st node:

sudo lxd-to-incus
=> Looking for source server
==> Detected: snap package
=> Looking for target server
==> Detected: systemd
=> Connecting to source server
=> Connecting to the target server
=> Checking server versions
==> Source version: 5.21.1
==> Target version: 6.0.0
=> Validating version compatibility
=> Checking that the source server isn't empty
=> Checking that the target server is empty
=> Validating source server configuration

The migration is now ready to proceed.

A cluster environment was detected.
Manual action will be needed on each of the server prior to Incus being functional.
The migration will begin by shutting down instances on all servers.

It will then convert the current server over to Incus and then wait for the other servers to be converted.

Do not attempt to manually run this tool on any of the other servers in the cluster.
Instead this tool will be providing specific commands for each of the servers.
Proceed with the migration? [default=no]: yes
=> Stopping all workloads on the cluster
==> Stopping all workloads on server "vm-01"
==> Stopping all workloads on server "vm-02"
==> Stopping all workloads on server "vm-03"
=> Stopping the source server
=> Stopping the target server
=> Wiping the target server
=> Migrating the data
=> Migrating database
=> Writing database patch
=> Running data migration commands
=> Cleaning up target paths
=> Starting the target server
=> Waiting for other cluster servers

Please run `lxd-to-incus --cluster-member` on all other servers in the cluster

The command has been started on all other servers? [default=no]: yes

=> Waiting for cluster to be fully migrated
=> Checking the target server
=> Restoring the cluster
==> Restoring workloads on server "vm-01"
Error: Failed to restore "vm-01": Failed to start instance "dns-01": Failed to run: rbd --id admin --cluster ceph --pool lxd map container_infra_dns-01: exit status 1 (rbd: warning: can't get image map information: (13) Permission denied
rbd: failed to add secret 'client.admin' to kernel
rbd: map failed: (1) Operation not permitted)

2nd node:

sudo lxd-to-incus --cluster-member
=> Looking for source server
==> Detected: snap package
=> Looking for target server
==> Detected: systemd
=> Connecting to the target server
=> Stopping the source server
=> Stopping the target server
=> Wiping the target server
=> Migrating the data
=> Migrating database
=> Cleaning up target paths
=> Starting the target server
=> Waiting for cluster to be fully migrated
=> Checking the target server
Uninstall the LXD package? [default=no]:

3rd node:

sudo lxd-to-incus --cluster-member
=> Looking for source server
==> Detected: snap package
=> Looking for target server
==> Detected: systemd
=> Connecting to the target server
=> Stopping the source server
=> Stopping the target server
=> Wiping the target server
=> Migrating the data
=> Migrating database
=> Cleaning up target paths
=> Starting the target server
=> Waiting for cluster to be fully migrated
=> Checking the target server
Uninstall the LXD package? [default=no]:

incus cluster list output:
incus cluster list -f compact

  NAME   URL                 ROLES            ARCHITECTURE  FAILURE DOMAIN  DESCRIPTION  STATE      MESSAGE
  vm-01  https://vm-01:8443  database-leader  x86_64        default                      EVACUATED  Unavailable due to maintenance
                             database
  vm-02  https://vm-02:8443  database         x86_64        default                      EVACUATED  Unavailable due to maintenance
  vm-03  https://vm-03:8443  database         x86_64        default                      EVACUATED  Unavailable due to maintenance

I tried this migration twice, rebuilding the test environment from scratch each time.

P.S.
At first sight, the issue seems to be related to this microceph bug: microceph.rbd map · Issue #145 · canonical/microceph · GitHub

Can you try installing the ceph-common package on your systems if you don’t have it already?

Then once that’s there, you’ll want to make sure that the ceph and rbd commands are handled by that package and not by microceph. I haven’t used snaps in a little while, but there may be snap unalias ceph and snap unalias rbd commands you can use?

Basically the goal is to keep microceph working internally the way it is, but have normal ceph and rbd commands that do not rely on the snap. Those will then read from /etc/ceph, so you’ll need to make sure you have a /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring.

You should be able to find the current microceph versions of both of those files somewhere in /var/lib/microceph.
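
A rough sketch of those first steps, to run on every node (the alias names are a guess, so the unalias calls may simply be no-ops):

sudo apt install ceph-common
sudo snap unalias ceph 2>/dev/null || true
sudo snap unalias rbd 2>/dev/null || true
# The host-side ceph/rbd tools will then read /etc/ceph/ceph.conf and
# /etc/ceph/ceph.client.admin.keyring, which still need to be populated
# from the MicroCeph copies.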

It’s a bit odd that microceph provides the rbd command while being fully aware that rbd map and rbd unmap don’t work due to snap confinement…

As for the lxd-to-incus run, it failed in the cluster restore stage, which is all the way at the end of the migration, so if that cluster wasn’t a throw-away test, you’d still be fine. You’d need to get the ceph and rbd commands to behave, then manually run incus cluster restore NAME for each server, and you’ll be done with the migration.
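
For example, with the three test members from this thread:

# once the host-side ceph and rbd commands work again:
incus cluster restore vm-01
incus cluster restore vm-02
incus cluster restore vm-03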

Thank you Stéphane for your quick reply and suggestions.
I have mixed results now after trying steps you suggested.

What I did:

  • restored VMs to snapshot before migration to Incus
  • installed ceph-common package to all nodes
  • added links to the microceph config files (/var/snap/microceph/current/conf) into /etc/ceph (see the sketch after this list)
  • checked ceph health; rebooted the nodes one-by-one to verify that the Ceph cluster works properly after installing the ceph-common package
  • ran lxd-to-incus
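
Roughly, the per-node commands looked like this (a sketch from my notes; the MicroCeph config path matches my snap revision and may differ on yours):

sudo mkdir -p /etc/ceph
sudo ln -s /var/snap/microceph/current/conf/ceph.conf /etc/ceph/ceph.conf
sudo ln -s /var/snap/microceph/current/conf/ceph.keyring /etc/ceph/ceph.keyring
which ceph rbd    # should resolve to /usr/bin from ceph-common, not the snap
ceph -s           # expect HEALTH_OK before running lxd-to-incus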

The migration succeeded without any issues. All nodes in the Incus cluster became available and all instances are running after the migration.
Good result? Yes and no.

There is an issue when a node shuts down or restarts: incusd can’t be shut down properly, and the system kills it after a timeout. The system constantly writes error messages to the console like:

libceph: connect (1) <vm-01 ip>:6789 error -101
libceph: mon1 (1) <vm-01 ip>:6789 error -101
libceph: osd3 (1) <vm-01 ip>:6803 error -101
...

So the system waits up to 10 minutes before killing incusd and completing the shutdown/restart.

I observed that snap shuts down the microceph daemons before the LXD or Incus services shut down. This may be the root cause of the issue.

Yep, looks good.

Indeed sounds like your MicroCeph shouldn’t be shut down before Incus, though at the same time, a properly deployed Ceph with at least 3 monitors and enough OSDs to sustain a host failure should be able to handle this situation fine.
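
If you do want to enforce that ordering explicitly, one possible sketch (the unit names below are a guess; check the real ones with systemctl list-units | grep -E 'incus|microceph' first) is a systemd drop-in on the Incus unit:

# /etc/systemd/system/incus.service.d/order-after-microceph.conf
[Unit]
# Ordering only, no hard dependency: Incus starts after the MicroCeph
# services and, since stop order is the reverse of start order, it is
# stopped before them during shutdown.
After=snap.microceph.daemon.service snap.microceph.mon.service snap.microceph.osd.service

followed by sudo systemctl daemon-reload.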

There is an additional update:

  1. microceph populates the ceph.client.admin.keyring file only on the first node during cluster installation; all nodes contain a ceph.keyring file with the same content. The absence of ceph.client.admin.keyring causes Ceph connectivity issues during a single-node shutdown while Incus is running:

     libceph: connect (1) <vm-01 ip>:6789 error -101
     libceph: mon1 (1) <vm-01 ip>:6789 error -101
     libceph: osd3 (1) <vm-01 ip>:6803 error -101

     Copying the ceph.client.admin.keyring file to /etc/ceph on all nodes resolves this issue (see the sketch after this list).
  2. Actually, Incus unmounts all Ceph volumes and storage pools in parallel with the microceph daemons stopping. I’m not sure when the instances were stopped.
  3. Anyway, during a full cluster shutdown (all nodes powering off at the same time), at least 2 nodes get stuck on the Incus shutdown as it loses connectivity with the Ceph OSDs and monitors.
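
A sketch of distributing the keyring from the first node (assumes passwordless SSH and sudo between the lab VMs; adjust the source path to wherever your MicroCeph keeps the file):

# run on vm-01, the only node where MicroCeph created the admin keyring
for host in vm-02 vm-03; do
    sudo cat /var/snap/microceph/current/conf/ceph.client.admin.keyring | \
        ssh "$host" 'sudo tee /etc/ceph/ceph.client.admin.keyring > /dev/null && sudo chmod 600 /etc/ceph/ceph.client.admin.keyring'
done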

For a full cluster shutdown, I usually do incus cluster evacuate --action=stop NAME for each of the servers, so all instances get properly stopped ahead of time. Then hopefully there won’t be anything else needing ceph during the machine shutdown.

Then once the systems are back online and ceph is back to working order, you can use incus cluster restore to get all the instances back up and running.
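
For the three-node test cluster in this thread, that would look something like:

# before powering the cluster off:
incus cluster evacuate --action=stop vm-01
incus cluster evacuate --action=stop vm-02
incus cluster evacuate --action=stop vm-03

# once the nodes are back up and Ceph is healthy again:
incus cluster restore vm-01
incus cluster restore vm-02
incus cluster restore vm-03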

Thank you Stéphane for the suggestion.

I have briefly checked a similar approach by stopping all instances manually before a full cluster shutdown. Incus does not block the host shutdown in this case either.
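
Roughly (a sketch; the project names are placeholders for the three projects in my cluster, since incus stop --all operates per project):

for project in default infra services; do
    incus stop --all --project "$project"
done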

By the way, I have already reported the issue to the microceph project: