LXD broken after setting custom backups storage

LXC 5.1 / LXD 5.1 / Snap on ubuntu 20.04

I had a problem importing a relatively large container because I kept running out of disk space on the root filesystem, like here.

So I tried to fix it by setting a custom backups storage volume like so:

lxc storage create backup_pool lvm size=100GiB
lxc storage volume create backup_pool backup_volume size=99GB
lxc config set storage.backups_volume backup_pool/backup_volume

Note that I’m using LVM storage pools on loopback files (auto-generated by LXD).
I previously had one called “defaultpool” and one called “thirdpool”; after the above I ended up with a third one called “backup_pool”, as expected.
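For anyone following along, the result can be checked with the stock lxc client (a sketch; the pool and volume names are the ones created above):

```shell
# List all storage pools; backup_pool should now appear
# alongside defaultpool and thirdpool.
lxc storage list

# List the custom volumes in the new pool.
lxc storage volume list backup_pool

# Show which volume LXD will use for staging backups/imports.
lxc config get storage.backups_volume
```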

After that I tried the import again, but it still didn’t succeed. This time it imported all the way to 100% and then failed with another error:
“Post “http://unix.socket/1.0/instances”: net/http: timeout awaiting response headers”

As I had spent a lot of time already I decided to resolve the issue another time.

When I started my system the next day I couldn’t connect to lxd anymore:

lxc list
Error: Get "http://unix.socket/1.0": EOF

I noticed that LXD keeps initializing the storage pools over and over, adding more and more loop devices for the respective images:

losetup|grep pool
/dev/loop57         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop47         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop75         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop37         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop65         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop55         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop45         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop73         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop35         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop63         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop53         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop43         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop71         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop61         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop51         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop41         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop68         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop58         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop48         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop38         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop66         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop56         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop46         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop74         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop36         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop64         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop54         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop44         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop72         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop62         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop52         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop42         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop70         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop60         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop50         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop40         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop69         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/thirdpool.img            1     512
/dev/loop59         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/defaultpool.img          1     512
/dev/loop49         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop39         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
/dev/loop67         0      0         0  0 /home/boother/var_snap_lxd_common_lxd_disks/backup_pool.img          1     512
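To quantify the duplication, the losetup output above can be summarised with a small helper (a sketch; the awk field number assumes the default `losetup -l` column order, where the backing file is the sixth column):

```shell
# Count how many loop devices currently back each pool image file.
# Field 6 of `losetup -l` is BACK-FILE in the default column layout.
count_loop_dups() {
    awk '$6 ~ /\.img$/ {n[$6]++} END {for (f in n) print n[f], f}'
}

# On a live system: losetup -l | count_loop_dups
```

Each image should normally be backed by exactly one loop device; any count above 1 confirms the repeated pool initialisation.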

I assume the root cause is this error about failing to activate the LVM logical volume during the initial startup:

May 13 00:21:25 boother-desktop systemd[2180]: Started Tracker metadata database store and lookup manager.
May 13 00:21:27 boother-desktop kernel: [  184.062419] znvpair: module license 'CDDL' taints kernel.
May 13 00:21:27 boother-desktop kernel: [  184.062421] Disabling lock debugging due to kernel taint
May 13 00:21:29 boother-desktop kernel: [  186.050049] ZFS: Loaded module v0.8.3-1ubuntu12.13, ZFS pool version 5000, ZFS filesystem version 5
May 13 00:21:29 boother-desktop systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
May 13 00:21:29 boother-desktop systemd[1]: Starting LVM event activation on device 7:35...
May 13 00:21:29 boother-desktop lvm[4142]:   pvscan[4142] PV /dev/loop35 online, VG defaultpool is complete.
May 13 00:21:29 boother-desktop lvm[4142]:   pvscan[4142] VG defaultpool run autoactivation.
May 13 00:21:30 boother-desktop systemd[1]: Starting LVM event activation on device 7:36...
May 13 00:21:30 boother-desktop lvm[4185]:   pvscan[4185] PV /dev/loop36 online, VG thirdpool is complete.
May 13 00:21:30 boother-desktop lvm[4185]:   pvscan[4185] VG thirdpool run autoactivation.
May 13 00:21:30 boother-desktop lvm[4185]:   PVID twqVmZ-l139-5akj-rXn3-Za23-v4Vh-cvbK5d read from /dev/loop36 last written to /dev/loop37.
May 13 00:21:30 boother-desktop lvm[4185]:   pvscan[4185] VG thirdpool not using quick activation.
May 13 00:21:30 boother-desktop systemd[1]: Started Device-mapper event daemon.
May 13 00:21:30 boother-desktop dmeventd[4230]: dmeventd ready for processing.
May 13 00:21:30 boother-desktop lvm[4230]: Monitoring thin pool thirdpool-LXDThinPool-tpool.
May 13 00:21:30 boother-desktop lvm[4185]:   1 logical volume(s) in volume group "thirdpool" now active
May 13 00:21:30 boother-desktop lvm[4230]: Monitoring thin pool defaultpool-LXDThinPool-tpool.
May 13 00:21:30 boother-desktop systemd[1]: Starting LVM event activation on device 7:37...
May 13 00:21:30 boother-desktop lvm[4142]:   1 logical volume(s) in volume group "defaultpool" now active
May 13 00:21:30 boother-desktop systemd[1]: Finished LVM event activation on device 7:36.
May 13 00:21:30 boother-desktop lvm[4301]:   pvscan[4301] PV /dev/loop37 online, VG backup_pool is complete.
May 13 00:21:30 boother-desktop lvm[4301]:   pvscan[4301] VG backup_pool run autoactivation.
May 13 00:21:30 boother-desktop systemd[1]: Finished LVM event activation on device 7:35.
May 13 00:21:30 boother-desktop lxd.daemon[4054]: time="2022-05-13T00:21:30+02:00" level=error msg="Failed to start the daemon" err="Failed to mount backups storage: Failed to mount storage volume \"backup_pool/backup_volume\": Failed to activate LVM logical volume \"/dev/backup_pool/custom_default_backup_volume\": Failed to run: lvchange --activate y --ignoreactivationskip /dev/backup_pool/custom_default_backup_volume: Activation of logical volume backup_pool/custom_default_backup_volume is prohibited while logical volume backup_pool/LXDThinPool_tmeta is active."
May 13 00:21:30 boother-desktop lxd.daemon[4054]: Error: Failed to mount backups storage: Failed to mount storage volume "backup_pool/backup_volume": Failed to activate LVM logical volume "/dev/backup_pool/custom_default_backup_volume": Failed to run: lvchange --activate y --ignoreactivationskip /dev/backup_pool/custom_default_backup_volume: Activation of logical volume backup_pool/custom_default_backup_volume is prohibited while logical volume backup_pool/LXDThinPool_tmeta is active.
May 13 00:21:30 boother-desktop systemd[1]: snap.lxd.lxd.1e44d4dc-6e0e-46a5-ad4f-42d1439d2216.scope: Succeeded.
May 13 00:21:31 boother-desktop lxd.daemon[3878]: => LXD failed to start
May 13 00:21:31 boother-desktop systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
May 13 00:21:31 boother-desktop systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
May 13 00:21:31 boother-desktop systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 1.
May 13 00:21:31 boother-desktop systemd[1]: Stopped Service for snap application lxd.daemon.

LXD keeps being restarted and re-initializing the pools each time, and then LVM complains about the duplicate devices.

I already tried deactivating the LVM volumes manually and then starting LXD again, but it didn’t help.
Any ideas what I could try to fix this?

OK so we need to sort out this issue:

The timeout was introduced in LXD 5.1’s lxc client to add some basic real-world network handling (connections do sometimes stall, and previously had the possibility of hanging indefinitely). However, it seems to have introduced some occasional issues, and so far no one has been able to identify what is causing the HTTP headers to take so long to be received. See https://github.com/lxc/lxd/issues/10377

But back to your primary problem.
I found this, which should help you disable the offending volumes:

I would suggest that first you stop LXD using:

systemctl stop snap.lxd.daemon.service snap.lxd.daemon.unix.socket

And check that there are no LXD processes running on your system.

Then, can you show the output of sudo lvs so we can see the current state of your logical volumes?
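For reference, the manual cleanup usually looks something like this (a sketch using the pool names from this thread; run it only with LXD fully stopped, and adjust the image path to your own disks directory):

```shell
# Stop LXD and its socket so nothing re-triggers pool initialisation.
sudo systemctl stop snap.lxd.daemon.service snap.lxd.daemon.unix.socket

# Deactivate all logical volumes in the affected volume groups.
sudo vgchange -an defaultpool thirdpool backup_pool

# Detach every loop device backing a given image (repeat per image file).
# `losetup -j FILE` lists the devices associated with that backing file.
for dev in $(losetup -j /var/snap/lxd/common/lxd/disks/backup_pool.img | cut -d: -f1); do
    sudo losetup -d "$dev"
done
```

After that, vgchange -ay followed by an LXD restart lets each pool come back up from a single loop device.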

I’ve effectively reverted the timeouts here:

I actually already did that.

That’s what I meant when I said

I already tried deactivating the LVM volumes manually and then starting LXD again, but it didn’t help.

I did however miss the last step:
vgchange -ay

I’ll try it again properly and get back to you.

It would also be good to get the system and LXD logs from the time of the original timeout response. Even though the request timed out, that can’t have caused the LVM metadata volume to be left in an unexpected state (LXD doesn’t manage that volume; it’s managed by the underlying LVM subsystem). So if it has become corrupted or left in an unexpected state, it suggests something else is happening on that system, which may also explain why it took so long to respond.

Perhaps disk space is a problem.


Thank you so much and sorry for wasting your time.
Turns out I’m just an idiot.

Almost two years ago, as I was running out of space on my rootfs, I moved my storage pools’ image files to another disk. I did so by symlinking to the new folder from /var/snap/lxd/common/lxd/disks.

Even though Stéphane highly discourages it (which I wasn’t aware of at the time), it has worked great ever since.

I don’t think I’ve changed anything about the storage pools since then, so when I created the new pool now, everything got mangled.

I just now removed the link and mounted the folder instead. Now everything is back to normal.
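For anyone hitting the same thing, replacing the symlink with a bind mount looks roughly like this (a sketch; /mnt/bigdisk/lxd-disks is a hypothetical path standing in for wherever the image files actually live):

```shell
# Remove the symlink and recreate the directory LXD expects.
sudo rm /var/snap/lxd/common/lxd/disks
sudo mkdir /var/snap/lxd/common/lxd/disks

# Bind-mount the real location over it.
sudo mount --bind /mnt/bigdisk/lxd-disks /var/snap/lxd/common/lxd/disks

# Make it persistent across reboots.
echo '/mnt/bigdisk/lxd-disks /var/snap/lxd/common/lxd/disks none bind 0 0' | sudo tee -a /etc/fstab
```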

Again, I’m sorry for wasting your time, and thank you so much for your great work and quick responses!
