OK, so the problem here is that it’s not finding the my-lvm VG on this system. What do the vgs and lvs commands show?
I don’t need lvm
root@lxd:~# vgs
VG #PV #LV #SN Attr VSize VFree
vg1 1 1 0 wz--n- <1024.00g 0
root@lxd:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
vol1 vg1 -wi-a----- <1024.00g
root@lxd:~#
The error suggests you used to have a storage pool called my-lvm, and that rather than the storage pool being deleted cleanly using lxc storage delete my-lvm, the LVM volume group was removed without LXD knowing about it. This will prevent LXD from starting up.
Probably the easiest thing to do is to manually create an empty volume group called my-lvm, let LXD start, and then delete the storage pool cleanly using lxc storage delete my-lvm.
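A minimal sketch of that workaround, assuming no spare disk is available, so a small loop-backed PV hosts the temporary volume group (the image path and size are illustrative):
# Create a small throwaway PV so the expected VG can exist
truncate -s 1G /tmp/my-lvm-pv.img
LOOPDEV=$(losetup -f --show /tmp/my-lvm-pv.img)
pvcreate "$LOOPDEV"
vgcreate my-lvm "$LOOPDEV"    # name must match the missing pool's VG
# If LXD also complains about a missing thin pool, one named LXDThinPool may
# need creating too: lvcreate --type thin-pool -l 90%FREE -n LXDThinPool my-lvm
sudo systemctl reload snap.lxd.daemon    # LXD should now start
lxc storage delete my-lvm                # remove the pool record cleanly
# Clean up whatever the delete left behind (the VG may already be gone)
vgremove -f my-lvm
losetup -d "$LOOPDEV"
rm /tmp/my-lvm-pv.img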
I’ve hit a similar problem recently (/snap/bin/lxc -v shows 4.19):
Error: Failed initializing storage pool "zvolssd": Thin pool not found "LXDThinPool" in volume group "zvolssd"
Checking through my notes, I used zfs to create a volume, then used lxc storage create to create an LVM pool with that volume as its source:
# apt-get install thin-provisioning-tools
# zfs create -V 50G ssd0/ssdstore/zvols
# lxc storage create zvolssd lvm source=/dev/ssd0/ssdstore/zvols
and I can check and see the volume:
# file /dev/ssd0/ssdstore/zvols
/dev/ssd0/ssdstore/zvols: symbolic link to ../../zd0
# fdisk -l /dev/zd0
Disk /dev/zd0: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
# blkid /dev/zd0
/dev/zd0: UUID="RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1" TYPE="LVM2_member"
# file -s /dev/zd0
/dev/zd0: LVM2 PV (Linux Logical Volume Manager), UUID: RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1, size: 53687091200
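For this particular error, a quick check is whether the thin pool logical volume itself still exists, since that is what the 4.19 check validates (VG and LV names taken from the error message above):
vgs zvolssd                       # does the volume group still exist?
lvs -o lv_name,lv_attr zvolssd    # a thin pool LV shows "t" as the first attribute character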
Ah, this will be because since 4.19 we’ve started checking that LVM pools have their volume group and thin pool existing and active. If you used to have an LVM pool but did not remove it via lxc storage delete <pool>, and instead just deleted the backing volume group or thin pool, then this will cause this issue going forward.
If you cannot temporarily restore the volume group and thin pool to allow LXD to start, then you’re going to need to use a /var/snap/lxd/common/lxd/database/patch.global.sql file to repair the database manually.
So let’s recreate the scenario:
lxd init --auto
lxc storage create lvm lvm
vgs
VG #PV #LV #SN Attr VSize VFree
lvm 1 1 0 wz--n- 4.65g 0
vgremove lvm
Do you really want to remove volume group "lvm" containing 1 logical volumes? [y/n]: y
Do you really want to remove and DISCARD active logical volume lvm/LXDThinPool? [y/n]: y
Logical volume "LXDThinPool" successfully removed
Volume group "lvm" successfully removed
sudo systemctl reload snap.lxd.daemon
lxc ls
Error: Get "http://unix.socket/1.0": EOF
journalctl -b | grep lvm | grep Failed
Oct 08 08:02:56 v1 lxd.daemon[9851]: Error: Failed initializing storage pool "lvm": Volume group "lvm" not found
So now we need to run a database patch on LXD startup to remove the LVM pool record:
Create a file /var/snap/lxd/common/lxd/database/patch.global.sql containing:
DELETE FROM storage_pools WHERE name = '<pool>';
Then reload LXD:
sudo systemctl reload snap.lxd.daemon
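As a sketch, the whole repair sequence for the reproduced example above (pool name "lvm"), with a database dump first as a safety net:
# Optional safety net: dump the global database before touching it
sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin .dump > db.dump

# Write the patch; it is applied at the next LXD startup
echo "DELETE FROM storage_pools WHERE name = 'lvm';" | \
    sudo tee /var/snap/lxd/common/lxd/database/patch.global.sql

sudo systemctl reload snap.lxd.daemon

# Verify the daemon answers again and the pool record is gone
lxc ls
lxc storage list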
Thanks for the quick reply, that’s exactly what I needed to do. I took a dump of the global database (just in case anything went wrong) with:
sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin .dump > db.dump
Then, after applying the patch.global.sql with the name of the errant pool, I was able to get LXD running again.
As an aside though, this sort of handling of a missing filesystem seems to be a bit too “fail-hard” versus “fail-gracefully”? I can understand not starting the various containers that depend on the missing/failed filesystem, but the entire container ecosystem failing to start seems rather drastic?
Indeed, we have an issue open for this; however, it is non-trivial, as @stgraber explains here:
OK noted, thanks again.
Also, I’d like to pen a note of thanks to all involved in LXD. I started using LXD about 2-3 years ago with v2/3.x, and v4 today certainly has much better functionality as well as better management tools. So thanks again for all your efforts.
root@lxd:~# pvdisplay
root@lxd:~# vgdisplay
root@lxd:~# lvdisplay
root@lxd:~# lxc list
Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
Thanks for the solution. I had created a storage pool for Docker but deleted it as I was not using it, and was getting the same error. This fixed it.