LXD Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused

Ok, so the problem here is that it's not finding the my-lvm VG on this system.

What do the vgs and lvs commands show?

I don’t need lvm

root@lxd:~# vgs
  VG  #PV #LV #SN Attr   VSize     VFree
  vg1   1   1   0 wz--n- <1024.00g    0
root@lxd:~# lvs
  LV   VG  Attr       LSize     Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vol1 vg1 -wi-a----- <1024.00g
root@lxd:~#

The error suggests that you used to have a storage pool called my-lvm and that, rather than the storage pool being deleted cleanly using lxc storage delete my-lvm, the LVM volume group was removed without LXD knowing about it.

This will prevent LXD from starting up.

Probably the easiest thing to do is to manually create an empty volume group called my-lvm, let LXD start, and then delete the storage pool cleanly using lxc storage delete my-lvm.
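
A minimal sketch of that workaround, assuming a loopback file as the placeholder physical volume (the file path, size, and loop device are arbitrary choices, not from this thread):

truncate -s 1G /tmp/my-lvm.img
LOOP=$(losetup -f --show /tmp/my-lvm.img)   # attach the file to a free loop device
pvcreate "$LOOP"
vgcreate my-lvm "$LOOP"                     # recreate the missing VG name
sudo systemctl reload snap.lxd.daemon       # LXD should now start
lxc storage delete my-lvm                   # remove the pool record cleanly
losetup -d "$LOOP"                          # detach the placeholder afterwards

Note that from 4.19 onwards LXD also checks for the thin pool (see below), so an empty VG alone may not be enough for pools that used one.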

I’ve hit a similar problem recently (/snap/bin/lxc -v shows 4.19):

Error: Failed initializing storage pool "zvolssd": Thin pool not found "LXDThinPool" in volume group "zvolssd"

Checking through my notes, I used zfs to create a volume (a zvol, since -V was used), then used lxc storage create to build an LVM pool with that zvol as its source:

# apt-get install thin-provisioning-tools
# zfs create -V 50G ssd0/ssdstore/zvols
# lxc storage create zvolssd lvm source=/dev/ssd0/ssdstore/zvols

and I can check and see the volume:

# file /dev/ssd0/ssdstore/zvols 
/dev/ssd0/ssdstore/zvols: symbolic link to ../../zd0
# fdisk -l /dev/zd0
Disk /dev/zd0: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
# blkid /dev/zd0
/dev/zd0: UUID="RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1" TYPE="LVM2_member"
# file -s /dev/zd0
/dev/zd0: LVM2 PV (Linux Logical Volume Manager), UUID: RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1, size: 53687091200
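
I can also check directly for the volume group and thin pool named in the error (presumably this is what the 4.19 check looks for; the lvs call is expected to fail here, matching the "Thin pool not found" message):

# vgs zvolssd
# lvs zvolssd/LXDThinPool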

Ah, this will be because, since 4.19, we’ve started checking that LVM pools’ volume groups and thin pools exist and are active:

If you used to have an LVM pool but did not remove it via lxc storage delete <pool>, and instead just deleted the backing volume group or thin pool, then this will cause this issue going forward.

If you cannot temporarily restore the volume group and thin pool to allow LXD to start, then you’re going to need a /var/snap/lxd/common/lxd/database/patch.global.sql file to repair the database manually.

So let’s recreate the scenario:

lxd init --auto
lxc storage create lvm lvm
vgs
  VG  #PV #LV #SN Attr   VSize VFree
  lvm   1   1   0 wz--n- 4.65g    0 
vgremove lvm
Do you really want to remove volume group "lvm" containing 1 logical volumes? [y/n]: y
Do you really want to remove and DISCARD active logical volume lvm/LXDThinPool? [y/n]: y
  Logical volume "LXDThinPool" successfully removed
  Volume group "lvm" successfully removed
sudo systemctl reload snap.lxd.daemon
lxc ls
Error: Get "http://unix.socket/1.0": EOF
journalctl -b | grep lvm | grep Failed
Oct 08 08:02:56 v1 lxd.daemon[9851]: Error: Failed initializing storage pool "lvm": Volume group "lvm" not found
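
The same startup failure also shows up in snapd’s own log view, if you prefer that to journalctl (a hedged alternative, not part of the original repro):

snap logs lxd -n=50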

So now we need to run a database patch on LXD startup to remove the LVM pool record:

Create a file /var/snap/lxd/common/lxd/database/patch.global.sql:

DELETE FROM storage_pools WHERE name = '<pool>';
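
For the reproduction above the pool is named lvm, so the concrete patch file would contain:

DELETE FROM storage_pools WHERE name = 'lvm';

To inspect the record before deleting it, the global database can be queried with sqlite3 while LXD is down (the column list here is an assumption about the schema):

sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin "SELECT id, name, driver FROM storage_pools;"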

Then reload LXD:

sudo systemctl reload snap.lxd.daemon
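
Once the daemon is back up, a quick check that the patch took effect (the stale pool should no longer be listed):

lxc ls
lxc storage list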

Thanks for the quick reply; that’s exactly what I needed to do. I first took a dump of the global database (just in case anything went wrong) with:

sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin .dump > db.dump
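
Since the dump is plain SQL text, you can also sanity-check that it captured the pool records before patching (an illustrative grep, not part of the original steps):

grep storage_pools db.dump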

Then, after applying the patch.global.sql with the name of the errant pool, I was able to get LXD running again.


As an aside though, this sort of handling of a missing filesystem seems a bit too “fail-hard” rather than “fail-graceful”? I can understand not starting the various containers that depend on the missing/failed filesystem, but the entire container ecosystem failing to start seems rather drastic?

Indeed, we have an open issue for this; however, it is non-trivial, as @stgraber explains here:

OK noted, thanks again.

Also, I’d like to pen a note of thanks to all involved in LXD. I started using LXD about 2-3 years ago with v2/3.x, and v4 today certainly has much better functionality as well as management tools. So thanks again for all your efforts.


root@lxd:~# pvdisplay
root@lxd:~# vgdisplay
root@lxd:~# lvdisplay
root@lxd:~# lxc list
Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused

Thanks for the solution. I had created a storage pool for Docker but deleted it, as I was not using it, and was getting the same error.

But this fixed it.
