LXD Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused

Oct 06 02:20:31 lxd snapd[2967]: storehelpers.go:551: cannot refresh: snap has no updates available: "core18", "core20", "lxd", "snapd"

looks suspicious.

Please can you show output of sudo snap info lxd and ps aux | grep lxd.

root@lxd:~# sudo snap info lxd and ps aux | grep lxd
name: lxd
store-url: Install lxd on Linux | Snap Store
contact: https://github.com/lxc/lxd/issues
Supported configuration options for the snap (snap set lxd
[default=lxd]
- daemon.preseed: Pass a YAML configuration to lxd init on initial
/var/snap/lxd/common/global-conf/ (config.yml and servercerts)

  • lxd.benchmark
  • lxd.buginfo
  • lxd.check-kernel
  • lxd.lxc
  • lxd.lxc-to-lxd
  • lxd
  • lxd.migrate
    lxd.activate: oneshot, enabled, inactive
    lxd.daemon: simple, enabled, inactive

No, they were separate commands (they were separately quoted):

sudo snap info lxd

and

ps aux | grep lx

root@lxd:~# sudo snap info lxd
name: lxd
summary: LXD - container and VM manager
publisher: Canonical✓
store-url: Install lxd on Linux | Snap Store
contact: https://github.com/lxc/lxd/issues
license: unset
description: |
LXD is a system container and virtual machine manager.

It offers a simple CLI and REST API to manage local or remote instances,
uses an image based workflow and support for a variety of advanced features.

Images are available for all Ubuntu releases and architectures as well
as for a wide number of other Linux distributions. Existing
integrations with many deployment and operation tools, makes it work
just like a public cloud, except everything is under your control.

LXD containers are lightweight, secure by default and a great
alternative to virtual machines when running Linux on Linux.

LXD virtual machines are modern and secure, using UEFI and secure-boot
by default and a great choice when a different kernel or operating
system is needed.

With clustering, up to 50 LXD servers can be easily joined and managed
together with the same tools and APIs and without needing any external
dependencies.

Supported configuration options for the snap (snap set lxd [=…]):

- ceph.builtin: Use snap-specific Ceph configuration [default=false]
- ceph.external: Use the system's ceph tools (ignores ceph.builtin) [default=false]
- criu.enable: Enable experimental live-migration support [default=false]
- daemon.debug: Increase logging to debug level [default=false]
- daemon.group: Set group of users that can interact with LXD [default=lxd]
- daemon.preseed: Pass a YAML configuration to `lxd init` on initial start
- daemon.syslog: Send LXD log events to syslog [default=false]
- lvm.external: Use the system's LVM tools [default=false]
- lxcfs.pidfd: Start per-container process tracking [default=false]
- lxcfs.loadavg: Start tracking per-container load average [default=false]
- lxcfs.cfs: Consider CPU shares for CPU usage [default=false]
- openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
- shiftfs.enable: Enable shiftfs support [default=auto]

For system-wide configuration of the CLI, place your configuration in
/var/snap/lxd/common/global-conf/ (config.yml and servercerts)
commands:

  • lxd.benchmark
  • lxd.buginfo
  • lxd.check-kernel
  • lxd.lxc
  • lxd.lxc-to-lxd
  • lxd
  • lxd.migrate
services:
  lxd.activate: oneshot, enabled, inactive
  lxd.daemon: simple, enabled, inactive
snap-id: J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking: latest/stable
refresh-date: yesterday at 06:35 BST
channels:
  latest/stable: 4.18 2021-09-13 (21497) 75MB -
  latest/candidate: 4.19 2021-10-05 (21654) 76MB -
  latest/beta: ↑
  latest/edge: git-737447f 2021-10-06 (21670) 76MB -
  4.19/stable: –
  4.19/candidate: 4.19 2021-10-05 (21654) 76MB -
  4.19/beta: ↑
  4.19/edge: ↑
  4.18/stable: 4.18 2021-09-13 (21497) 75MB -
  4.18/candidate: 4.18 2021-09-15 (21554) 75MB -
  4.18/beta: ↑
  4.18/edge: ↑
  4.0/stable: 4.0.7 2021-10-04 (21545) 70MB -
  4.0/candidate: 4.0.7 2021-10-04 (21545) 70MB -
  4.0/beta: ↑
  4.0/edge: git-2dccccd 2021-10-05 (21628) 70MB -
  3.0/stable: 3.0.4 2019-10-10 (11348) 55MB -
  3.0/candidate: 3.0.4 2019-10-10 (11348) 55MB -
  3.0/beta: ↑
  3.0/edge: git-81b81b9 2019-10-10 (11362) 55MB -
  2.0/stable: 2.0.12 2020-08-18 (16879) 38MB -
  2.0/candidate: 2.0.12 2021-03-22 (19859) 39MB -
  2.0/beta: ↑
  2.0/edge: git-82c7d62 2021-03-22 (19857) 39MB -
installed: 4.19 (21624) 76MB -

root@lxd:~# ps aux | grep lx
root 3812 1.1 0.0 3068436 12016 ? Sl Oct03 51:17 lxcfs /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
lxd 3984 0.0 0.0 7204 3584 ? Ss Oct03 0:01 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.84.149.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.84.149.2,10.84.149.254,1h --listen-address=fd42:18b9:a901:b901::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd
root 4738 0.0 0.0 2163956 19148 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers cNTLDNLSAPPP001
root 9643 0.0 0.0 2163444 19228 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers cNTLDNLSAPPU001
root 11640 0.0 0.0 2237688 18276 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers cNTLDNLSSQLP001
root 15754 0.0 0.0 2311420 19428 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers cNTLDNLSSQLU001
root 16172 0.0 0.0 3640 1152 ? Ss Oct03 0:00 snapfuse /var/lib/snapd/snaps/lxd_21029.snap /snap/lxd/21029 -o ro,nodev,allow_other,suid
root 16177 0.0 0.0 3508 164 ? Ss Oct03 0:00 snapfuse /var/lib/snapd/snaps/lxd_20326.snap /snap/lxd/20326 -o ro,nodev,allow_other,suid
1000000 21831 0.0 0.0 3508 164 ? Ss Oct03 0:00 snapfuse /var/lib/snapd/snaps/lxd_21029.snap /snap/lxd/21029 -o ro,nodev,allow_other,suid
1000000 21833 0.0 0.0 3708 1224 ? Ss Oct03 0:00 snapfuse /var/lib/snapd/snaps/lxd_21545.snap /snap/lxd/21545 -o ro,nodev,allow_other,suid
root 22354 0.0 0.0 2163956 18932 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers p-promt-cNTLDNSMSP02
1000000 23404 0.0 0.0 3508 164 ? Ss Oct03 0:00 snapfuse /var/lib/snapd/snaps/lxd_21029.snap /snap/lxd/21029 -o ro,nodev,allow_other,suid
1000000 23407 0.0 0.0 3956 1404 ? Ss Oct03 0:01 snapfuse /var/lib/snapd/snaps/lxd_21545.snap /snap/lxd/21545 -o ro,nodev,allow_other,suid
root 719003 0.0 0.0 2237432 18972 ? Ss Oct03 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers TA-cNTLDNSMSP03
1000000 719511 0.0 0.0 3508 1100 ? Ss Oct03 0:15 snapfuse /var/lib/snapd/snaps/lxd_21029.snap /snap/lxd/21029 -o ro,nodev,allow_other,suid
1000000 719526 0.0 0.0 3956 1532 ? Ss Oct03 0:15 snapfuse /var/lib/snapd/snaps/lxd_21545.snap /snap/lxd/21545 -o ro,nodev,allow_other,suid
root 1862671 0.0 0.0 6432 2540 pts/0 R+ 09:43 0:00 grep --color=auto lx

Hrm, I’m not sure.

Any ideas, @stgraber?

Those AppArmor denials look problematic:

 Oct 06 00:46:48 lxd audit[3043716]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-cNTLDNLSSQLP001_</var/snap/lxd/common/l>

I have another issue with another server

root@lxd1:~# lxc list
Error: Get "http://unix.socket/1.0": EOF

Can you follow the same diagnostic steps as above? The error you’ve posted here just tells us the same thing: that LXD isn’t running, but not why.

root@lxd1:~# sudo journalctl -r -n 200 -b -g lxd
-- Logs begin at Wed 2019-05-29 05:15:35 +06, end at Wed 2021-10-06 14:55:49 +06. --
Oct 06 14:55:49 lxd1 sudo[1808579]: root : TTY=pts/1 ; PWD=/root ; USER=root ; COMMAND=/bin/journalctl -r -n 200 -b -g lxd
Oct 06 14:55:48 lxd1 lxd.daemon[1808406]: => Starting LXD
Oct 06 14:55:48 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:48 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:48 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22930.
Oct 06 14:55:48 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:48 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:48 lxd1 lxd.daemon[1808064]: => LXD failed to start
Oct 06 14:55:46 lxd1 lxd.daemon[1808064]: => Starting LXD
Oct 06 14:55:45 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:45 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:45 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22929.
Oct 06 14:55:45 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:45 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:45 lxd1 lxd.daemon[1807763]: => LXD failed to start
Oct 06 14:55:43 lxd1 lxd.daemon[1807763]: => Starting LXD
Oct 06 14:55:42 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:42 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:42 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22928.
Oct 06 14:55:42 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:42 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:42 lxd1 lxd.daemon[1807552]: => LXD failed to start
Oct 06 14:55:40 lxd1 lxd.daemon[1807552]: => Starting LXD
Oct 06 14:55:39 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:39 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:39 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22927.
Oct 06 14:55:39 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:39 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:39 lxd1 lxd.daemon[1807235]: => LXD failed to start
Oct 06 14:55:37 lxd1 lxd.daemon[1807235]: => Starting LXD
Oct 06 14:55:36 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:36 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:36 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22926.
Oct 06 14:55:36 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:36 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:36 lxd1 lxd.daemon[1807031]: => LXD failed to start
Oct 06 14:55:34 lxd1 lxd.daemon[1807031]: => Starting LXD
Oct 06 14:55:33 lxd1 systemd[1]: Started Service for snap application lxd.daemon.
Oct 06 14:55:33 lxd1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 06 14:55:33 lxd1 systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 22925.
Oct 06 14:55:33 lxd1 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 06 14:55:33 lxd1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 14:55:33 lxd1 lxd.daemon[1806660]: => LXD failed to start
Oct 06 14:55:31 lxd1 lxd.daemon[1806660]: => Starting LXD
Oct 06 14:55:30 lxd1 systemd[1]: Started Service for snap application lxd.daemon.

root@lxd1:~# sudo snap info lxd
name: lxd
summary: LXD - container and VM manager
publisher: Canonical✓
store-url: Install lxd on Linux | Snap Store
contact: https://github.com/lxc/lxd/issues
license: unset
description: |
LXD is a system container and virtual machine manager.

It offers a simple CLI and REST API to manage local or remote instances,
uses an image based workflow and support for a variety of advanced features.

Images are available for all Ubuntu releases and architectures as well
as for a wide number of other Linux distributions. Existing
integrations with many deployment and operation tools, makes it work
just like a public cloud, except everything is under your control.

LXD containers are lightweight, secure by default and a great
alternative to virtual machines when running Linux on Linux.

LXD virtual machines are modern and secure, using UEFI and secure-boot
by default and a great choice when a different kernel or operating
system is needed.

With clustering, up to 50 LXD servers can be easily joined and managed
together with the same tools and APIs and without needing any external
dependencies.

Supported configuration options for the snap (snap set lxd [=…]):

- ceph.builtin: Use snap-specific Ceph configuration [default=false]
- ceph.external: Use the system's ceph tools (ignores ceph.builtin) [default=false]
- criu.enable: Enable experimental live-migration support [default=false]
- daemon.debug: Increase logging to debug level [default=false]
- daemon.group: Set group of users that can interact with LXD [default=lxd]
- daemon.preseed: Pass a YAML configuration to `lxd init` on initial start
- daemon.syslog: Send LXD log events to syslog [default=false]
- lvm.external: Use the system's LVM tools [default=false]
- lxcfs.pidfd: Start per-container process tracking [default=false]
- lxcfs.loadavg: Start tracking per-container load average [default=false]
- lxcfs.cfs: Consider CPU shares for CPU usage [default=false]
- openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
- shiftfs.enable: Enable shiftfs support [default=auto]

For system-wide configuration of the CLI, place your configuration in
/var/snap/lxd/common/global-conf/ (config.yml and servercerts)
commands:

  • lxd.benchmark
  • lxd.buginfo
  • lxd.check-kernel
  • lxd.lxc
  • lxd.lxc-to-lxd
  • lxd
  • lxd.migrate
services:
  lxd.activate: oneshot, enabled, inactive
  lxd.daemon: simple, enabled, active
snap-id: J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking: latest/stable
refresh-date: yesterday at 19:49 +06
channels:
  latest/stable: 4.18 2021-09-13 (21497) 75MB -
  latest/candidate: 4.19 2021-10-05 (21654) 76MB -
  latest/beta: ↑
  latest/edge: git-737447f 2021-10-06 (21670) 76MB -
  4.19/stable: –
  4.19/candidate: 4.19 2021-10-05 (21654) 76MB -
  4.19/beta: ↑
  4.19/edge: ↑
  4.18/stable: 4.18 2021-09-13 (21497) 75MB -
  4.18/candidate: 4.18 2021-09-15 (21554) 75MB -
  4.18/beta: ↑
  4.18/edge: ↑
  4.0/stable: 4.0.7 2021-10-04 (21545) 70MB -
  4.0/candidate: 4.0.7 2021-10-04 (21545) 70MB -
  4.0/beta: ↑
  4.0/edge: git-2dccccd 2021-10-05 (21628) 70MB -
  3.0/stable: 3.0.4 2019-10-10 (11348) 55MB -
  3.0/candidate: 3.0.4 2019-10-10 (11348) 55MB -
  3.0/beta: ↑
  3.0/edge: git-81b81b9 2019-10-10 (11362) 55MB -
  2.0/stable: 2.0.12 2020-08-18 (16879) 38MB -
  2.0/candidate: 2.0.12 2021-03-22 (19859) 39MB -
  2.0/beta: ↑
  2.0/edge: git-82c7d62 2021-03-22 (19857) 39MB -
installed: 4.19

root@lxd1:~# ps aux | grep lx
root 4245 0.0 0.0 298992 2484 ? Ssl Sep06 0:27 /usr/bin/lxcfs /var/lib/lxcfs
root 4885 0.0 0.0 743604 4788 ? Sl Sep06 17:34 lxcfs /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
root 1813685 10.0 0.0 2624 1760 ? Ss 14:56 0:00 /bin/sh /snap/lxd/21624/commands/daemon.start
root 1813823 42.0 0.0 2091196 85760 ? Sl 14:56 0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root 1813824 15.0 0.0 1794792 41092 ? Sl 14:56 0:00 lxd waitready
root 1813825 0.0 0.0 2624 128 ? S 14:56 0:00 /bin/sh /snap/lxd/21624/commands/daemon.start
root 1813865 0.0 0.0 6436 740 pts/1 S+ 14:56 0:00 grep --color=auto lx
root 2262451 0.0 0.0 1718528 17568 ? Ss Oct03 0:03 [lxc monitor] /var/snap/lxd/common/lxd/containers grafana-prod
root 2264438 0.0 0.0 1646460 17872 ? Ss Oct03 0:02 [lxc monitor] /var/snap/lxd/common/lxd/containers grafana-uat

Are your four servers in a cluster?

What does /var/snap/lxd/common/lxd/logs/lxd.log contain on the 2nd server?

No. The two servers are in the same location, and the down server is in the running server’s remote list. Here is the log from the running server:

root@lxd:/mnt/L-NFS/Image# tail -f /var/snap/lxd/common/lxd/logs/lxd.log
t=2021-10-06T13:13:42+0600 lvl=info msg="Done pruning expired instance backups"
t=2021-10-06T13:13:42+0600 lvl=info msg="Done updating images"
t=2021-10-06T14:13:42+0600 lvl=info msg="Pruning expired instance backups"
t=2021-10-06T14:13:42+0600 lvl=info msg="Updating images"
t=2021-10-06T14:13:42+0600 lvl=info msg="Done pruning expired instance backups"
t=2021-10-06T14:13:42+0600 lvl=info msg="Done updating images"
t=2021-10-06T15:13:42+0600 lvl=info msg="Updating images"
t=2021-10-06T15:13:42+0600 lvl=info msg="Pruning expired instance backups"
t=2021-10-06T15:13:42+0600 lvl=info msg="Done updating images"
t=2021-10-06T15:13:42+0600 lvl=info msg="Done pruning expired instance backups"

@tomp any idea? Could anyone help?

Sorry, it’s a bit of a mess going through all the partial information above.

On all your machines, please run:

  • snap refresh lxd
  • snap list lxd
  • lxc info
  • journalctl -u snap.lxd.daemon -n 300
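For anyone collecting the same information, here is a minimal sketch that runs the four commands above in order and labels each output so it can be pasted in one reply. The helper names lxd_diag_commands and collect_lxd_diag are made up for this example; only the four commands themselves come from the list above.

```shell
#!/bin/sh
# Print the four diagnostic commands requested above, one per line.
lxd_diag_commands() {
    cat <<'EOF'
snap refresh lxd
snap list lxd
lxc info
journalctl -u snap.lxd.daemon -n 300
EOF
}

# Run each command in turn, labelling its output and exit status.
# A failing command (for example "lxc info" while the daemon is down)
# does not stop the run.
# Optional first argument: a command prefix; pass "echo" for a dry run.
collect_lxd_diag() {
    prefix="${1:-}"
    lxd_diag_commands | while IFS= read -r cmd; do
        printf '===== %s =====\n' "$cmd"
        # Intentionally unquoted so the command line is split into words.
        # stdin is redirected so the command cannot swallow the list.
        $prefix $cmd 2>&1 </dev/null
        printf '(exit status: %s)\n\n' "$?"
    done
}
```

Running collect_lxd_diag echo only prints what would be executed; running collect_lxd_diag as root executes the commands for real, including the snap refresh.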

root@lxd:~# snap refresh lxd
snap "lxd" has no updates available
root@lxd:~# snap list lxd
Name Version Rev Tracking Publisher Notes
lxd 4.19 21624 latest/stable canonical✓ -
root@lxd:~# lxc info
Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
root@lxd:~# journalctl -u snap.lxd.daemon -n 300
-- Logs begin at Thu 2021-02-25 10:53:10 GMT, end at Thu 2021-10-07 04:35:03 BST. --
Oct 05 07:13:33 lxd lxd.daemon[3812]: Running destructor lxcfs_exit
Oct 05 07:13:33 lxd lxd.daemon[3812]: Running constructor lxcfs_init to reload liblxcfs
Oct 05 07:13:33 lxd lxd.daemon[3812]: mount namespace: 5
Oct 05 07:13:33 lxd lxd.daemon[3812]: hierarchies:
Oct 05 07:13:33 lxd lxd.daemon[3812]: 0: fd: 6:
Oct 05 07:13:33 lxd lxd.daemon[3812]: 1: fd: 7: name=systemd
Oct 05 07:13:33 lxd lxd.daemon[3812]: 2: fd: 8: hugetlb
Oct 05 07:13:33 lxd lxd.daemon[3812]: 3: fd: 9: net_cls,net_prio
Oct 05 07:13:33 lxd lxd.daemon[3812]: 4: fd: 10: devices
Oct 05 07:13:33 lxd lxd.daemon[3812]: 5: fd: 11: rdma
Oct 05 07:13:33 lxd lxd.daemon[3812]: 6: fd: 12: memory
Oct 05 07:13:33 lxd lxd.daemon[3812]: 7: fd: 13: pids
Oct 05 07:13:33 lxd lxd.daemon[3812]: 8: fd: 14: perf_event
Oct 05 07:13:33 lxd lxd.daemon[3812]: 9: fd: 15: cpu,cpuacct
Oct 05 07:13:33 lxd lxd.daemon[3812]: 10: fd: 16: blkio
Oct 05 07:13:33 lxd lxd.daemon[3812]: 11: fd: 17: freezer
Oct 05 07:13:33 lxd lxd.daemon[3812]: 12: fd: 19: cpuset
Oct 05 07:13:33 lxd lxd.daemon[3812]: Kernel supports pidfds
Oct 05 07:13:33 lxd lxd.daemon[3812]: Kernel does not support swap accounting
Oct 05 07:13:33 lxd lxd.daemon[3812]: api_extensions:
Oct 05 07:13:33 lxd lxd.daemon[3812]: - cgroups
Oct 05 07:13:33 lxd lxd.daemon[3812]: - sys_cpu_online
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_cpuinfo
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_diskstats
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_loadavg
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_meminfo
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_stat
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_swaps
Oct 05 07:13:33 lxd lxd.daemon[3812]: - proc_uptime
Oct 05 07:13:33 lxd lxd.daemon[3812]: - shared_pidns
Oct 05 07:13:33 lxd lxd.daemon[3812]: - cpuview_daemon
Oct 05 07:13:33 lxd lxd.daemon[3812]: - loadavg_daemon
Oct 05 07:13:33 lxd lxd.daemon[3812]: - pidfds
Oct 05 07:13:33 lxd lxd.daemon[3812]: Reloaded LXCFS
Oct 05 07:13:34 lxd lxd.daemon[518199]: t=2021-10-05T06:13:34+0000 lvl=eror msg="Failed to start the daemon" err="Failed initializing storage pool "my-lvm": V>
Oct 05 07:13:34 lxd lxd.daemon[518199]: Error: Failed initializing storage pool "my-lvm": Volume group my-lvm not found
Oct 05 07:13:34 lxd lxd.daemon[518039]: => LXD failed to start
Oct 05 07:13:34 lxd systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 05 07:13:34 lxd systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 05 07:13:34 lxd systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 856.
Oct 05 07:13:34 lxd systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 05 07:13:34 lxd systemd[1]: Started Service for snap application lxd.daemon.
Oct 05 07:13:34 lxd lxd.daemon[518569]: => Preparing the system (21624)
Oct 05 07:13:34 lxd lxd.daemon[518569]: ==> Loading snap configuration
Oct 05 07:13:34 lxd lxd.daemon[518569]: ==> Setting up mntns symlink (mnt:[4026533936])

OK, so the problem here is that it is not finding the my-lvm VG on this system.

What do the vgs and lvs commands show?
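As an aside, the relevant line is easy to miss in 300 lines of journal output. Here is a small sketch of a filter that pulls the failing pool name out of journal text. The function name failed_pools is made up for this example, and it assumes the "Error: Failed initializing storage pool" wording seen in the log above, with plain double quotes around the pool name.

```shell
#!/bin/sh
# Read journal text on stdin and print the name of every storage pool that
# LXD reported as failing to initialize, one per line, without duplicates.
failed_pools() {
    sed -n 's/.*Error: Failed initializing storage pool "\([^"]*\)".*/\1/p' |
        sort -u
}
```

It could be used as journalctl -u snap.lxd.daemon -n 300 --no-pager | failed_pools, and each name it prints can then be compared against the VG column of the vgs output.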

I don’t need lvm

root@lxd:~# vgs
VG #PV #LV #SN Attr VSize VFree
vg1 1 1 0 wz--n- <1024.00g 0
root@lxd:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
vol1 vg1 -wi-a----- <1024.00g
root@lxd:~#

The error suggests you used to have a storage pool called my-lvm and that, rather than the pool being deleted cleanly with lxc storage delete my-lvm, the LVM volume group was removed without LXD knowing about it.

This will prevent LXD from starting up.

Probably the easiest thing to do is to manually create an empty volume group called my-lvm, let LXD start, and then delete the storage pool cleanly using lxc storage delete my-lvm.
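A rough sketch of what that could look like, assuming there is no spare disk and a loop device backed by a sparse file is an acceptable stand-in for the lost volume group. The helper name, the image path and the 1G size are all hypothetical choices for illustration, and the function only prints the plan so it can be reviewed before anything is run as root.

```shell
#!/bin/sh
# Print (not run) the commands that would create a throwaway volume group
# named $1 on a loop device backed by the sparse file $2, let LXD start,
# and then delete the storage pool cleanly.
# The image path and size are hypothetical; only the VG name matters to LXD.
print_vg_recreate_plan() {
    vg="$1"
    img="${2:-/var/tmp/$vg.img}"
    cat <<EOF
truncate -s 1G "$img"
loopdev=\$(losetup --find --show "$img")
vgcreate "$vg" "\$loopdev"
systemctl reload snap.lxd.daemon
lxc storage delete "$vg"
EOF
}
```

For the pool in this thread that would be print_vg_recreate_plan my-lvm. If LXD then complains about a missing thin pool rather than a missing volume group, an empty volume group alone may not be enough. Afterwards, any leftover volume group, loop device and image file would need cleaning up by hand.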

I’ve hit a similar problem recently (/snap/bin/lxc -v shows 4.19):

Error: Failed initializing storage pool "zvolssd": Thin pool not found "LXDThinPool" in volume group "zvolssd"

Checking through my notes, I used zfs to create a filesystem, then used lxc storage create to create an lvm source on that filesystem:

# apt-get install thin-provisioning-tools
# zfs create -V 50G ssd0/ssdstore/zvols
# lxc storage create zvolssd lvm source=/dev/ssd0/ssdstore/zvols

and I can check and see the volume:

# file /dev/ssd0/ssdstore/zvols 
/dev/ssd0/ssdstore/zvols: symbolic link to ../../zd0
# fdisk -l /dev/zd0
Disk /dev/zd0: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
# blkid /dev/zd0
/dev/zd0: UUID="RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1" TYPE="LVM2_member"
# file -s /dev/zd0
/dev/zd0: LVM2 PV (Linux Logical Volume Manager), UUID: RNEf4Y-TdJn-ptkx-62tr-weWG-F6Oi-8OfSo1, size: 53687091200

Ah, this will be because, since 4.19, we’ve started checking that the volume group and thin pool of each LVM pool exist and are active.

If you used to have an LVM pool but did not remove it via lxc storage delete <pool>, and instead just deleted the backing volume group or thin pool, then this will cause the issue going forward.

If you cannot temporarily restore the volume group and thin pool to allow LXD to start, then you’re going to need to use a /var/snap/lxd/common/lxd/database/patch.global.sql file to repair the database manually.

So let’s recreate the scenario:

lxd init --auto
lxc storage create lvm lvm
vgs
  VG  #PV #LV #SN Attr   VSize VFree
  lvm   1   1   0 wz--n- 4.65g    0 
vgremove lvm
Do you really want to remove volume group "lvm" containing 1 logical volumes? [y/n]: y
Do you really want to remove and DISCARD active logical volume lvm/LXDThinPool? [y/n]: y
  Logical volume "LXDThinPool" successfully removed
  Volume group "lvm" successfully removed
sudo systemctl reload snap.lxd.daemon
lxc ls
Error: Get "http://unix.socket/1.0": EOF
journalctl -b | grep lvm | grep Failed
Oct 08 08:02:56 v1 lxd.daemon[9851]: Error: Failed initializing storage pool "lvm": Volume group "lvm" not found

So now we need to run a database patch on LXD startup to remove the LVM pool record:

Create a file /var/snap/lxd/common/lxd/database/patch.global.sql:

DELETE FROM storage_pools WHERE name = '<pool>';

Then reload LXD:

sudo systemctl reload snap.lxd.daemon
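To avoid typos in the pool name, the patch file can also be generated. This is only a sketch: the function names are made up for the example, the default directory is the snap path mentioned above, and the refusal to overwrite an existing patch file is a precaution of this sketch, not LXD behaviour.

```shell
#!/bin/sh
# Print a DELETE statement for the storage pool named $1.
# Single quotes in the name are doubled, which is how SQL escapes them
# inside a string literal.
pool_delete_sql() {
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "DELETE FROM storage_pools WHERE name = '%s';\n" "$escaped"
}

# Write the statement to patch.global.sql in the database directory $2
# (default: the snap path from the post above). Refuses to overwrite an
# existing patch file so that an unapplied patch is never lost.
write_pool_patch() {
    pool="$1"
    dir="${2:-/var/snap/lxd/common/lxd/database}"
    patch="$dir/patch.global.sql"
    if [ -e "$patch" ]; then
        echo "refusing to overwrite $patch" >&2
        return 1
    fi
    pool_delete_sql "$pool" > "$patch"
}
```

For example, write_pool_patch my-lvm (as root) followed by the reload above. It is worth checking the generated file with cat before reloading.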

Thanks for the quick reply; that’s exactly what I needed to do. I took a dump of the global database (just in case anything went wrong) with:

sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin .dump > db.dump

Then after applying the patch.global.sql with the name of the errant pool, I was able to get lxd running again.


As an aside though, this sort of handling of a missing filesystem seems to be a bit too “fail-hard” versus “fail-gracefully”? I can understand not starting the various containers that depend on the missing/failed filesystem, but the entire container ecosystem failing to start seems rather drastic?