Unable to restart lxd daemon after degraded zfs raidz2 array

Hi
The zfs raidz2 array that holds the lxc storage pools became degraded. Since then I have not been able to restart the lxd daemon.

The containers are still online, but I can't log in to them.

Here are some logs:

command: journalctl -u snap.lxd.daemon -n 300

May 06 10:48:37 host01.example.com systemd[1]: Started Service for snap application lxd.daemon.
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: => Preparing the system (28463)
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Loading snap configuration
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Setting up mntns symlink (mnt:[4026533079])
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Setting up kmod wrapper
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Preparing /boot
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Preparing a clean copy of /run
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Preparing /run/bin
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Preparing a clean copy of /etc
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Preparing a clean copy of /usr/share/misc
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Setting up ceph configuration
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Setting up LVM configuration
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Setting up OVN configuration
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Rotating logs
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Unsupported ZFS version (0.8)
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Escaping the systemd cgroups
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ====> Detected cgroup V1
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Escaping the systemd process resource limits
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Enabling LXD UI
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Exposing LXD documentation
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: => Re-using existing LXCFS
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Reloading LXCFS
May 06 10:48:37 host01.example.com lxd.daemon[4017555]: ==> Cleaning up existing LXCFS namespace
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Closed liblxcfs.so
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Running destructor lxcfs_exit
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Running constructor lxcfs_init to reload liblxcfs
May 06 10:48:38 host01.example.com lxd.daemon[3912]: mount namespace: 5
May 06 10:48:38 host01.example.com lxd.daemon[3912]: hierarchies:
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   0: fd:   6:
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   1: fd:   7: name=systemd
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   2: fd:   8: cpuset
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   3: fd:   9: rdma
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   4: fd:  10: perf_event
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   5: fd:  11: cpu,cpuacct
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   6: fd:  12: memory
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   7: fd:  13: devices
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   8: fd:  14: net_cls,net_prio
May 06 10:48:38 host01.example.com lxd.daemon[3912]:   9: fd:  15: hugetlb
May 06 10:48:38 host01.example.com lxd.daemon[3912]:  10: fd:  16: blkio
May 06 10:48:38 host01.example.com lxd.daemon[3912]:  11: fd:  17: pids
May 06 10:48:38 host01.example.com lxd.daemon[3912]:  12: fd:  19: freezer
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Kernel supports pidfds
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Kernel does not support swap accounting
May 06 10:48:38 host01.example.com lxd.daemon[3912]: api_extensions:
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - cgroups
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - sys_cpu_online
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_cpuinfo
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_diskstats
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_loadavg
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_meminfo
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_stat
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_swaps
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_uptime
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - proc_slabinfo
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - shared_pidns
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - cpuview_daemon
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - loadavg_daemon
May 06 10:48:38 host01.example.com lxd.daemon[3912]: - pidfds
May 06 10:48:38 host01.example.com lxd.daemon[3912]: Reloaded LXCFS
May 06 10:48:38 host01.example.com lxd.daemon[4017555]: => Starting LXD
May 06 10:48:38 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:38-04:00" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
May 06 10:48:38 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:38-04:00" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
May 06 10:48:41 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:41-04:00" level=error msg="Failed loading storage pool" err="Required tool 'zpool' is missing" pool=default
May 06 10:48:41 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:41-04:00" level=error msg="Failed loading storage pool" err="Required tool 'zpool' is missing" pool=ssd
May 06 10:48:41 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:41-04:00" level=error msg="Failed to start the daemon" err="Failed applying patch \"storage_move_custom_iso_block_volumes_v2\": Failed loading pool \"default\":>
May 06 10:48:41 host01.example.com lxd.daemon[4018244]: time="2024-05-06T10:48:41-04:00" level=warning msg="Failed to advertise vsock address to instance agent" err="Failed sending VM sock address to lxd-agent: Failed to fetch https://cust>
May 06 10:48:41 host01.example.com lxd.daemon[4018244]: Error: Failed applying patch "storage_move_custom_iso_block_volumes_v2": Failed loading pool "default": Required tool 'zpool' is missing
May 06 10:48:42 host01.example.com lxd.daemon[4017555]: Killed
May 06 10:48:42 host01.example.com lxd.daemon[4017555]: => LXD failed to start
May 06 10:48:42 host01.example.com systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
May 06 10:48:42 host01.example.com systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
May 06 10:48:42 host01.example.com systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 121989.
May 06 10:48:42 host01.example.com systemd[1]: Stopped Service for snap application lxd.daemon.

I found the problem. It was not related to the degraded state of the array (I replaced the faulty disk anyway). It was related to the version of zfs, as described in this issue:
LXD 5.12: ZFS stopped working in lxd - Error: Required tool ‘zpool’ is missing when kernel ZFS module version < 0.8
I upgraded the OS from Ubuntu 20.04 to 22.04 and everything works fine now.
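
For anyone who lands here with the same symptoms: the two lines in the log above that pointed to the cause were "Unsupported ZFS version (0.8)" and "Required tool 'zpool' is missing". Below is a rough Python sketch for picking those out of the journalctl output. The function name `diagnose` and the two patterns are my own, based only on the messages in this thread, so other LXD versions may word them differently.

```python
import re

# Patterns copied from the messages in the log above; other LXD
# releases may phrase them differently (assumption).
UNSUPPORTED_ZFS = re.compile(r"Unsupported ZFS version \(([^)]+)\)")
MISSING_ZPOOL = re.compile(r"Required tool 'zpool' is missing")


def diagnose(journal_text):
    """Return (zfs_version, zpool_missing) found in LXD journal output.

    zfs_version is the version string the snap reported as unsupported,
    or None if no such line is present.
    """
    version = None
    zpool_missing = False
    for line in journal_text.splitlines():
        match = UNSUPPORTED_ZFS.search(line)
        if match:
            version = match.group(1)
        if MISSING_ZPOOL.search(line):
            zpool_missing = True
    return version, zpool_missing


# Two lines shortened from the log in this thread.
sample = (
    "lxd.daemon[4017555]: ==> Unsupported ZFS version (0.8)\n"
    "lxd.daemon[4018244]: Error: Failed loading pool \"default\": "
    "Required tool 'zpool' is missing\n"
)
print(diagnose(sample))  # ('0.8', True)
```

If both values come back set, the daemon is failing because of the ZFS version on the host and not because of the pool state, which is what turned out to be the case here.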
