Lots of lxd processes and LXD hangs

Today I found that my LXD cluster was not responding to the “lxc” command.

I ran “ps -aux | grep lxd” and the result looked like this:

root      3320  2.0  1.0 2007072 61528 ?       Sl   Apr16  92:15 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
lxd       3963  0.0  0.0  49984   400 ?        S    Apr16   0:00 dnsmasq --strict-order --bind-interfaces --pid-file=/var/snap/lxd/common/lxd/networks/vxlan0/dnsmasq.pid --except-interface=lo --interface=vxlan0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.101.31.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/vxlan0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/vxlan0/dnsmasq.hosts --dhcp-range 10.101.31.2,10.101.31.254,1h --listen-address=fd42:b280:d142:af7f::1 --enable-ra --dhcp-range ::,constructor:vxlan0,ra-stateless,ra-names -s lxd -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/vxlan0/dnsmasq.raw -u lxd
root     21352  0.0  0.3 255668 19080 ?        Sl   02:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     21562  0.0  0.3 252948 19260 ?        Sl   02:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     21694  0.0  0.3 255668 19468 ?        Sl   02:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     21825  0.0  0.3 187412 19200 ?        Sl   03:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     21974  0.0  0.3 188724 18832 ?        Sl   03:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22105  0.0  0.3 190132 19812 ?        Sl   03:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22234  0.0  0.3 255668 19388 ?        Sl   03:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22367  0.0  0.3 254260 19080 ?        Sl   03:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22516  0.0  0.3 254260 18992 ?        Sl   03:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22648  0.0  0.3 253204 19040 ?        Sl   04:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22778  0.0  0.3 255924 18928 ?        Sl   04:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     22912  0.0  0.3 255924 19724 ?        Sl   04:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23044  0.0  0.3 255668 20336 ?        Sl   04:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23191  0.0  0.3 190132 19896 ?        Sl   04:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23322  0.0  0.3 252948 18624 ?        Sl   04:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23454  0.0  0.3 190132 18964 ?        Sl   05:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23588  0.0  0.3 187412 19036 ?        Sl   05:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23721  0.0  0.3 190132 19196 ?        Sl   05:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     23869  0.0  0.3 255924 19044 ?        Sl   05:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24002  0.0  0.3 198328 18984 ?        Sl   05:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24133  0.0  0.3 255668 18940 ?        Sl   05:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24263  0.0  0.3 252948 18720 ?        Sl   06:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24393  0.0  0.3 252948 18732 ?        Sl   06:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24592  0.0  0.3 255668 19392 ?        Sl   06:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24779  0.0  0.3 254260 19600 ?        Sl   06:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     24909  0.0  0.3 187412 18952 ?        Sl   06:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25038  0.0  0.3 187668 18948 ?        Sl   06:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25187  0.0  0.3 190132 19692 ?        Sl   07:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25372  0.0  0.3 252948 18768 ?        Sl   07:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25508  0.0  0.3 254260 19196 ?        Sl   07:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25654  0.0  0.3 254260 18808 ?        Sl   07:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25787  0.0  0.3 255668 19404 ?        Sl   07:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     25918  0.0  0.3 188724 19180 ?        Sl   07:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26049  0.0  0.3 188980 19668 ?        Sl   08:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26182  0.0  0.3 255668 19984 ?        Sl   08:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26334  0.0  0.3 252948 19052 ?        Sl   08:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26464  0.0  0.3 252948 18608 ?        Sl   08:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26597  0.0  0.3 255668 19364 ?        Sl   08:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26728  0.0  0.3 187412 18560 ?        Sl   08:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     26875  0.0  0.3 190388 18756 ?        Sl   09:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27007  0.0  0.3 190132 19708 ?        Sl   09:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27139  0.0  0.3 187668 18716 ?        Sl   09:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27270  0.0  0.3 252948 18964 ?        Sl   09:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27420  0.0  0.3 188980 18860 ?        Sl   09:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27552  0.0  0.3 190132 18868 ?        Sl   09:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27681  0.0  0.3 187668 18884 ?        Sl   10:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27829  0.0  0.3 255668 19584 ?        Sl   10:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     27967  0.0  0.3 190132 18784 ?        Sl   10:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28097  0.0  0.3 255668 20280 ?        Sl   10:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28230  0.0  0.3 255668 19004 ?        Sl   10:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28379  0.0  0.3 252948 18760 ?        Sl   10:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28511  0.0  0.3 252948 19500 ?        Sl   11:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28662  0.0  0.3 190388 19392 ?        Sl   11:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28797  0.0  0.3 252948 18536 ?        Sl   11:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     28928  0.0  0.2 252948 18268 ?        Sl   11:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29057  0.0  0.3 187412 18896 ?        Sl   11:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29206  0.0  0.3 252948 18972 ?        Sl   11:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29339  0.0  0.3 190132 18848 ?        Sl   12:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29469  0.0  0.3 189076 18436 ?        Sl   12:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29606  0.0  0.3 255668 19100 ?        Sl   12:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29738  0.0  0.3 255668 19124 ?        Sl   12:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     29887  0.0  0.3 255668 19744 ?        Sl   12:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30017  0.0  0.3 253204 19308 ?        Sl   12:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30150  0.0  0.3 254260 18756 ?        Sl   13:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30298  0.0  0.3 252948 18636 ?        Sl   13:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30434  0.0  0.3 252948 19004 ?        Sl   13:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30565  0.0  0.3 252948 18816 ?        Sl   13:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30697  0.0  0.3 255924 18788 ?        Sl   13:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30846  0.0  0.3 187668 18580 ?        Sl   13:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     30975  0.0  0.3 263864 19476 ?        Sl   14:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31108  0.0  0.3 255668 19752 ?        Sl   14:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31259  0.0  0.3 252948 18552 ?        Sl   14:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31393  0.0  0.3 255924 19572 ?        Sl   14:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31540  0.0  0.3 255668 19428 ?        Sl   14:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31674  0.0  0.3 188724 19680 ?        Sl   14:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     31896  0.0  0.3 190132 20168 ?        Sl   15:00   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32028  0.0  0.3 252948 19044 ?        Sl   15:10   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32162  0.0  0.3 255668 19500 ?        Sl   15:20   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32294  0.0  0.3 255668 19624 ?        Sl   15:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32424  0.0  0.3 188724 19468 ?        Sl   15:40   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32496  0.0  0.0   4504  1724 ?        Ss   15:50   0:00 /bin/sh /snap/lxd/6729/commands/daemon.start
root     32572  0.0  0.3 190388 18756 ?        Sl   15:50   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32573  0.0  0.3 329656 18936 ?        Sl   15:50   0:00 lxd waitready --timeout=600
root     32658  0.0  0.0  58916  2996 ?        S    15:56   0:00 systemctl stop snap.lxd.daemon.service
root     32659  0.0  0.0   4504   848 ?        Ss   15:56   0:00 /bin/sh /snap/lxd/6729/commands/daemon.stop
user     32702  0.0  0.0  15428  2124 pts/0    S+   15:57   0:00 grep --color=auto lxd

I am using Ubuntu 17.10, with LXD installed from the snap.

Any idea why this is happening?

There was only one container and one network in the LXD cluster, both for testing.

Hello, can you paste the output of journalctl -u snap.lxd.daemon? Perhaps trim it to the last two or three days.

You might find an entry like “Error: LXD is already running” (at least that’s something I noticed once in my test lxd cluster deployment).

It feels like snapd or systemd somehow keeps spawning new processes, assuming that the old one has died.

As a workaround, running:

systemctl kill snap.lxd.daemon
systemctl stop snap.lxd.daemon
pkill lxd # or any other way to kill all those dangling lxd processes if any is left
systemctl start snap.lxd.daemon

might help.
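To check whether stale daemons are really piling up (and whether they are gone after the workaround), a rough sketch like the one below could count them. It assumes `ps aux`-style output where the command starts at field 11, as in the listing you pasted; the function name `count_lxd_daemons` is just something made up for illustration.

```shell
#!/bin/sh
# Count LXD daemon processes in `ps aux`-style output read from stdin.
# Assumption: the command starts at field 11 (USER PID %CPU %MEM VSZ RSS
# TTY STAT START TIME COMMAND), as in the listing earlier in this thread.
count_lxd_daemons() {
    # Match only "lxd --logfile ...", so "lxd waitready", dnsmasq and the
    # grep/awk helper processes are not counted.
    awk '$11 == "lxd" && $12 == "--logfile" { n++ } END { print n + 0 }'
}

# Demo on three lines copied from the listing in this thread; prints 2.
count_lxd_daemons <<'EOF'
root      3320  2.0  1.0 2007072 61528 ?       Sl   Apr16  92:15 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     21352  0.0  0.3 255668 19080 ?        Sl   02:30   0:00 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
root     32573  0.0  0.3 329656 18936 ?        Sl   15:50   0:00 lxd waitready --timeout=600
EOF
```

On the live node this would be run as `ps aux | count_lxd_daemons`. A healthy node would presumably report 1; anything higher suggests leftover daemons from earlier start attempts.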

It would be useful if you could also paste the content of

/var/snap/lxd/common/lxd/logs/lxd.log

possibly including some of the rotated files too (lxd.log.1, lxd.log.2.gz, etc.), going back a few days.
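If the full output is too long to paste, something along these lines might help narrow it down first. This is only a sketch: `--since` and `--no-pager` are standard journalctl options, but the helper name `count_matches` and the file name `lxd-journal.txt` are made up here, and the search string is simply the message mentioned above.

```shell
#!/bin/sh
# Count the lines on stdin that contain a fixed string (no regex matching).
count_matches() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that from
    # looking like a failure while the printed count stays "0".
    grep -cF -- "$1" || true
}

# Hypothetical usage on the affected node (needs access to the journal):
#   journalctl -u snap.lxd.daemon --since "3 days ago" --no-pager > lxd-journal.txt
#   count_matches "Error: LXD is already running" < lxd-journal.txt

# Self-contained demo on two sample lines built from messages quoted in
# this thread (not real journal output); prints 1.
printf '%s\n' \
    'lxd.daemon[1]: Error: LXD is already running' \
    'lxd.daemon[1]: => Starting LXD' | count_matches "Error: LXD is already running"
```

The same helper can be pointed at any other message that shows up repeatedly in the journal, which gives a quick idea of how often the daemon has been failing without pasting every cycle.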

The log is very long.

This is the output of “sudo journalctl -xe -u snap.lxd.daemon”:

-- Logs begin at Mon 2018-04-02 21:13:12 KST, end at Thu 2018-04-19 16:58:26 KST. --
Apr 19 12:02:00 node00 lxd.daemon[9030]: Error: LXD still not running after 600s timeout
Apr 19 12:02:00 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:02:00 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:02:00 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:02:00 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:02:00 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:02:00 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:02:00 node00 lxd.daemon[9124]: => Preparing the system
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Loading snap configuration
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Setting up mntns symlink
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Setting up kmod wrapper
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Preparing /boot
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Preparing a clean copy of /run
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Preparing a clean copy of /etc
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Setting up ceph configuration
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Setting up LVM configuration
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Rotating logs
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Setting up ZFS (0.6)
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Escaping the systemd cgroups
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Escaping the systemd process resource limits
Apr 19 12:02:00 node00 lxd.daemon[9124]: ==> Enabling unprivileged containers kernel support
Apr 19 12:02:00 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:02:00 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:02:00 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:02:00 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:02:00 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:02:00 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:02:00 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:02:00 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:02:00 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:02:00 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:02:00 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:02:00 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:02:00 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:02:00 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:02:00 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:02:00 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:02:00 node00 lxd.daemon[9124]: => Re-using existing LXCFS
Apr 19 12:02:00 node00 lxd.daemon[9124]: => Starting LXD
Apr 19 12:12:00 node00 lxd.daemon[9124]: Error: LXD still not running after 600s timeout
Apr 19 12:12:00 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:12:00 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:12:00 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:12:00 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:12:00 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:12:00 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:12:00 node00 lxd.daemon[9220]: => Preparing the system
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Loading snap configuration
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Setting up mntns symlink
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Setting up kmod wrapper
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Preparing /boot
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Preparing a clean copy of /run
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Preparing a clean copy of /etc
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Setting up ceph configuration
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Setting up LVM configuration
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Rotating logs
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Setting up ZFS (0.6)
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Escaping the systemd cgroups
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Escaping the systemd process resource limits
Apr 19 12:12:00 node00 lxd.daemon[9220]: ==> Enabling unprivileged containers kernel support
Apr 19 12:12:01 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:12:01 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:12:01 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:12:01 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:12:01 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:12:01 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:12:01 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:12:01 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:12:01 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:12:01 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:12:01 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:12:01 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:12:01 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:12:01 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:12:01 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:12:01 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:12:01 node00 lxd.daemon[9220]: => Re-using existing LXCFS
Apr 19 12:12:01 node00 lxd.daemon[9220]: => Starting LXD
Apr 19 12:22:01 node00 lxd.daemon[9220]: Error: LXD still not running after 600s timeout
Apr 19 12:22:01 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:22:01 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:22:01 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:22:01 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:22:01 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:22:01 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:22:01 node00 lxd.daemon[9320]: => Preparing the system
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Loading snap configuration
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Setting up mntns symlink
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Setting up kmod wrapper
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Preparing /boot
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Preparing a clean copy of /run
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Preparing a clean copy of /etc
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Setting up ceph configuration
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Setting up LVM configuration
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Rotating logs
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Setting up ZFS (0.6)
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Escaping the systemd cgroups
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Escaping the systemd process resource limits
Apr 19 12:22:01 node00 lxd.daemon[9320]: ==> Enabling unprivileged containers kernel support
Apr 19 12:22:01 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:22:01 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:22:01 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:22:01 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:22:01 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:22:01 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:22:01 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:22:01 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:22:01 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:22:01 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:22:01 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:22:01 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:22:01 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:22:01 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:22:01 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:22:01 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:22:01 node00 lxd.daemon[9320]: => Re-using existing LXCFS
Apr 19 12:22:01 node00 lxd.daemon[9320]: => Starting LXD
Apr 19 12:32:01 node00 lxd.daemon[9320]: Error: LXD still not running after 600s timeout
Apr 19 12:32:01 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:32:01 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:32:01 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:32:01 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:32:01 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:32:01 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:32:01 node00 lxd.daemon[9415]: => Preparing the system
Apr 19 12:32:01 node00 lxd.daemon[9415]: ==> Loading snap configuration
Apr 19 12:32:01 node00 lxd.daemon[9415]: ==> Setting up mntns symlink
Apr 19 12:32:01 node00 lxd.daemon[9415]: ==> Setting up kmod wrapper
Apr 19 12:32:01 node00 lxd.daemon[9415]: ==> Preparing /boot
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Preparing a clean copy of /run
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Preparing a clean copy of /etc
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Setting up ceph configuration
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Setting up LVM configuration
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Rotating logs
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Setting up ZFS (0.6)
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Escaping the systemd cgroups
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Escaping the systemd process resource limits
Apr 19 12:32:02 node00 lxd.daemon[9415]: ==> Enabling unprivileged containers kernel support
Apr 19 12:32:02 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:32:02 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:32:02 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:32:02 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:32:02 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:32:02 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:32:02 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:32:02 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:32:02 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:32:02 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:32:02 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:32:02 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:32:02 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:32:02 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:32:02 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:32:02 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:32:02 node00 lxd.daemon[9415]: => Re-using existing LXCFS
Apr 19 12:32:02 node00 lxd.daemon[9415]: => Starting LXD
Apr 19 12:42:02 node00 lxd.daemon[9415]: Error: LXD still not running after 600s timeout
Apr 19 12:42:02 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:42:02 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:42:02 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:42:02 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:42:02 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:42:02 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:42:02 node00 lxd.daemon[9511]: => Preparing the system
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Loading snap configuration
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Setting up mntns symlink
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Setting up kmod wrapper
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Preparing /boot
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Preparing a clean copy of /run
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Preparing a clean copy of /etc
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Setting up ceph configuration
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Setting up LVM configuration
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Rotating logs
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Setting up ZFS (0.6)
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Escaping the systemd cgroups
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Escaping the systemd process resource limits
Apr 19 12:42:02 node00 lxd.daemon[9511]: ==> Enabling unprivileged containers kernel support
Apr 19 12:42:02 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:42:02 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:42:02 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:42:02 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:42:02 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:42:02 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:42:02 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:42:02 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:42:02 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:42:02 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:42:02 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:42:02 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:42:02 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:42:02 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:42:02 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:42:02 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:42:02 node00 lxd.daemon[9511]: => Re-using existing LXCFS
Apr 19 12:42:02 node00 lxd.daemon[9511]: => Starting LXD
Apr 19 12:52:02 node00 lxd.daemon[9511]: Error: LXD still not running after 600s timeout
Apr 19 12:52:02 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 12:52:02 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 12:52:02 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 12:52:03 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 12:52:03 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 12:52:03 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 12:52:03 node00 lxd.daemon[9606]: => Preparing the system
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Loading snap configuration
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Setting up mntns symlink
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Setting up kmod wrapper
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Preparing /boot
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Preparing a clean copy of /run
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Preparing a clean copy of /etc
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Setting up ceph configuration
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Setting up LVM configuration
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Rotating logs
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Setting up ZFS (0.6)
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Escaping the systemd cgroups
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Escaping the systemd process resource limits
Apr 19 12:52:03 node00 lxd.daemon[9606]: ==> Enabling unprivileged containers kernel support
Apr 19 12:52:03 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 12:52:03 node00 lxd.daemon[5169]: hierarchies:
Apr 19 12:52:03 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 12:52:03 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 12:52:03 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 12:52:03 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 12:52:03 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 12:52:03 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 12:52:03 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 12:52:03 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 12:52:03 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 12:52:03 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 12:52:03 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 12:52:03 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 12:52:03 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 12:52:03 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 12:52:03 node00 lxd.daemon[9606]: => Re-using existing LXCFS
Apr 19 12:52:03 node00 lxd.daemon[9606]: => Starting LXD
Apr 19 13:02:03 node00 lxd.daemon[9606]: Error: LXD still not running after 600s timeout
Apr 19 13:02:03 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 13:02:03 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 13:02:03 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 13:02:03 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 13:02:03 node00 systemd[1]: Stopped Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished shutting down.
Apr 19 13:02:03 node00 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: Unit snap.lxd.daemon.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit snap.lxd.daemon.service has finished starting up.
-- 
-- The start-up result is done.
Apr 19 13:02:03 node00 lxd.daemon[9699]: => Preparing the system
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Loading snap configuration
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Setting up mntns symlink
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Setting up kmod wrapper
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Preparing /boot
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Preparing a clean copy of /run
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Preparing a clean copy of /etc
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Setting up ceph configuration
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Setting up LVM configuration
Apr 19 13:02:03 node00 lxd.daemon[9699]: ==> Rotating logs
Apr 19 13:02:04 node00 lxd.daemon[9699]: ==> Setting up ZFS (0.6)
Apr 19 13:02:04 node00 lxd.daemon[9699]: ==> Escaping the systemd cgroups
Apr 19 13:02:04 node00 lxd.daemon[9699]: ==> Escaping the systemd process resource limits
Apr 19 13:02:04 node00 lxd.daemon[9699]: ==> Enabling unprivileged containers kernel support
Apr 19 13:02:04 node00 lxd.daemon[5169]: mount namespace: 7
Apr 19 13:02:04 node00 lxd.daemon[5169]: hierarchies:
Apr 19 13:02:04 node00 lxd.daemon[5169]:   0: fd:   8: hugetlb
Apr 19 13:02:04 node00 lxd.daemon[5169]:   1: fd:   9: rdma
Apr 19 13:02:04 node00 lxd.daemon[5169]:   2: fd:  10: pids
Apr 19 13:02:04 node00 lxd.daemon[5169]:   3: fd:  11: freezer
Apr 19 13:02:04 node00 lxd.daemon[5169]:   4: fd:  12: net_cls,net_prio
Apr 19 13:02:04 node00 lxd.daemon[5169]:   5: fd:  13: devices
Apr 19 13:02:04 node00 lxd.daemon[5169]:   6: fd:  14: cpuset
Apr 19 13:02:04 node00 lxd.daemon[5169]:   7: fd:  15: cpu,cpuacct
Apr 19 13:02:04 node00 lxd.daemon[5169]:   8: fd:  16: memory
Apr 19 13:02:04 node00 lxd.daemon[5169]:   9: fd:  17: blkio
Apr 19 13:02:04 node00 lxd.daemon[5169]:  10: fd:  18: perf_event
Apr 19 13:02:04 node00 lxd.daemon[5169]:  11: fd:  19: name=systemd
Apr 19 13:02:04 node00 lxd.daemon[5169]:  12: fd:  20: unified
Apr 19 13:02:04 node00 lxd.daemon[5169]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Apr 19 13:02:04 node00 lxd.daemon[9699]: => Re-using existing LXCFS
Apr 19 13:02:04 node00 lxd.daemon[9699]: => Starting LXD
Apr 19 13:12:04 node00 lxd.daemon[9699]: Error: LXD still not running after 600s timeout
Apr 19 13:12:04 node00 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 13:12:04 node00 systemd[1]: snap.lxd.daemon.service: Unit entered failed state.
Apr 19 13:12:04 node00 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Apr 19 13:12:04 node00 systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Apr 19 13:12:04 node00 systemd[1]: Stopped Service for snap application lxd.daemon.

...

The same sequence repeats over and over.

I no longer have the logs under /var/snap/lxd because I’m reinstalling LXD.
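In the journal above, each start attempt ends with "LXD still not running after 600s timeout" and systemd restarts the unit right away, so a new attempt begins roughly every ten minutes. A minimal sketch for pulling those timeouts out of saved journal text is below. The function names are made up for illustration, and the year is an assumption, since the journal lines shown here do not include one.

```python
import re
from datetime import datetime

# Matches journal lines such as:
# "Apr 19 13:02:03 node00 lxd.daemon[9606]: Error: LXD still not running after 600s timeout"
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ lxd\.daemon\[\d+\]: "
    r"Error: LXD still not running after (?P<secs>\d+)s timeout"
)


def find_timeouts(lines, year=2018):
    """Return a datetime for every start-up timeout line found in `lines`.

    `year` is an assumption: the journal lines carry no year of their own.
    """
    stamps = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            stamps.append(
                datetime.strptime(f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S")
            )
    return stamps


def gaps_in_seconds(stamps):
    """Return the number of seconds between consecutive timeouts."""
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Feeding it the saved journal of snap.lxd.daemon gives one timestamp per failed start; gaps of about 600 seconds mean the daemon is stuck in the same start-up step on every attempt rather than crashing at random.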

What happens if you do:

systemctl stop snap.lxd.daemon
lxd --debug --group lxd

This is the output

user@node01:~$ sudo systemctl stop snap.lxd.daemon
user@node01:~$ ps -aux | grep lxd
user     10664  0.0  0.0  15428  2068 pts/0    S+   18:11   0:00 grep --color=auto lxd
user@node01:~$ sudo lxd --debug --group lxd
INFO[04-20|09:11:20] LXD 3.0.0 is starting in normal mode     path=/var/snap/lxd/common/lxd
INFO[04-20|09:11:20] Kernel uid/gid map: 
INFO[04-20|09:11:20]  - u 0 0 4294967295 
INFO[04-20|09:11:20]  - g 0 0 4294967295 
INFO[04-20|09:11:20] Configured LXD uid/gid map: 
INFO[04-20|09:11:20]  - u 0 1000000 1000000000 
INFO[04-20|09:11:20]  - g 0 1000000 1000000000 
WARN[04-20|09:11:20] CGroup memory swap accounting is disabled, swap limits will be ignored. 
INFO[04-20|09:11:20] Initializing database gateway 
INFO[04-20|09:11:20] Start database node                      address=192.168.0.11:8443 id=2
INFO[04-20|09:11:20] Raft: Restored from snapshot 2-9962-1524125537555 
INFO[04-20|09:11:30] Raft: Initial configuration (index=599): [{Suffrage:Voter ID:1 Address:0} {Suffrage:Voter ID:2 Address:192.168.0.11:8443} {Suffrage:Voter ID:3 Address:192.168.0.12:8443}] 
INFO[04-20|09:11:30] Raft: Node at 192.168.0.11:8443 [Follower] entering Follower state (Leader: "") 
INFO[04-20|09:11:30] LXD isn't socket activated 
INFO[04-20|09:11:30] Starting /dev/lxd handler: 
INFO[04-20|09:11:30]  - binding devlxd socket                 socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[04-20|09:11:30] REST API daemon: 
INFO[04-20|09:11:30]  - binding Unix socket                   socket=/var/snap/lxd/common/lxd/unix.socket
INFO[04-20|09:11:30]  - binding TCP socket                    socket=192.168.0.11:8443
DBUG[04-20|09:11:30] Notify node 192.168.0.1:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.12:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.13:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.14:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.15:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.16:8443 of state changes 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Notify node 192.168.0.17:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.19:8443 of state changes 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Notify node 192.168.0.20:8443 of state changes 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Notify node 192.168.0.21:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.22:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.23:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.24:8443 of state changes 
DBUG[04-20|09:11:30] Notify node 192.168.0.25:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.26:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.27:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.28:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.29:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.30:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.31:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.32:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.33:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.34:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.35:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.36:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.37:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.38:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.39:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.40:8443 of state changes 
DBUG[04-20|09:11:31] Notify node 192.168.0.41:8443 of state changes 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:30] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Connecting to a remote LXD over HTTPs 
DBUG[04-20|09:11:31] Could not notify all nodes of database upgrade: failed to notify peer 192.168.0.13:8443: database upgrade notification failed: 404 Not Found 
DBUG[04-20|09:11:31] Initializing and checking storage pool "local". 
DBUG[04-20|09:11:31] Initializing a ZFS driver. 
DBUG[04-20|09:11:31] Checking ZFS storage pool "local". 
EROR[04-20|09:11:31] Failed to start the daemon: no "source" property found for the storage pool 
INFO[04-20|09:11:31] Starting shutdown sequence 
INFO[04-20|09:11:31] Stopping REST API handler: 
INFO[04-20|09:11:31]  - closing socket                        socket=192.168.0.11:8443
INFO[04-20|09:11:31]  - closing socket                        socket=/var/snap/lxd/common/lxd/unix.socket
INFO[04-20|09:11:31] Stopping /dev/lxd handler 
INFO[04-20|09:11:31]  - closing socket                        socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[04-20|09:11:31] Closing the database 
INFO[04-20|09:11:31] Stop database gateway 
INFO[04-20|09:11:31] Stop raft instance 
INFO[04-20|09:11:31] Stopping REST API handler: 
INFO[04-20|09:11:31] Stopping /dev/lxd handler 
INFO[04-20|09:11:31] Stopping REST API handler: 
INFO[04-20|09:11:31] Stopping /dev/lxd handler 
INFO[04-20|09:11:31] Unmounting temporary filesystems 
INFO[04-20|09:11:31] Done unmounting temporary filesystems 
INFO[04-20|09:11:31] Saving simplestreams cache 
INFO[04-20|09:11:31] Saved simplestreams cache 
Error: no "source" property found for the storage pool

This is strange, because every node has the same partition layout,

and I set the storage “source” to /dev/sda6.

I also checked “zpool list”, and it shows the “local” pool without problems.

After running those two commands, this node is now disconnected from the cluster.

On another running cluster node, can you run `lxd sql "SELECT * FROM storage_pools_config;"`?

@freeekanayaka

This is the output

user@node02:~$ lxd sql "SELECT * FROM storage_pools_config;"
+------------+-----------------+------------+-------------------------+-----------+
| id         | storage_pool_id | node_id    | key                     | value     |
+------------+-----------------+------------+-------------------------+-----------+
| 3          | 1               | 1          | volatile.initial_source | /dev/sda6 |
| 4          | 1               | 1          | zfs.pool_name           | local     |
| 5          | 1               | 1          | source                  | local     |
| 6          | 1               | 4          | source                  | local     |
| 7          | 1               | 4          | volatile.initial_source | /dev/sda6 |
| 8          | 1               | 4          | zfs.pool_name           | local     |
| 9          | 1               | 5          | source                  | local     |
| 10         | 1               | 5          | volatile.initial_source | /dev/sda6 |
| 11         | 1               | 5          | zfs.pool_name           | local     |
| 12         | 1               | 6          | zfs.pool_name           | local     |
| 13         | 1               | 6          | source                  | local     |
| 14         | 1               | 6          | volatile.initial_source | /dev/sda6 |
| 15         | 1               | 7          | source                  | local     |
| 16         | 1               | 7          | volatile.initial_source | /dev/sda6 |
| 17         | 1               | 7          | zfs.pool_name           | local     |
| 18         | 1               | 8          | source                  | local     |
| 19         | 1               | 8          | volatile.initial_source | /dev/sda6 |
| 20         | 1               | 8          | zfs.pool_name           | local     |
| 21         | 1               | 9          | source                  | local     |
| 22         | 1               | 9          | volatile.initial_source | /dev/sda6 |
| 23         | 1               | 9          | zfs.pool_name           | local     |
| 24         | 1               | 10         | volatile.initial_source | /dev/sda6 |
| 25         | 1               | 10         | zfs.pool_name           | local     |
| 26         | 1               | 10         | source                  | local     |
| 27         | 1               | 11         | zfs.pool_name           | local     |
| 28         | 1               | 11         | source                  | local     |
| 29         | 1               | 11         | volatile.initial_source | /dev/sda6 |
| 30         | 1               | 12         | zfs.pool_name           | local     |
| 31         | 1               | 12         | source                  | local     |
| 32         | 1               | 12         | volatile.initial_source | /dev/sda6 |
| 33         | 1               | 13         | source                  | local     |
| 34         | 1               | 13         | volatile.initial_source | /dev/sda6 |
| 35         | 1               | 13         | zfs.pool_name           | local     |
| 36         | 1               | 14         | source                  | local     |
| 37         | 1               | 14         | volatile.initial_source | /dev/sda6 |
| 38         | 1               | 14         | zfs.pool_name           | local     |
| 39         | 1               | 15         | source                  | local     |
| 40         | 1               | 15         | volatile.initial_source | /dev/sda6 |
| 41         | 1               | 15         | zfs.pool_name           | local     |
| 42         | 1               | 16         | source                  | local     |
| 43         | 1               | 16         | volatile.initial_source | /dev/sda6 |
| 44         | 1               | 16         | zfs.pool_name           | local     |
| 45         | 1               | 17         | source                  | local     |
| 46         | 1               | 17         | volatile.initial_source | /dev/sda6 |
| 47         | 1               | 17         | zfs.pool_name           | local     |
| 48         | 1               | 18         | source                  | local     |
| 49         | 1               | 18         | volatile.initial_source | /dev/sda6 |
| 50         | 1               | 18         | zfs.pool_name           | local     |
| 51         | 1               | 19         | source                  | local     |
| 52         | 1               | 19         | volatile.initial_source | /dev/sda6 |
| 53         | 1               | 19         | zfs.pool_name           | local     |
| 54         | 1               | 20         | source                  | local     |
| 55         | 1               | 20         | volatile.initial_source | /dev/sda6 |
| 56         | 1               | 20         | zfs.pool_name           | local     |
| 57         | 1               | 21         | source                  | local     |
| 58         | 1               | 21         | volatile.initial_source | /dev/sda6 |
| 59         | 1               | 21         | zfs.pool_name           | local     |
| 60         | 1               | 22         | source                  | local     |
| 61         | 1               | 22         | volatile.initial_source | /dev/sda6 |
| 62         | 1               | 22         | zfs.pool_name           | local     |
| 63         | 1               | 23         | source                  | local     |
| 64         | 1               | 23         | volatile.initial_source | /dev/sda6 |
| 65         | 1               | 23         | zfs.pool_name           | local     |
| 66         | 1               | 24         | source                  | local     |
| 67         | 1               | 24         | volatile.initial_source | /dev/sda6 |
| 68         | 1               | 24         | zfs.pool_name           | local     |
| 69         | 1               | 25         | source                  | local     |
| 70         | 1               | 25         | volatile.initial_source | /dev/sda6 |
| 71         | 1               | 25         | zfs.pool_name           | local     |
| 72         | 1               | 26         | source                  | local     |
| 73         | 1               | 26         | volatile.initial_source | /dev/sda6 |
| 74         | 1               | 26         | zfs.pool_name           | local     |
| 75         | 1               | 27         | source                  | local     |
| 76         | 1               | 27         | volatile.initial_source | /dev/sda6 |
| 77         | 1               | 27         | zfs.pool_name           | local     |
| 78         | 1               | 28         | source                  | local     |
| 79         | 1               | 28         | volatile.initial_source | /dev/sda6 |
| 80         | 1               | 28         | zfs.pool_name           | local     |
| 81         | 1               | 29         | source                  | local     |
| 82         | 1               | 29         | volatile.initial_source | /dev/sda6 |
| 83         | 1               | 29         | zfs.pool_name           | local     |
| 84         | 1               | 30         | source                  | local     |
| 85         | 1               | 30         | volatile.initial_source | /dev/sda6 |
| 86         | 1               | 30         | zfs.pool_name           | local     |
| 87         | 1               | 31         | source                  | local     |
| 88         | 1               | 31         | volatile.initial_source | /dev/sda6 |
| 89         | 1               | 31         | zfs.pool_name           | local     |
| 90         | 1               | 32         | source                  | local     |
| 91         | 1               | 32         | volatile.initial_source | /dev/sda6 |
| 92         | 1               | 32         | zfs.pool_name           | local     |
+------------+-----------------+------------+-------------------------+-----------+

I ran the command from “node02”, which is another node.
This cluster has 32 nodes in total (node00 ~ node31), but node08 is broken (so it is not part of the LXD cluster now).

Can you please also run `lxd sql "SELECT * FROM nodes;"`?

From node02 or whatever other node is online and healthy.

And “lxc cluster list” as well.

So from what I understand the current situation is:

  • The original problem you reported (lxd cluster is not responsive to “lxc” command) was related to a former node00 of the first cluster you built. That problem is now gone since you probably rebuilt a second cluster from scratch (if this is the case, please next time post a separate topic to the forum to avoid confusion).

  • You now have effectively 31 nodes in this new cluster, because you removed node08, I guess with “lxc cluster remove node08” (correct?)

  • Out of these 31 nodes, 30 nodes are working fine.

  • There is one node (node01) that currently fails to start with Error: no “source” property found for the storage pool

The reason for the failure is that some entries are missing from the database (as shown by the query that @stgraber asked you to run).
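In the query output above, node_id jumps from 1 straight to 4, so there are no rows at all for ids 2 and 3. A minimal sketch for spotting such gaps from pasted `lxd sql` output is below. The function names are hypothetical, and the list of expected node ids is an assumption supplied by the caller; which ids actually belong to cluster members would have to be confirmed against the nodes table.

```python
def parse_pool_config(table_text):
    """Parse the ASCII table printed by `lxd sql` into (node_id, key, value) tuples."""
    rows = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have five cells and a numeric id; borders and the header do not.
        if len(cells) == 5 and cells[0].isdigit():
            _row_id, _pool_id, node_id, key, value = cells
            rows.append((int(node_id), key, value))
    return rows


def nodes_missing_key(rows, expected_node_ids, key="source"):
    """Return the expected node ids that have no row for `key`."""
    have = {node_id for node_id, row_key, _value in rows if row_key == key}
    return sorted(set(expected_node_ids) - have)
```

For example, passing the table above together with `range(1, 33)` as the expected ids would list every id in that range that has no “source” row.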

We might be able to recover the situation by manually adding those missing entries, but it’d be better to understand how we ended up in this inconsistent situation. What was the history of node01 in this second cluster you built? (If you can remember it: was it working before, and did you make any change on it or in the cluster that we can correlate with the failure we see now?)

Attaching all the logs of node01 (which you can find in /var/snap/lxd/common/lxd/logs) to this topic would help too. Or even better, make a tarball of the /var/snap/lxd/common/lxd/logs directories of all the current 31 nodes, and attach it here.

Thanks

Sorry for making things complex

In the initial cluster, I had 31 nodes
(node00 ~ node31 forming an LXD cluster, without node08).

But one day the LXD cluster stopped responding to lxc commands.

I then found that there were lots of “lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd” processes
on one of the cluster nodes (node01).

So I killed all lxd processes, removed LXD with snap, removed snap on all nodes, and reinstalled the LXD cluster.

After the reinstallation, all 31 nodes in the cluster are operational.

It seems the logs of the incident in the /var/snap/lxd/common/lxd/logs folder are gone because of the reinstallation.

If the same problem occurs again, I’ll make sure to keep records and report back here.

Part of the issue is being tracked here:

although we’ll still need to understand the root cause of the problem.