I’m seeing an issue where it seems a snap update has failed and keeps failing on subsequent tries. The update caused my LXD to shut down, but snap never restarts it because the update is failing.
$ snap changes
ID Status Spawn Ready Summary
117 Error yesterday at 16:53 CDT yesterday at 17:14 CDT Auto-refresh snap "lxd"
118 Error yesterday at 19:33 CDT yesterday at 20:00 CDT Auto-refresh snap "lxd"
119 Error today at 00:43 CDT today at 01:03 CDT Auto-refresh snap "lxd"
120 Error today at 11:48 CDT today at 12:13 CDT Auto-refresh snaps "core20", "lxd"
121 Error today at 12:18 CDT today at 12:39 CDT Auto-refresh snap "lxd"
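To look at each failed change in more detail, the IDs can be pulled out of that output and passed to `snap change <ID>`, which lists the individual tasks of a change. Below is a minimal sketch; the helper name `failed_change_ids` is my own and not part of snapd, and it assumes the Status value is the second whitespace-separated column, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of snapd): read `snap changes` output on
# stdin and print the ID of every change whose Status column is "Error".
failed_change_ids() {
    # Column 1 is the change ID, column 2 is the status.
    awk '$2 == "Error" { print $1 }'
}

# Usage on a live system (requires snapd):
#   snap changes | failed_change_ids | while read -r id; do
#       snap change "$id"
#   done
```

For the output above this would select changes 117 through 121.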
Some interesting syslog messages:
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: Trying to enqueue job snap.lxd.daemon.service/stop/replace
Oct 5 15:44:20 box1 systemd[1]: Added job snap.lxd.daemon.service/stop to transaction.
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: Installed new job snap.lxd.daemon.service/stop as 7378
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: Enqueued job snap.lxd.daemon.service/stop as 7378
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: About to execute: /usr/bin/snap run --command=stop lxd.daemon
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: Forked /usr/bin/snap as 36905
Oct 5 15:44:20 box1 systemd[36905]: snap.lxd.daemon.service: Executing: /usr/bin/snap run --command=stop lxd.daemon
Oct 5 15:44:20 box1 systemd[1]: snap.lxd.daemon.service: Changed running -> stop
Oct 5 15:44:20 box1 systemd[1]: Stopping Service for snap application lxd.daemon...
Oct 5 15:44:20 box1 lxd.daemon[36905]: WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
Oct 5 15:44:20 box1 lxd.daemon[36905]: => Stop reason is: snap refresh
Oct 5 15:44:20 box1 lxd.daemon[36905]: => Stopping LXD
Oct 5 15:44:21 box1 lxd.daemon[2952]: => LXD exited cleanly
Oct 5 15:44:21 box1 systemd[1]: Received SIGCHLD from PID 2952 (daemon.start).
Oct 5 15:44:21 box1 systemd[1]: Child 2952 (daemon.start) died (code=exited, status=0/SUCCESS)
Oct 5 15:44:21 box1 systemd[1]: snap.lxd.daemon.service: Child 2952 belongs to snap.lxd.daemon.service.
Oct 5 15:44:21 box1 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=0/SUCCESS
Oct 5 15:44:22 box1 lxd.daemon[36905]: ==> Stopped LXD
Oct 5 15:44:22 box1 systemd[1]: systemd-journald.service: Received EPOLLHUP on stored fd 40 (stored), closing.
Oct 5 15:44:22 box1 systemd[1]: Received SIGCHLD from PID 36905 (daemon.stop).
Oct 5 15:44:22 box1 systemd[1]: Child 36905 (daemon.stop) died (code=exited, status=0/SUCCESS)
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Child 36905 belongs to snap.lxd.daemon.service.
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Control process exited, code=exited, status=0/SUCCESS
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Got final SIGCHLD for state stop.
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Succeeded.
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Service restart not allowed.
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Changed stop -> dead
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Job 7378 snap.lxd.daemon.service/stop finished, result=done
Oct 5 15:44:22 box1 systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 5 15:44:22 box1 systemd[1]: snap.lxd.daemon.service: Control group is empty.
Oct 5 15:44:22 box1 systemd[1]: Failed to read pids.max attribute of cgroup root, ignoring: No such file or directory
Oct 5 15:44:22 box1 systemd[1]: Found unit snap.lxd.daemon.service at /etc/systemd/system/snap.lxd.daemon.service (regular file)
Oct 5 15:44:22 box1 systemd[1]: Preset files don't specify rule for snap.lxd.daemon.service. Enabling.
...
Oct 5 15:54:28 box1 snapd[2024]: taskrunner.go:271: [change 113 "Mount snap \"lxd\" (21624)" task] failed: snap-lxd-21624.mount failed to stop: timeout
Oct 5 15:54:33 box1 snapd[2024]: handlers.go:512: Reported install problem for "lxd" as 70612784-261e-11ec-be98-fa163e983629 OOPSID
...
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Failed to load configuration: No such file or directory
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Trying to enqueue job snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start/fail
Oct 5 17:39:09 box1 systemd[3284]: Added job snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start to transaction.
Oct 5 17:39:09 box1 systemd[3284]: Pulling in -.slice/start from snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start
Oct 5 17:39:09 box1 systemd[3284]: Added job -.slice/start to transaction.
Oct 5 17:39:09 box1 systemd[3284]: Pulling in shutdown.target/stop from snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Installed new job snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start as 434
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Enqueued job snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start as 434
Oct 5 17:39:09 box1 systemd[3284]: Failed to read pids.max attribute of cgroup root, ignoring: No such file or directory
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope changed dead -> running
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Job 434 snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope/start finished, result=done
Oct 5 17:39:09 box1 systemd[3284]: Started snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope.
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: cgroup is empty
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Succeeded.
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope changed running -> dead
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Consumed 34ms CPU time.
Oct 5 17:39:09 box1 systemd[3284]: snap.lxd.lxc.2e3841b9-b8e1-467d-b70e-4bc22e49cfdd.scope: Collecting.
...
Oct 5 17:56:52 box1 kernel: [76153.267445] audit: type=1400 audit(1633474612.900:143): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-devel_</var/snap/lxd/common/lxd>" name="/run/systemd/unit-root/var/cache/private/fwupdmgr/" pid=42006 comm="(fwupdmgr)" flags="rw, nosuid, remount, bind"
These seem suspicious:
Oct 5 19:38:00 box1 systemd[1]: var-snap.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: var-snap-lxd.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: var-snap-lxd-21497.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: snap.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: snap-lxd.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: var-lib-snapd-snaps.mount: Failed to load configuration: No such file or directory
Oct 5 19:38:00 box1 systemd[1]: var-lib-snapd-snaps-lxd_21624.snap.mount: Failed to load configuration: No such file or directory
And lots of these in syslog, about four per second:
Oct 5 15:44:22 box1 systemd[1]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1/unit/snap_2elxd_2edaemon_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Oct 5 15:44:22 box1 systemd[1]: Failed to read pids.max attribute of cgroup root, ignoring: No such file or directory
Now all lxc commands fail with:
$ sudo lxc list
[sudo] password for user:
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
I’ve also recently enabled cgroup v2 and stopped v1 from hijacking the controllers, but apparently LXD shouldn’t care. It looks like this might be more of a snapd issue than an LXD one, but I’m not sure. I can see that my containers are still running: for example, the ps output shows the process [lxc monitor] /var/snap/lxd/common/lxd/containers devel. This is probably because they were started before LXD went down. What can I do to get LXD up and running again?