That was the original problem. All my nodes “auto” refreshed at once. During that time, my primary and secondary DNS servers were knocked offline, and then the others failed.
So a chain of events caused every node to fail its update. Now, even if I hardcode 8.8.8.8 as the DNS server and then run the refresh, I still can’t get snap to start LXD.
It just hangs with: level=warning msg="Wait for other cluster nodes to upgrade their versions, cluster not started yet"
It looks like the refresh is actually stuck partway through on the nodes... hmmm
root@nuc-server-3:~# snap changes
ID Status Spawn Ready Summary
82 Done today at 16:20 UTC today at 16:20 UTC Running service command
83 Done today at 16:21 UTC today at 16:21 UTC Running service command
84 Done today at 16:36 UTC today at 16:36 UTC Change configuration of "core" snap
85 Done today at 16:36 UTC today at 16:36 UTC Change configuration of "core" snap
86 Undone today at 16:36 UTC today at 17:36 UTC Refresh "lxd" snap
87 Done today at 17:40 UTC today at 17:40 UTC Running service command
88 Done today at 17:41 UTC today at 17:41 UTC Running service command
89 Done today at 17:42 UTC today at 17:42 UTC Change configuration of "core" snap
90 Done today at 17:42 UTC today at 17:42 UTC Change configuration of "core" snap
91 Undone today at 17:44 UTC today at 19:16 UTC Refresh "lxd" snap
92 Done today at 17:57 UTC today at 17:57 UTC Change configuration of "core" snap
93 Done today at 17:57 UTC today at 17:57 UTC Change configuration of "core" snap
94 Done today at 18:42 UTC today at 18:45 UTC Auto-refresh snaps "core20", "snapd"
95 Done today at 19:22 UTC today at 20:22 UTC Refresh "lxd" snap
96 Done today at 20:43 UTC today at 20:52 UTC Revert "lxd" snap
97 Done today at 20:56 UTC today at 20:56 UTC Running service command
98 Done today at 20:56 UTC today at 20:56 UTC Running service command
99 Done today at 20:57 UTC today at 20:57 UTC Running service command
100 Done today at 21:16 UTC today at 21:16 UTC Running service command
101 Done today at 22:11 UTC today at 22:11 UTC Running service command
102 Doing today at 22:19 UTC - Refresh "lxd" snap
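For anyone trying to spot the stuck change in a listing like the one above, here is a small sketch. The helper name `stuck_changes` is made up for illustration, and it assumes the first two columns of the `snap changes` output are the ID and the status, as in my paste.

```shell
#!/bin/sh
# Hypothetical helper (not part of snapd): print the IDs of changes
# whose status column is "Doing". Reads `snap changes`-style output
# on stdin; assumes column 1 is the ID and column 2 is the status.
stuck_changes() {
    awk '$2 == "Doing" { print $1 }'
}

# Usage on a real node (not run here):
#   snap changes | stuck_changes
#   snap change 102    # show the individual tasks of that change
```

On the listing above this would report change 102, the "lxd" refresh that never finishes.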
-- Subject: A stop job for unit snap.lxd.daemon.service has finished
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A stop job for unit snap.lxd.daemon.service has finished.
--
-- The job identifier is 2831 and the job result is done.
Jun 02 04:29:57 nuc-server-2 systemd[1]: Started Service for snap application lxd.daemon.
-- Subject: A start job for unit snap.lxd.daemon.service has finished successfully
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A start job for unit snap.lxd.daemon.service has finished successfully.
--
-- The job identifier is 2831.
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: => Preparing the system (23155)
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Loading snap configuration
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Setting up mntns symlink (mnt:[4026532328])
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Setting up kmod wrapper
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Preparing /boot
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Preparing a clean copy of /run
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Preparing /run/bin
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Preparing a clean copy of /etc
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Preparing a clean copy of /usr/share/misc
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Setting up ceph configuration
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Setting up LVM configuration
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Setting up OVN configuration
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4478]: ==> Rotating logs
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4598]: error: Compressing program wrote following message to stderr when compressing log /var/snap/lxd/common/lxd/logs/lxd.log.1:
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4598]: gzip: stdin: warning: file timestamp out of range for gzip format
Jun 02 04:29:57 nuc-server-2 lxd.daemon[4598]: error: failed to compress log /var/snap/lxd/common/lxd/logs/lxd.log.1
Jun 02 04:29:57 nuc-server-2 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
I moved the logs out of the way and it finally started up: mv /var/snap/lxd/common/lxd/logs/lxd.log* logs/
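In case it helps someone else hitting the same log rotation failure, this is roughly the workaround as a sketch. The function name `move_lxd_logs` and the backup path in the usage comment are placeholders I picked for illustration; only the log directory and the unit name come from the output above.

```shell
#!/bin/sh
# Hypothetical helper: move LXD log files out of the way so the
# daemon's log rotation step has nothing to compress.
# $1 = log directory, $2 = backup directory (created if missing).
move_lxd_logs() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/lxd.log*; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        mv -- "$f" "$dest"/ || return 1
    done
}

# Usage on the affected node (as root; not run here).
# /root/lxd-logs-backup is an assumed location, pick your own:
#   move_lxd_logs /var/snap/lxd/common/lxd/logs /root/lxd-logs-backup
#   systemctl start snap.lxd.daemon.service
```

Moving the files rather than deleting them keeps the old logs around in case they are needed later.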