Hey, I'm having trouble running my cluster. When I logged in, everything was stuck and I couldn't run lxc list or anything else, so I tried a refresh, and it seems LXD was stuck on an "auto-refresh".
So I rebooted, and now I can't get my cluster to respond. I just get timeout EOF errors on unix.socket. How can I recover?
I tried a few more things to get some debug messages:
bastion@studio-prime-33kl:~$ sudo /snap/bin/lxd --debug --group lxd
DBUG[12-07|16:26:55] Connecting to a local LXD over a Unix socket
DBUG[12-07|16:26:55] Sending request to LXD method=GET url=http://unix.socket/1.0 etag=
INFO[12-07|16:27:06] LXD 3.18 is starting in normal mode path=/var/snap/lxd/common/lxd
INFO[12-07|16:27:06] Kernel uid/gid map:
INFO[12-07|16:27:06] - u 0 0 4294967295
INFO[12-07|16:27:06] - g 0 0 4294967295
INFO[12-07|16:27:06] Configured LXD uid/gid map:
INFO[12-07|16:27:06] - u 0 1000000 1000000000
INFO[12-07|16:27:06] - g 0 1000000 1000000000
WARN[12-07|16:27:06] CGroup memory swap accounting is disabled, swap limits will be ignored.
INFO[12-07|16:27:06] Kernel features:
INFO[12-07|16:27:06] - netnsid-based network retrieval: no
INFO[12-07|16:27:06] - uevent injection: yes
INFO[12-07|16:27:06] - seccomp listener: no
INFO[12-07|16:27:06] - unprivileged file capabilities: yes
INFO[12-07|16:27:06] - shiftfs support: no
INFO[12-07|16:27:06] Initializing local database
DBUG[12-07|16:27:06] Initializing database gateway
DBUG[12-07|16:27:06] Start database node id=1 address=10.0.1.3:8443
DBUG[12-07|16:27:06] Connecting to a local LXD over a Unix socket
DBUG[12-07|16:27:06] Sending request to LXD method=GET url=http://unix.socket/1.0 etag=
WARN[12-07|16:27:12] Dqlite client proxy Unix -> TLS: read unix @->@0004d: use of closed network connection
DBUG[12-07|16:27:20] Detected stale unix socket, deleting
DBUG[12-07|16:27:20] Detected stale unix socket, deleting
INFO[12-07|16:27:20] Starting /dev/lxd handler:
INFO[12-07|16:27:20] - binding devlxd socket socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[12-07|16:27:20] REST API daemon:
INFO[12-07|16:27:20] - binding Unix socket socket=/var/snap/lxd/common/lxd/unix.socket
INFO[12-07|16:27:20] - binding TCP socket socket=10.0.1.3:8443
INFO[12-07|16:27:20] Initializing global database
DBUG[12-07|16:27:20] Found cert name=0
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=0
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=0
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=0
DBUG[12-07|16:27:20] Dqlite: connection failed err=no available dqlite leader server found attempt=0
DBUG[12-07|16:27:20] Found cert name=0
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=1
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=1
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=1
DBUG[12-07|16:27:20] Dqlite: connection failed err=no available dqlite leader server found attempt=1
DBUG[12-07|16:27:20] Found cert name=0
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=2
DBUG[12-07|16:27:20] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=2
DBUG[12-07|16:27:21] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=2
DBUG[12-07|16:27:21] Dqlite: connection failed err=no available dqlite leader server found attempt=2
DBUG[12-07|16:27:21] Found cert name=0
DBUG[12-07|16:27:21] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=3
DBUG[12-07|16:27:21] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=3
DBUG[12-07|16:27:21] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=3
DBUG[12-07|16:27:21] Dqlite: connection failed err=no available dqlite leader server found attempt=3
DBUG[12-07|16:27:21] Found cert name=0
DBUG[12-07|16:27:22] Found cert name=0
DBUG[12-07|16:27:22] Found cert name=0
DBUG[12-07|16:27:22] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=4
DBUG[12-07|16:27:22] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=4
DBUG[12-07|16:27:22] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=4
DBUG[12-07|16:27:22] Dqlite: connection failed err=no available dqlite leader server found attempt=4
DBUG[12-07|16:27:22] Found cert name=0
DBUG[12-07|16:27:22] Found cert name=0
DBUG[12-07|16:27:22] Found cert name=0
DBUG[12-07|16:27:23] Found cert name=0
DBUG[12-07|16:27:23] Found cert name=0
DBUG[12-07|16:27:23] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=5
DBUG[12-07|16:27:23] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=5
DBUG[12-07|16:27:23] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=5
DBUG[12-07|16:27:23] Dqlite: connection failed err=no available dqlite leader server found attempt=5
DBUG[12-07|16:27:23] Found cert name=0
DBUG[12-07|16:27:23] Found cert name=0
DBUG[12-07|16:27:24] Found cert name=0
DBUG[12-07|16:27:24] Found cert name=0
DBUG[12-07|16:27:24] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=6
DBUG[12-07|16:27:26] Found cert name=0
WARN[12-07|16:27:26] Dqlite client proxy Unix -> TLS: read unix @->@00055: use of closed network connection
WARN[12-07|16:27:26] Dqlite client proxy Unix -> TLS: read unix @->@00050: use of closed network connection
WARN[12-07|16:27:26] Dqlite server proxy Unix -> TLS: read unix @->@0004b: use of closed network connection
DBUG[12-07|16:27:26] Dqlite: server connection failed err=failed to establish network connection: Head https://10.0.1.4:8443/internal/database: dial tcp 10.0.1.4:8443: connect: connection refused address=10.0.1.4:8443 id=1 attempt=6
DBUG[12-07|16:27:26] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=6
DBUG[12-07|16:27:26] Dqlite: connection failed err=no available dqlite leader server found attempt=6
DBUG[12-07|16:27:27] Found cert name=0
DBUG[12-07|16:27:27] Found cert name=0
DBUG[12-07|16:27:27] Found cert name=0
DBUG[12-07|16:27:27] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=7
DBUG[12-07|16:27:27] Dqlite: server connection failed err=failed to establish network connection: Head https://10.0.1.4:8443/internal/database: dial tcp 10.0.1.4:8443: connect: connection refused address=10.0.1.4:8443 id=1 attempt=7
DBUG[12-07|16:27:28] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=7
DBUG[12-07|16:27:28] Dqlite: connection failed err=no available dqlite leader server found attempt=7
DBUG[12-07|16:27:28] Found cert name=0
DBUG[12-07|16:27:28] Found cert name=0
DBUG[12-07|16:27:29] Found cert name=0
DBUG[12-07|16:27:29] Found cert name=0
DBUG[12-07|16:27:29] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=8
DBUG[12-07|16:27:29] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=8
DBUG[12-07|16:27:29] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=8
DBUG[12-07|16:27:29] Dqlite: connection failed err=no available dqlite leader server found attempt=8
DBUG[12-07|16:27:29] Found cert name=0
DBUG[12-07|16:27:29] Found cert name=0
DBUG[12-07|16:27:29] Found cert name=0
DBUG[12-07|16:27:30] Found cert name=0
DBUG[12-07|16:27:30] Found cert name=0
DBUG[12-07|16:27:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=9
DBUG[12-07|16:27:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=9
DBUG[12-07|16:27:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=9
DBUG[12-07|16:27:30] Dqlite: connection failed err=no available dqlite leader server found attempt=9
DBUG[12-07|16:27:30] Found cert name=0
DBUG[12-07|16:27:31] Failed connecting to global database (attempt 0): failed to create dqlite connection: no available dqlite leader server found
DBUG[12-07|16:27:31] Found cert name=0
DBUG[12-07|16:27:31] Found cert name=0
DBUG[12-07|16:27:32] Found cert name=0
DBUG[12-07|16:27:33] Found cert name=0
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=0
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=0
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=0
DBUG[12-07|16:27:33] Dqlite: connection failed err=no available dqlite leader server found attempt=0
DBUG[12-07|16:27:33] Found cert name=0
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=1
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=1
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=1
DBUG[12-07|16:27:33] Dqlite: connection failed err=no available dqlite leader server found attempt=1
DBUG[12-07|16:27:33] Found cert name=0
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=2
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=2
DBUG[12-07|16:27:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=2
DBUG[12-07|16:27:33] Dqlite: connection failed err=no available dqlite leader server found attempt=2
DBUG[12-07|16:27:33] Found cert name=0
DBUG[12-07|16:27:34] Found cert name=0
DBUG[12-07|16:27:34] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=3
DBUG[12-07|16:27:34] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=3
DBUG[12-07|16:27:34] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=3
DBUG[12-07|16:27:34] Dqlite: connection failed err=no available dqlite leader server found attempt=3
DBUG[12-07|16:27:34] Found cert name=0
DBUG[12-07|16:27:34] Found cert name=0
DBUG[12-07|16:27:34] Found cert name=0
DBUG[12-07|16:27:34] Found cert name=0
DBUG[12-07|16:27:35] Found cert name=0
DBUG[12-07|16:27:35] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.3:8443 id=1 attempt=4
DBUG[12-07|16:27:35] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.4:8443 id=1 attempt=4
DBUG[12-07|16:27:35] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=10.0.1.5:8443 id=1 attempt=4
DBUG[12-07|16:27:35] Dqlite: connection failed err=no available dqlite leader server found attempt=4
DBUG[12-07|16:27:35] Found cert name=0
Segmentation fault
Hey, I have resolved the issue. One of the nodes had not been upgraded to 3.18, which caused all of them to fail. Once I figured out which node was the problem, I ran a refresh on it and everything seems to be fine now.
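For anyone hitting the same thing, here is a rough sketch of how the odd node out could be spotted. It is not an official LXD tool: find_version_mismatch is a hypothetical helper name, it expects "host version" pairs on stdin, and it treats the first host listed as the reference. The commented-out loop assumes SSH access to each node and uses the three addresses from the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): reads "host version" pairs on stdin
# and prints every host whose version differs from the first one listed.
find_version_mismatch() {
    # Appending "" forces a string comparison, so 3.1 and 3.10 are not
    # treated as the same number.
    awk 'NR == 1 { want = $2 }
         ($2 "") != (want "") { print $1 " runs " $2 ", expected " want }'
}

# Gathering the input (assumption: SSH access to every cluster node):
# for host in 10.0.1.3 10.0.1.4 10.0.1.5; do
#     printf '%s %s\n' "$host" "$(ssh "$host" /snap/bin/lxd --version)"
# done | find_version_mismatch
#
# On any node that is reported, "sudo snap refresh lxd" should bring the
# snap up to date.
```

If nothing is printed, all listed nodes report the same version and the cause is probably something else.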