Hi Again,
I’m using LXD 3.16, installed via snap, on Ubuntu 18.04.
I have a very strange issue with one node (Ceph1) of my three-node test cluster. On that node I get the following error:
lxc list
Error: Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: connection refused: connection refused
This is peculiar because the node is still serving a container, which I can see when I run the same command from another node:
lxc list
+-------+---------+----------------------+------+------------+-----------+----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | LOCATION |
+-------+---------+----------------------+------+------------+-----------+----------+
| jump1 | RUNNING | 100.65.20.235 (eno1) | | PERSISTENT | 1 | lxd01 |
+-------+---------+----------------------+------+------------+-----------+----------+
| jump2 | RUNNING | 100.65.20.239 (eno1) | | PERSISTENT | 0 | lxd02 |
+-------+---------+----------------------+------+------------+-----------+----------+
| rad1 | RUNNING | 100.65.20.238 (eno1) | | PERSISTENT | 0 | lxd02 |
+-------+---------+----------------------+------+------------+-----------+----------+
| rad2 | RUNNING | 100.65.20.237 (eno1) | | PERSISTENT | 1 | ceph1 |
+-------+---------+----------------------+------+------------+-----------+----------+
The LXD process is running on Ceph1. Here are the logs from /var/snap/lxd/common/lxd/logs/lxd.log:
t=2019-08-28T09:59:39+0000 lvl=info msg="Expiring log files"
t=2019-08-28T09:59:39+0000 lvl=info msg="Done expiring log files"
t=2019-08-28T09:59:39+0000 lvl=info msg="Updating instance types"
t=2019-08-28T09:59:39+0000 lvl=info msg="Done updating instance types"
t=2019-08-28T09:59:39+0000 lvl=info msg="Updating images"
t=2019-08-28T09:59:39+0000 lvl=info msg="Done updating images"
t=2019-08-28T09:59:39+0000 lvl=info msg="Starting container" action=start created=2019-08-27T11:21:40+0000 ephemeral=false name=rad2 project=default stateful=false used=2019-08-28T09:58:23+0000
t=2019-08-28T09:59:39+0000 lvl=info msg="Started container" action=start created=2019-08-27T11:21:40+0000 ephemeral=false name=rad2 project=default stateful=false used=2019-08-28T09:58:23+0000
t=2019-08-28T09:59:43+0000 lvl=eror msg="Failed to get leader node address: context deadline exceeded"
t=2019-08-28T09:59:45+0000 lvl=warn msg="Excluding offline node from refresh: {ID:3 Address:100.65.20.231:8443 RaftID:3 Raft:true LastHeartbeat:2019-08-28 09:58:58.746788356 +0000 UTC Online:false updated:false}"
The user running the lxc commands is also a member of the lxd group.
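One detail that may be relevant: the error refers to /var/lib/lxd/unix.socket, while the log file above lives under /var/snap/lxd/common/lxd. A minimal sketch for checking which socket actually exists on Ceph1 and which lxc client the shell picks up is below. The snap socket path and the check_socket helper name are my assumptions for illustration, not something taken from the logs.

```shell
# Report whether a path exists and is a Unix socket.
# check_socket is a hypothetical helper name used only for this sketch.
check_socket() {
    if [ -S "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

# Path taken from the error message above.
check_socket /var/lib/lxd/unix.socket

# Assumed snap socket location, alongside the snap log directory.
check_socket /var/snap/lxd/common/lxd/unix.socket

# Show which lxc client binary the shell resolves first.
command -v lxc || echo "lxc not found on PATH"
```

If only the snap path shows up as present, the lxc client being run may not be the one shipped with the snap, which could explain the refused connection.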
I’m not sure where to go from here. Any ideas?
Thanks again