The troublesome LXD cluster saga continues.
All other cluster members see this:
ubuntu@aa1-cptef101-n3:~$ lxc cluster ls
+-----------------+--------------------------+----------+--------+-------------------+--------------+
|      NAME       |           URL            | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef101-n1 | https://10.224.1.11:8443 | NO       | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef101-n2 | https://10.224.1.12:8443 | NO       | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef101-n3 | https://10.224.1.13:8443 | YES      | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef101-n4 | https://10.224.1.14:8443 | YES      | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef102-n1 | https://10.224.1.21:8443 | YES      | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef102-n2 | https://10.224.1.22:8443 | YES      | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef102-n3 | https://10.224.1.23:8443 | NO       | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
| aa1-cptef102-n4 | https://10.224.1.24:8443 | NO       | ONLINE | fully operational | x86_64       |
+-----------------+--------------------------+----------+--------+-------------------+--------------+
aa1-cptef101-n2 sees this:
root@aa1-cptef101-n2:~# lxc cluster ls
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
|      NAME       |           URL            | DATABASE |  STATE  |             MESSAGE              | ARCHITECTURE |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef101-n1 | https://10.224.1.11:8443 | NO       | OFFLINE | no heartbeat since 27.552492624s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef101-n2 | https://10.224.1.12:8443 | NO       | OFFLINE | no heartbeat since 27.553664584s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef101-n3 | https://10.224.1.13:8443 | YES      | OFFLINE | no heartbeat since 27.553363184s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef101-n4 | https://10.224.1.14:8443 | YES      | OFFLINE | no heartbeat since 27.553172404s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef102-n1 | https://10.224.1.21:8443 | YES      | OFFLINE | no heartbeat since 27.552999854s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef102-n2 | https://10.224.1.22:8443 | YES      | OFFLINE | no heartbeat since 27.552827674s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef102-n3 | https://10.224.1.23:8443 | NO       | OFFLINE | no heartbeat since 27.552713144s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
| aa1-cptef102-n4 | https://10.224.1.24:8443 | NO       | OFFLINE | no heartbeat since 27.552588354s | x86_64       |
+-----------------+--------------------------+----------+---------+----------------------------------+--------------+
aa1-cptef101-n2 intermittently sees the rest of the cluster as online, but then immediately falls back to thinking the rest of the cluster is offline. While this happens, the other cluster members still show aa1-cptef101-n2 as online.
The weird part is that we can still exec and edit configs (and the changes take effect), so it’s not a complete desync.
I will happily provide more info - I am just not immediately sure what more info is necessary.
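In case it helps pin down the flapping, here is a rough sketch of how the view from a member could be logged over time. It assumes the installed LXD accepts `lxc cluster list --format csv` and that the CSV columns come out in the same order as the tables above (NAME, URL, DATABASE, STATE, MESSAGE, ARCHITECTURE); I have not verified either on this cluster. The function names `count_offline` and `watch_cluster` are just made up for this example.

```shell
#!/bin/sh
# Count the members reported as OFFLINE in CSV rows read from stdin.
# Assumption: STATE is the 4th column, as in the table output above.
count_offline() {
    awk -F, '$4 == "OFFLINE" { n++ } END { print n + 0 }'
}

# Print one timestamped sample every <interval> seconds (default 5).
# Assumption: `lxc cluster list --format csv` is available on this LXD.
# Note: if the lxc call itself fails, the sample will read offline=0.
watch_cluster() {
    interval="${1:-5}"
    while true; do
        printf '%s offline=%s\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            "$(lxc cluster list --format csv | count_offline)"
        sleep "$interval"
    done
}
```

Running `watch_cluster 5` on aa1-cptef101-n2 and on one healthy member at the same time should give two logs whose timestamps can be lined up to see when n2's view flips.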