Hi, I’ve been running into very consistent errors when running an LXD (5.0.2) cluster. The core of the issue is that 1-2 of the 3 nodes in my cluster eventually run into what seems to be a database issue, where
lxc list on the other nodes shows that the containers have been created but are in an ‘ERROR’ state.
lxc list also does not work on the node that has failed, and the only way to fix it is with
systemctl restart snap.lxd.daemon. Also,
lxc cluster list shows all nodes as ‘healthy’ during this time, but obviously that’s not the case.
I have created a GH issue for this, along with more logs from around the time that it happens. Unfortunately, it’s making me not love clustering anymore due to these recurring issues.