Cluster nodes getting into a state where instances are created but show an error state

Hi, I’ve been running into very consistent errors when running an LXD (5.0.2) cluster. The core of the issue is that 1-2 of the 3 nodes in my cluster eventually run into what seems to be a database issue: lxc list on the other nodes shows that the containers are created but in an ‘ERROR’ state. lxc list also does not work on the node that has failed, and the only way to fix it is with systemctl restart snap.lxd.daemon. During this time lxc cluster list shows all nodes as ‘healthy’, but obviously that’s not the case.

I have also created a GH issue for this, with more logs from around the time it happens. Unfortunately it’s making me not love clustering anymore due to these recurring issues.

Please can you describe your setup in more detail (hardware, network, etc.)?

Also, please can you provide the output of lxc cluster list before and after the issues start?
Please can you also explain what is happening in the cluster when the problem starts, and whether there is a particular amount of time before the problems start?
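
In case it helps with capturing that, here is a rough sketch of a small shell helper that appends timestamped command output to a log file, so the before and after state and the timing can be compared later. The names snapshot and monitor, the log path, the 60-second interval and the 30-second timeout are all placeholders of my own rather than anything LXD provides; the only LXD commands used are lxc cluster list and lxc list from the post above. The timeout wrapper (from GNU coreutils) is there on the assumption that lxc list may hang on the failed node.

```shell
#!/bin/sh
# snapshot LOGFILE COMMAND [ARGS...]
# Run COMMAND and append its output to LOGFILE under a UTC timestamp header.
snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s: %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
        # Record failures too, since a failing command is the interesting case here.
        "$@" 2>&1 || printf '(command exited non-zero)\n'
    } >> "$logfile"
}

# monitor LOGFILE
# Hypothetical loop: take a snapshot of the cluster and instance state every minute.
# Not called automatically; run "monitor /tmp/lxd-cluster.log" on each cluster member.
monitor() {
    while true; do
        snapshot "$1" timeout 30 lxc cluster list
        snapshot "$1" timeout 30 lxc list
        sleep 60
    done
}
```

Running monitor on every member and leaving it until the problem shows up should give a log with the last healthy output and the first failing one next to each other, along with their timestamps.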

Finally, please can you explain your cluster member names? They look a little odd. Is this just so they line up with the host names of the servers, or have you added/removed cluster members in the past?