OK, so now I am on version 3.20, and I have been using LXD since version 1, so while I am not an expert, I am getting pretty good at making this work. Or at least at the much-needed skill of reinstalling everything, and then reinstalling it again.
After running a test for several weeks, here is what I found. Every time I brought down more than 3 database servers, or more specifically shut down the whole cluster, I needed several restarts to get it going again. Version 3.20 is much better than previous ones, but it still has issues.
Today I booted up the servers again and the cluster is stuck; restarting it won’t fix it. They are all dead as doornails. Luckily these units are only for testing for now. I am reinstalling the whole cluster again… and of course it says all data will be lost. That isn’t a problem since this is a test unit, but obviously none of this is acceptable in a production unit. I have 4 other servers running LXD 3.18, and I don’t reboot those, for fear that it will turn into weeks of unnecessary work.
So the question is, again: why do we have to lose the data in ZFS every time a member is added? Why can’t cluster members be added and removed at will without destroying the data?