Hi,
I’m trying to debug a mount problem that occurred on one of our servers. This is what I know:
We have a host, “centos-lxc”, that runs multiple LXD containers. LXD was installed through snap.
One day when we logged in, we found our containers down, and any LXD call, such as lxc list or even a bare lxc, resulted in the following:
Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory
I initially thought this issue was socket-related, so I tried a bunch of refreshes and other troubleshooting steps found in many threads here and elsewhere, but that was not the source of the problem.
Eventually I got the lxc command to respond, but lxc list would give me the following:
[root@centoslxc ~]# lxc list
Error: Get "http://unix.socket/1.0": EOF
After investigating more carefully, I figured out that the person who initially configured the default LXD profile specified a relative path as the source for the storage pools, as suggested by this command:
[root@centoslxc ~]# lxd sql global 'SELECT * FROM storage_pools_config;'
+----+-----------------+---------+--------+-----------+
| id | storage_pool_id | node_id | key    | value     |
+----+-----------------+---------+--------+-----------+
| 3  | 3               | 1       | source | srv/lxd   |
| 4  | 4               | 1       | source | srv/store |
+----+-----------------+---------+--------+-----------+
This most likely conflicts with snap in some way that I haven’t fully understood yet, but the bottom line is this error in the logs:
Oct 31 14:55:36 centoslxc lxd.daemon[238539]: t=2020-10-31T14:55:36+0100 lvl=eror msg="Failed to start the daemon: "Failed to start the daemon: Failed initializing storage pool \"store_lxd\": Failed to mount '/var/snap/lxd/18077/srv/lxd' on '/var/snap/lxd/common/lxd/storage-pools/store_lxd': no such file or directory"
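To illustrate what I think the log is showing, here is a minimal sketch of how a relative source like srv/lxd ends up under /var/snap/lxd/18077 while an absolute one would not. Note that resolve_source is a made-up helper for illustration, not an LXD or snap command, and the base directory is only my assumption, inferred from the path in the error above:

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): shows how a pool "source" value
# would resolve against a base directory.
resolve_source() {
    base=$1   # assumed base directory, inferred from the log line
    src=$2    # "source" value as stored in storage_pools_config
    case "$src" in
        /*) printf '%s\n' "$src" ;;                  # absolute: used as-is
        *)  printf '%s/%s\n' "${base%/}" "$src" ;;   # relative: depends on the base
    esac
}

resolve_source /var/snap/lxd/18077 srv/lxd      # prints /var/snap/lxd/18077/srv/lxd
resolve_source /var/snap/lxd/18077 /store/lxd   # prints /store/lxd
```

If this is right, the relative value only points at the real data as long as the base directory stays the same, which would explain why an absolute path looks safer.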
As the log states, LXD fails to mount the storage pool properly, despite the existence of the specified directory and the corresponding containers in it, as shown below:
[root@centoslxc containers]# pwd
/var/snap/lxd/18077/srv/store/containers
[root@centoslxc containers]# ls
logs logsdmz logssansdmz nessus nessus2 nessus3 nessus4 nessus5 nessus6 test wifivpn wifivpn2 wifivpn3
My conclusion is that moving the containers to a storage pool with an absolute path, such as /store/lxd, would probably solve this issue definitively by avoiding the weird interaction with snap.
However, I would like to know the best practice in this kind of scenario for safely moving my containers to a new storage pool without putting them at risk. I know this step will require me to delete the current storage pool, and I’m afraid of losing some of them in the process.
Thanks for your support