How to fix Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused?

Loulourge · August 20, 2021, 9:27am

Hi,

I’m new here and this is my first post on this forum.

I’m a first-year computer science student, and I recently set up an LXD environment for the first time at the company where I work.

This environment hosts some sites in production.

Explanation of my problem:

At first I had no problems and everything worked fine, but recently an error suddenly appeared, without me touching anything, whenever I use an lxc command (for example lxc ls).

This is the error: Error: Get "http://unix.socket/1.0": EOF

Of course, I have searched the forums for a solution, but I am afraid that some of the solutions would stop my LXD service without me being able to restart it, and thus make the production sites inaccessible.

Concretely, I would like to fix this error while being sure not to stop my LXD service.

I tried this command: lxd --debug --group lxd

It reported 2 errors:

Failed to start the daemon: Failed initializing storage pool “default”: Failed to mount “/dev/loop0” on “/var/snap/lxd/common/lxd/storage-pools/default” using “btrfs”: file exists

Error: Failed initializing storage pool “default”: Failed to mount “/dev/loop0” on “/var/snap/lxd/common/lxd/storage-pools/default” using “btrfs”: file exists

From memory, I then got this new error: Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory

After that I went to /var/snap/lxd/common/lxd and the file unix.socket was not there. (Why?)

I decided to create this file manually, and after that a new error appeared: Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused

The next day the file had mysteriously disappeared on its own, but this last error remains the same.
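
The two socket errors above can be told apart with a read-only check. What follows is a minimal Python sketch, not part of LXD: the helper name probe_unix_socket is hypothetical, and it assumes Linux, where connecting to a path that does not exist fails with "no such file or directory", while connecting to a path that exists but has no listener behind it (for example a regular file created by hand) fails with "connection refused". It only attempts a connection and does not create, delete, or restart anything.

```python
import socket


def probe_unix_socket(path: str, timeout: float = 2.0) -> str:
    """Classify a Unix socket path without changing anything on disk.

    Hypothetical helper for illustration only; assumes Linux semantics.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except FileNotFoundError:
        # ENOENT: nothing exists at this path.
        return "missing"
    except ConnectionRefusedError:
        # ECONNREFUSED: the path exists, but nothing is listening on it.
        return "refused"
    except OSError as exc:
        # Anything else (permissions, timeout, ...) is reported as-is.
        return f"error: {exc}"
    else:
        return "listening"
    finally:
        sock.close()
```

On the affected host, probe_unix_socket("/var/snap/lxd/common/lxd/unix.socket") would be expected to report "missing" or "refused" for as long as the daemon fails to start. Creating the file by hand cannot change that, because a path only becomes connectable when a process binds a socket to it and listens.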

I saw this topic, but I’m not sure whether it is the correct way to fix this: Best way to solve unproper mounting issue in LXD (snap related) - #9 by Kcold

Since I don’t know the cause of the problem, I don’t know how to reproduce it in another environment and test different fixes there without being afraid of breaking everything.

I was thinking of restarting snap.lxd.daemon.unix.socket to see what happens, but as I said, I want to be sure not to stop the LXD service.

To conclude, I would like to understand the cause of my problem, how to keep it from happening again, and how to solve it without risking making the production sites inaccessible.

I rarely post problems on forums, but I need to resolve this one as soon as possible (it is urgent) because I use lxd/lxc commands not only to manage the containers but also to back them up.

This matters to me because I work at a company and this problem is a real blocker.

I hope someone can help me, because I am still a novice and I have no one who can help me right now.

Thanks in advance!

PS: Sorry if my English is bad; it is not my first language.

Environment:

Debian 10
lxd 4.17 (snap)

tomp · August 20, 2021, 9:31am

This sounds similar, if not identical, to Problem with lxc: Failed initializing storage pool "default" - #21 by khalid_mrabti

The issue is with the “Failed initializing storage pool” errors, so that is what needs resolving.

I suspect there has been a problem with the snap refresh (cc @stgraber) getting mounts confused; a reboot seemed to fix this for the earlier reported issue.

Loulourge · August 20, 2021, 9:45am

Hi @tomp,

Thank you for your quick response!

Will the IPs of the containers stay the same after the server restart?

I hope it will work, because if the sites are no longer accessible, I hope the lxc commands will work so I can restart them.

tomp · August 20, 2021, 9:48am

So the containers are still running?

If you want, you can initiate the server restart at a more convenient time. As for the IPs, the mount issue won’t affect whether they change - so if they weren’t going to change after a normal reboot then they won’t change after this reboot (but that all depends on your specific configuration).

Loulourge · August 20, 2021, 10:11am

Yes, they are still running.

I don’t know what information you need about my configuration, but with lxd init I used the default options.

I realized that I should have linked to the databases via the container names (name.lxd) and not via the IPs, which is why I’m asking.

tomp · August 20, 2021, 10:13am

It feels like a good idea to start a separate post about that, as it’s not really got anything to do with this thread about getting LXD started, and is more about how to configure LXD to use static IP addresses.

tomp · August 20, 2021, 10:14am

That is likely the same issue as the other thread I linked to had. During snap refresh the mount namespace seems to lose track that the pool is already mounted and tries to mount it again, even though it must already be mounted, since the containers are still running.

It is likely causing this check to incorrectly return false, and trigger a mount attempt.

https://github.com/lxc/lxd/blob/master/lxd/storage/drivers/driver_btrfs.go#L295-L298
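
To illustrate how an "is it mounted?" check can give different answers, here is a minimal Python sketch. It is not the code at the link above and not LXD’s implementation. It assumes Linux, where each process’s view of the mount table is listed in /proc/<pid>/mountinfo and the fifth whitespace-separated field of every line is the mount point. The function names are made up for illustration.

```python
import os


def mount_points(mountinfo_text: str) -> set:
    """Return the mount points listed in the text of a mountinfo file.

    Illustrative only. Mount points containing spaces are escaped in
    mountinfo (as \\040) and are not unescaped here.
    """
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # Field 5 (index 4) is the mount point as seen by that process.
            points.add(fields[4])
    return points


def is_mount_point(path: str, mountinfo_path: str = "/proc/self/mountinfo") -> bool:
    """Check whether `path` is a mount point according to one mount table."""
    with open(mountinfo_path) as fh:
        return os.path.normpath(path) in mount_points(fh.read())
```

Because mount tables are per mount namespace, running is_mount_point against /proc/self/mountinfo from a host shell and against the mountinfo file of a process inside another namespace may give different results for the same pool path. That is the kind of disagreement that could make a mounted pool look unmounted.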

@stgraber have you seen anything like this before?