Yesterday I restarted my system and my containers weren’t reachable.
So I wanted to have a look at them, but the command “lxc list” fails with the message:
The LXD daemon (installed via snap) and the unix socket seem to be running (systemctl status says they are active).
Is it possible that this is a problem similar to this one?
At the moment I don’t know what to try next. I don’t want to wreck my system, because it runs a production web server with several sites and a database server…
I looked at several similar problems, but they all have different error messages in their threads.
I had a problem last week and was able to bypass it by reverting to the previous LXD version (I’m using the snap version).
Now it seems I no longer have this previous version, which worked OK; only two 3.14-based revisions:
sudo snap list --all lxd
Name Version Rev Tracking Publisher Notes
lxd 3.14 10934 stable canonical✓ disabled
lxd 3.14 10972 stable canonical✓ -
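For reference, the revert I used last week looked roughly like this. This is a sketch, not the exact command history: it assumes the older working revision is still listed with the “disabled” note, which is no longer the case here since both remaining revisions are already 3.14.

```shell
# Sketch: revert the lxd snap to the previous, now-disabled revision.
# Only useful while `snap list --all` still shows an older version.

# print the Rev column of the line whose Notes column says "disabled"
pick_disabled_rev() {
    awk '$NF == "disabled" { print $3 }'
}

if command -v snap >/dev/null 2>&1; then
    rev=$(snap list --all lxd 2>/dev/null | pick_disabled_rev)
    if [ -n "$rev" ]; then
        sudo snap revert lxd --revision="$rev"
    else
        echo "no disabled lxd revision to revert to"
    fi
fi
```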
My logs say this:
2019-06-27T11:09:57Z lxd.daemon[5737]: t=2019-06-27T12:09:57+0100 lvl=eror msg="Failed to mount DIR storage pool \"/var/lib/snapd/hostfs/home/lxd/storage-pools/default\" onto \"/var/snap/lxd/common/lxd/storage-pools/bigdisk\": no such file or directory"
2019-06-27T11:09:57Z lxd.daemon[5737]: t=2019-06-27T12:09:57+0100 lvl=eror msg="Failed to start the daemon: no such file or directory"
2019-06-27T11:09:57Z lxd.daemon[5737]: Error: no such file or directory
2019-06-27T11:09:58Z lxd.daemon[5737]: => LXD failed to start
This is AFTER I have already manually created the directories reported as not existing, as was suggested in
t=2019-06-27T13:30:45+0200 lvl=info msg="Applying patch: storage_api_rename_container_snapshots_dir_again"
t=2019-06-27T13:30:53+0200 lvl=eror msg="Failed to start the daemon: rename /var/snap/lxd/common/lxd/storage-pools/default/snapshots/dbserver/snap0 /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/dbserver/snap0: file exists"
I did read the whole other thread (LXD 3.14 on snap fails), but I simply don’t know what to do now because my message is a little bit different.
What absolutely confuses me is that /var/snap/lxd/common/lxd/storage-pools/default is completely empty. But the error line above says: file exists…
STOP… this was in lxd.log.1.
In lxd.log there is now no error message, but lxc list gives me the same error:
Error: Get http://unix.socket/1.0: EOF
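The EOF on http://unix.socket/1.0 means the client reached the socket but the daemon died while answering, so the real cause should be in the daemon logs. A sketch of the checks I ran; the socket path is the snap default, not copied from my system:

```shell
# Sketch: verify the LXD unix socket exists, then pull recent daemon logs.
socket_state() {
    # prints whether the given path exists as a unix socket
    if [ -S "$1" ]; then
        echo "socket present: $1"
    else
        echo "socket missing: $1"
    fi
}

socket_state /var/snap/lxd/common/lxd/unix.socket

# recent daemon output, two equivalent views:
if command -v snap >/dev/null 2>&1; then
    sudo snap logs lxd -n=50
fi
if command -v journalctl >/dev/null 2>&1; then
    sudo journalctl -u snap.lxd.daemon -n 50 --no-pager
fi
```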
sudo mount /var/snap/lxd/common/lxd/disks/default.img /mnt
sudo ls -lh /mnt/snapshots
sudo ls -lh /mnt/snapshots/*
sudo ls -lh /mnt/containers-snapshots
sudo ls -lh /mnt/containers-snapshots/*
The error suggests you have a container snapshot which exists in both the old and the new path, making the migration from one to the other impossible. Once we’ve confirmed that’s the case, check whether either the source or target snapshot is empty and blow away whichever one is empty, then start LXD again.
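In shell terms, the decision could look like this. A sketch only: it assumes the image is mounted on /mnt as above, and the dbserver/snap0 path is taken from the rename error in the log:

```shell
# Sketch: after mounting default.img on /mnt, decide which of the two
# duplicate snapshot copies is empty and therefore safe to remove.
old=/mnt/snapshots/dbserver/snap0
new=/mnt/containers-snapshots/dbserver/snap0

is_empty() {
    # true if the directory exists and has no entries at all
    [ -d "$1" ] && [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

if is_empty "$old"; then
    echo "old copy is empty, safe to remove: $old"
elif is_empty "$new"; then
    echo "new copy is empty, safe to remove: $new"
else
    echo "neither is clearly empty, inspect both before deleting anything"
fi
```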
I checked dmesg after mounting the file and it says:
[68987.422847] BTRFS info (device loop0): disk space caching is enabled
[68987.422851] BTRFS info (device loop0): has skinny extents
Two days ago we had a power loss, and I think this could have damaged the btrfs filesystem.
That could be the reason why the migration could not be started, I think…
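If the power loss really damaged the filesystem, an offline, read-only check of the image should show it. A sketch, assuming the image is unmounted first; the path is the snap default seen earlier in the thread:

```shell
# Sketch: read-only btrfs consistency check of the unmounted loop image.
# --readonly never modifies the image, so it is safe on a production disk.
img=/var/snap/lxd/common/lxd/disks/default.img

if [ -f "$img" ] && command -v btrfs >/dev/null 2>&1; then
    sudo umount /mnt 2>/dev/null   # must not be mounted during the check
    sudo btrfs check --readonly "$img"
else
    echo "skipping check, image or btrfs-progs not available: $img"
fi
```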
Read-only is perfectly normal: those btrfs subvolumes are marked read-only, so you can’t delete/move them the normal way. LXD has logic that handles that; it just can’t do anything if it finds a snapshot existing in both the old and new path, that needs manual resolving first.
Assuming that this doesn’t complain about the read-only property and that those paths are indeed gone after running those commands as root, try starting LXD and it should be able to move the rest of the data over.
Note that you’ll want to check that I didn’t miss any of them or you’re still going to hit the same issue.
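The exact commands aren’t quoted in this excerpt; for a read-only subvolume, the manual removal usually has to clear the ro flag before the delete, roughly like this (the dbserver/snap0 path is an assumption taken from the error message):

```shell
# Sketch: delete a read-only btrfs subvolume (path assumed from the log).
sub=/mnt/containers-snapshots/dbserver/snap0

if command -v btrfs >/dev/null 2>&1 && [ -d "$sub" ]; then
    sudo btrfs property set -ts "$sub" ro false   # clear the read-only flag
    sudo btrfs subvolume delete "$sub"
else
    echo "nothing to delete at $sub"
fi
```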
Ah, then maybe the old entries are not subvolumes, try to blow them away with a good old sudo rm -Rf /mnt/def_ori/containers-snapshots/dbserver/snap0 then, that may work.