I’m trying to delete the default storage pool so I can use another disk, separate from /var, for my servers (a cluster of two servers called server and server2). The first problem is that some volumes are still present on the storage pool, although all containers and images have been erased.
% lxc storage delete default
Error: storage pool "default" has volumes attached to it
Indeed, something went wrong while deleting all the containers and their snapshots:
% lxc storage volume list default
+----------------------+---------------------+-------------+---------+----------+
|         TYPE         |        NAME         | DESCRIPTION | USED BY | LOCATION |
+----------------------+---------------------+-------------+---------+----------+
| container (snapshot) | ntp-backup/working  |             | 1       | server   |
+----------------------+---------------------+-------------+---------+----------+
| container (snapshot) | template/2019051201 |             | 1       | server2  |
+----------------------+---------------------+-------------+---------+----------+
But I cannot delete these snapshots.
% lxc storage volume delete default ntp-backup/working
Error: No such object
% lxc storage volume delete default template/2019051201
Error: No such object
I’m using LXD 3.13:
% lxc --version
3.13
Back-end storage is BTRFS.
Thank you in advance for your help.
It works for me, at least; I’m not doing more advanced tests of this kind on my own disk, but given that you want to get rid of the pool anyway, I think that trying out ‘subvolume delete’ should do what you want.
root@server:/var/snap/lxd/common/lxd/storage-pools/default# btrfs subvolume delete containers-snapshots/ntp-backup
ERROR: not a subvolume: containers-snapshots/ntp-backup
Although it is listed as a subvolume:
% sudo nsenter -t $(pgrep daemon.start) -m -- /snap/lxd/current/bin/btrfs subvolume list /var/snap/lxd/common/lxd/storage-pools/default
ID 420 gen 2712846 top level 5 path snap/lxd/common/lxd/storage-pools/default
ID 421 gen 2712846 top level 420 path containers
ID 422 gen 2712961 top level 420 path containers-snapshots
ID 423 gen 2712846 top level 420 path images
ID 424 gen 2712846 top level 420 path custom
ID 425 gen 2712846 top level 420 path custom-snapshots
ID 510 gen 2643133 top level 422 path containers-snapshots/ntp-backup/working
And the subvolume is in read-only mode:
% sudo btrfs subvolume show /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
Name: working
UUID: 6e3f2cbf-14fc-2742-8ff5-a24ce137966a
Parent UUID: a2159e66-f131-6c46-907f-7958cce550a2
Received UUID: -
Creation time: 2019-05-15 09:37:15 +0200
Subvolume ID: 510
Generation: 2643133
Gen at creation: 2643133
Parent ID: 422
Top level ID: 422
Flags: readonly
I guess something terribly wrong happened when some snapshots were deleted.
No, the problem is that I wrote ‘btrfs subvolume delete’ and assumed that you would replace ‘list’ with ‘delete’ in the command I gave you. Instead you ran btrfs subvolume delete directly and omitted the nsenter command. This nsenter stuff is essential with the snap, since in this case the storage is mapped only for the lxd process, not your user processes. So rerun the btrfs subvolume delete with the whole nsenter incantation and it should work better.
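Something like this (just a sketch, combining the nsenter prefix from the subvolume list command above with the snapshot path; adjust if your pool path differs):
sudo nsenter -t $(pgrep daemon.start) -m -- /snap/lxd/current/bin/btrfs subvolume delete /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working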
I’m not sure what is happening here. I’d have expected that by running sudo nsenter you would have inherited the root powers of the lxd process. Maybe try deleting ntp-backup directly? Or even add another sudo before the btrfs command?
% sudo nsenter -t $(pgrep daemon.start) -m -- sudo /snap/lxd/current/bin/btrfs subvolume delete /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
[sudo] password for clement:
sudo: unable to stat /etc/sudoers: No such file or directory
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin
Removing data directly returns a bunch of error messages
Got it, I think.
sudo nsenter -t $(pgrep daemon.start) -m -- ls -l /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup
should show you that the snapshot has a ‘+’ displayed, indicating that it has an ACL set. I think that using getfacl and setfacl -b should get you to the light (do not forget to use nsenter).
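For instance (only a sketch; this assumes getfacl and setfacl are actually reachable inside the snap’s mount namespace, which I have not verified):
sudo nsenter -t $(pgrep daemon.start) -m -- getfacl /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
sudo nsenter -t $(pgrep daemon.start) -m -- setfacl -b /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working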
Does it work? If yes, is btrfs subvolume delete still returning a permission error? If so, the storage is probably mounted read-only and the permission error is a bad error message.
Maybe try btrfs scrub then (still with nsenter, of course).
Or possibly restart lxd with sudo snap restart lxd. Maybe it’s as simple as that (it would be a bug, of course). Try this first.
% sudo nsenter -t $(pgrep daemon.start) -m -- chmod g+rw,o+rw /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
chmod: changing permissions of '/var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working': Read-only file system
btrfs scrub didn’t return any errors:
% sudo nsenter -t $(pgrep daemon.start) -m -- /snap/lxd/current/bin/btrfs scrub start -B /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/ntp-backup/working
WARNING: cannot create scrub data file, mkdir /var/lib/btrfs failed: Read-only file system. Status recording disabled
WARNING: failed to open the progress status socket at /var/lib/btrfs/scrub.progress.de67eea8-b6fc-40c8-bc0e-55f293f9577e: No such file or directory. Progress cannot be queried
scrub done for de67eea8-b6fc-40c8-bc0e-55f293f9577e
scrub started at Wed May 22 09:31:01 2019 and finished after 00:00:30
total bytes scrubbed: 4.05GiB with 0 errors
So, I restarted lxd
% sudo snap restart lxd
Restarted.
I tried to delete the volume again from lxd, without success:
% lxc storage volume delete default ntp-backup/working
Error: No such object
% lxc storage volume list default
+----------------------+---------------------+-------------+---------+----------+
|         TYPE         |        NAME         | DESCRIPTION | USED BY | LOCATION |
+----------------------+---------------------+-------------+---------+----------+
| container (snapshot) | ntp-backup/working  |             | 1       | server   |
+----------------------+---------------------+-------------+---------+----------+
| container (snapshot) | template/2019051201 |             | 1       | server2  |
+----------------------+---------------------+-------------+---------+----------+
I did the same on server2 for the other snapshot. The subvolumes are indeed gone, but they are still listed in the lxd storage pool, so I cannot delete the default storage:
% lxc storage delete default
Error: storage pool "default" has volumes attached to it
I tried, but it doesn’t update the storage status. I also tried stopping lxd on both servers at the same time, then starting it again, but that’s not working either.
oh, yuck. I’m pretty sure that it worked for me.
At this point, I am at a loss for rational answers. Maybe restart the computers? Or try being a bit Conan-the-Barbarian with lxc storage edit default???
Conan-the-Barbarian it is. I exported all my containers, removed lxd on both servers, and removed all the subvolumes and the /var/snap/lxd folders. I also removed the partition corresponding to my second storage pool. I reinstalled lxd without defining a storage pool and added my own afterward. It is OK now. I don’t know what happened.
For the record, looking at the LXD code, I think that this was not bad but not sufficient:
// Delete the mountpoint.
if shared.PathExists(customSubvolumeName) {
    err = os.Remove(customSubvolumeName)
    if err != nil {
        return err
    }
}

// Remove the volume entry from the cluster database.
err = s.s.Cluster.StoragePoolVolumeDelete(
    "default",
    s.volume.Name,
    storagePoolVolumeTypeCustom,
    s.poolID)
if err != nil {
    logger.Errorf(`Failed to delete database entry for BTRFS storage volume "%s" on storage pool "%s"`, s.volume.Name, s.pool.Name)
}
Deleting the mountpoint and the sqlite database entry (when using a cluster, as is the case for you) were necessary as well. More difficult than I thought; maybe it worked for me because I’m not using clusters.
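If someone else hits the same wall, a less drastic option might be to remove the stale rows directly from LXD’s database with lxd sql global. This is only a sketch under my assumptions about the schema (the storage_volumes table and its name format may differ between versions), so back up the database and double-check which row you are removing first:
lxd sql global "SELECT * FROM storage_volumes;"
lxd sql global "DELETE FROM storage_volumes WHERE id=<id of the stale row>;"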