LXD snapshot not deleted after expiry date

Hi,

On some of my containers, none of the snapshots are deleted automatically after their expiry date (I can still delete them manually). In a bash script, I set the expiry with /snap/bin/lxd.lxc config set c1 snapshots.expiry "7d" before taking the snapshots.
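For reference, the relevant part of the script boils down to the following (the container name c1 and the date-based snapshot naming are illustrative):

```shell
#!/bin/sh
# Configure new snapshots of c1 to expire 7 days after creation.
# Note: snapshots.expiry only applies to snapshots taken after it is
# set; existing snapshots keep the expiry they were created with.
/snap/bin/lxd.lxc config set c1 snapshots.expiry "7d"

# Take a date-stamped snapshot; it inherits the 7d expiry.
/snap/bin/lxd.lxc snapshot c1 "c1-$(date +%d-%m-%Y)"

# The "EXPIRES AT" column should now show a date 7 days out.
/snap/bin/lxd.lxc info c1
```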

The expiry date of each snapshot is shown in the "EXPIRES AT" column, and the command lxc config get c1 snapshots.expiry also returns the correct value.

The strange thing is that another server running the same LXD version has no such problem.

Also note that there is a snapshot of a container that I can’t delete manually with lxc delete c1/c1-09-04-2022. Here is the error:

Error: Failed setting subvolume writable "/var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/c1/c1-09-04-2022": Failed to run: btrfs property set -ts /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/c1/c1-09-04-2022 ro false: ERROR: Could not set subvolume flags: Read-only file system
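In case it helps, here is a way to check whether it is only the subvolume flag or the whole filesystem that is read-only (paths copied from the error above):

```shell
# Check the read-only property on the snapshot subvolume itself:
sudo btrfs property get -ts \
  /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/c1/c1-09-04-2022 ro

# "Read-only file system" in the error suggests the mount, not just the
# subvolume, may be read-only; inspect the pool's mount options:
findmnt -T /var/snap/lxd/common/lxd/storage-pools/default -no OPTIONS
```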

Thank you in advance for your help.

LXD version (snap): 5.0.0
Host OS: Debian 10
Container OS: Debian 10/11

Are you seeing any error in lxd.log?
I wonder if it’s trying to delete that same snapshot, fails and then never gets to delete anything else.

No, I hadn't looked. Here are its contents:

time="2022-04-20T04:59:06Z" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
time="2022-04-20T04:59:06Z" level=warning msg=" - Couldn't find the CGroup hugetlb controller, hugepage limits will be ignored"
time="2022-04-20T04:59:06Z" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
time="2022-04-20T04:59:09Z" level=warning msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to initialize fanotify: invalid argument"
time="2022-04-30T11:59:29Z" level=warning msg="Transaction timed out. Retrying once" err="failed to begin transaction: context deadline exceeded" member=1

Okay, nothing very obvious there.

Can you run lxc monitor --type=logging --pretty and wait a couple of minutes?

That should guarantee a run of the expire logic and will get you all the debug output should something fail then.

That seems to be the problem you described:

DEBUG [2022-05-02T14:12:13Z] Event listener server handler started listener=13d7ab49-50e3-42ca-9bcb-89b8a2fc7ca9 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG [2022-05-02T14:12:23Z] New task Operation: 0688843b-d0b4-44d1-b022-ccb74f01ddaa
INFO [2022-05-02T14:12:23Z] Deleting container created="2022-04-09 03:42:08.450444473 +0000 UTC" ephemeral=false instance=mysql8/mysql8-09-04-2022 instanceType=container project=default used="0001-01-01 00:00:00 +0000 UTC"
INFO [2022-05-02T14:12:23Z] Pruning expired instance snapshots
DEBUG [2022-05-02T14:12:23Z] Started task operation: 0688843b-d0b4-44d1-b022-ccb74f01ddaa
INFO [2022-05-02T14:12:23Z] Done pruning expired instance snapshots
DEBUG [2022-05-02T14:12:23Z] DeleteInstanceSnapshot started instance=mysql8/mysql8-09-04-2022 project=default
DEBUG [2022-05-02T14:12:23Z] Deleting instance snapshot volume instance=mysql8/mysql8-09-04-2022 project=default snapshotName=mysql8-09-04-2022 volName=mysql8
DEBUG [2022-05-02T14:12:23Z] DeleteInstanceSnapshot finished instance=mysql8/mysql8-09-04-2022 project=default
DEBUG [2022-05-02T14:12:23Z] Failure for task operation: 0688843b-d0b4-44d1-b022-ccb74f01ddaa: Failed to delete expired instance snapshot "mysql8/mysql8-09-04-2022" in project "default": Failed setting subvolume writable "/var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/mysql8/mysql8-09-04-2022": Failed to run: btrfs property set -ts /var/snap/lxd/common/lxd/storage-pools/default/containers-snapshots/mysql8/mysql8-09-04-2022 ro false: ERROR: Could not set subvolume flags: Read-only file system

“mysql8” is the container I named “c1” before.

So the new question is: how do I delete this snapshot to unblock the automatic deletion?

Are you getting errors in dmesg? The btrfs call should normally work, so this may suggest some kind of btrfs corruption…
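A sketch of what to look at (the pool path is assumed from the error earlier in the thread):

```shell
# Recent kernel messages from btrfs; I/O errors or a forced read-only
# remount would show up here:
sudo dmesg -T | grep -i btrfs | tail -n 20

# Whether the pool's filesystem was remounted read-only:
findmnt -T /var/snap/lxd/common/lxd/storage-pools/default -no OPTIONS

# Device-level error counters for the pool:
sudo btrfs device stats /var/snap/lxd/common/lxd/storage-pools/default
```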

Good news: the automatic snap update to LXD 5.1 fixed my problem. It's frustrating not knowing exactly what caused it, but there is likely a bug in version 5.0.

Thanks for your help @stgraber, and if anyone knows the root cause, I would be interested; it might help others who hit the same problem.


This was likely fixed in 5.1 via the series of fixes around optimized BTRFS refresh:

Thanks!