Error removing LVM logical volume

That’s annoying as that doesn’t get us any real history to see what happened.

It shows an issue with namespace management which then caused over-mounting…
We’ve seen this on and off but haven’t yet found a reliable reproducer, which makes it nearly impossible to track down.

I have 9 nodes in this project and could find it on 2 of them.
One was rebooted, 2 were not affected, and one is the node whose output I just posted.

I will check later whether I can find an affected container on the remaining ones; I am pretty sure that I will.

All of them use the same LVM backend setup, but according to other posts it may not be related to LVM, since one user reported it with ZFS.

Yeah, this issue has mostly been seen on ZFS, likely because we have far more ZFS users than LVM.

Our best guess is that it involves a system with running containers, combined with an update to some of the core snaps and an LXD snap update. Some combination of these then results in a mount namespace configuration which our reshuffling tool can’t deal with, resulting in the error we can see in your journal.

Unfortunately fixing this properly will require us being able to reproduce this issue at will so we can significantly increase the debugging in the reshuffling tool as well as take detailed dumps of all mount tables at play.
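For anyone who hits this and wants to capture the mount tables before rebooting, here is a rough sketch of what such a dump could look like. It is not an official LXD tool: the function name and output directory are made up for this example, and it has to run as root to see other processes' namespaces. It writes one `mountinfo` file per distinct mount namespace.

```shell
#!/bin/sh
# Sketch: save one mountinfo dump per distinct mount namespace.
# dump_mount_tables and the default output directory are
# hypothetical names chosen for this example.
dump_mount_tables() {
    outdir="${1:-./mount-dumps}"
    mkdir -p "$outdir" || return 1
    for p in /proc/[0-9]*; do
        pid="${p#/proc/}"
        # readlink prints something like "mnt:[4026531840]";
        # it fails for processes we are not allowed to inspect.
        ns="$(readlink "$p/ns/mnt" 2>/dev/null)" || continue
        [ -n "$ns" ] || continue
        id="$(printf '%s' "$ns" | tr -cd '0-9')"
        dest="$outdir/mnt-$id.mountinfo"
        # One process per namespace is enough; skip the rest.
        [ -e "$dest" ] && continue
        { echo "# pid=$pid ns=$ns"; cat "$p/mountinfo"; } > "$dest" 2>/dev/null
    done
}
```

Calling `dump_mount_tables /root/mount-dumps` on an affected node, ideally before applying any workaround, should leave a directory that can be attached to a bug report.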

In your case, a workaround for that one system would likely be:

  • nsenter -t 1471 -m umount -l /var/snap/lxd/common/shmounts
  • nsenter -t 1471 -m umount -l /var/snap/lxd/common/shmounts
  • nsenter -t 1471 -m umount -l /var/snap/lxd/common/shmounts/storage-pools/primary/containers/lxc14fc3901

That would undo the two levels of overmounting and then unmount the hidden mount.
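Before and after running those commands, it may help to check how many mounts are actually stacked on that path. The helper below is only a sketch (`count_mounts_at` is a made-up name): it reads `mountinfo`-formatted text on stdin and counts the entries whose mount point, the fifth field, exactly matches the given path.

```shell
#!/bin/sh
# Count how many mounts sit on exactly one mount point.
# Input: mountinfo-formatted text on stdin (field 5 is the mount point).
# count_mounts_at is a hypothetical helper name for this example.
count_mounts_at() {
    awk -v target="$1" '$5 == target { n++ } END { print n + 0 }'
}

# Usage on the affected system (PID 1471 is the one from the commands above):
#   count_mounts_at /var/snap/lxd/common/shmounts < /proc/1471/mountinfo
```

More than one entry for the same mount point means mounts are stacked there, so the count should go down as each `umount -l` is applied.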

So basically the workaround would be to disable all LXD-related auto-updates through snapd.
Then reboot after each snapd update and subscribe to your “newsletter” so we know when and what to patch.

I found a few more nodes:

Also, journalctl seems to be broken.
journalctl -u snap.lxd.daemon -S 2021-9-1
Returns data from September 2021, but not 500 lines.

journalctl -u snap.lxd.daemon -n 500
Returns data from May 2021, and also not 500 lines.

There seems to be more log data, but I can’t get it printed.
I know there is data from September 29, and it doesn’t show it to me.

-r for reverse does give me the current entries, but never 500 lines, more like a hard-coded 50.
I would like to give you the data, but the system doesn’t give it to me.
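One thing that might be worth trying is asking for an explicit time window with the pager disabled, so that neither the pager nor the `-n` limit gets in the way. This is only a sketch using standard journalctl options; the wrapper name `lxd_journal_window` is made up, and it will not help if the entries have already been rotated out of the journal.

```shell
#!/bin/sh
# Print LXD daemon log entries in a fixed time window, without the pager.
# $1 = start, $2 = end, in any format journalctl accepts for --since/--until.
# lxd_journal_window is a hypothetical wrapper name for this example.
lxd_journal_window() {
    journalctl -u snap.lxd.daemon --since "$1" --until "$2" --no-pager -o short-iso
}

# Example: lxd_journal_window 2021-09-29 2021-09-30 | wc -l
#
# Related checks with standard journalctl options:
#   journalctl --list-boots    # which boots the journal still covers
#   journalctl --disk-usage    # how much journal data is retained
#   journalctl --verify        # consistency check of the journal files
```

If the window for September 29 comes back empty while `--list-boots` still covers that date, that would at least narrow down where the entries went missing.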

I found this on way more machines than expected and have rebooted nearly everything by now.
snap is a plague, please add support for deb packages.

Found another one, despite running manual updates.
I blocked snapd and ran manual updates every now and then, but snap does not seem to be the cause, since the issue still appears. Looks like my assumption was wrong.