We have one container that we try to snapshot and then export.
It is not working; the ext4 filesystem of the snapshot seems to have problems.
We have tried many times (with the container both running and stopped).
Below are the commands and the error we are getting.
incus snapshot create CONTAINERNAME snapforbackup
incus export CONTAINERNAME /mnt/foo/CONTAINERNAME.tar --compression=none
Error: Create backup: Backup create: Failed to mount LVM snapshot volume: Failed to mount "/dev/default/containers_CONTAINERNAME-snapforbackup" on "/var/lib/incus/storage-pools/default/containers-snapshots/CONTAINERNAME/snapforbackup" using "ext4": structure needs cleaning
If we look at the dmesg output while the incus export is running, we see many messages like the ones below:
kernel: EXT4-fs (dm-7): recovery complete
kernel: EXT4-fs error (device dm-7): ext4_mark_recovery_complete:6249: comm incusd: Orphan file not empty on read-only fs.
kernel: EXT4-fs (dm-7): mount failed
kernel: EXT4-fs (dm-7): write access unavailable, skipping orphan cleanup
kernel: EXT4-fs (dm-7): recovery complete
kernel: EXT4-fs error (device dm-7): ext4_mark_recovery_complete:6249: comm incusd: Orphan file not empty on read-only fs.
kernel: EXT4-fs (dm-7): mount failed
kernel: EXT4-fs (dm-7): write access unavailable, skipping orphan cleanup
kernel: EXT4-fs (dm-7): recovery complete
kernel: EXT4-fs error (device dm-7): ext4_mark_recovery_complete:6249: comm incusd: Orphan file not empty on read-only fs.
kernel: EXT4-fs (dm-7): mount failed
We are using LVM storage.
Is this a known issue?
Any ideas on how to fix it?
This looks like an issue with your LV or underlying disk.
First you’ll want to make sure that your VG still has some free space available, as running out of space can cause issues like these.
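As a quick way to check, something like the commands below should do (assuming the VG is named "default", as suggested by the /dev/default/… path in the error message; adjust the name to your setup):

```shell
# Show the VG summary; the VFree column is the remaining free space.
vgs default

# If the pool is thin-provisioned, also check the data and metadata
# usage percentages of the thin pool LV (Data% / Meta% columns).
lvs -a default
```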
If it has plenty of free space, then you’ll likely need to stop the container, manually activate the LV, and run fsck.ext4 against it. That may find some issues which will need to be fixed, at which point creating a new snapshot and exporting should work.
It’s not impossible that your LV is currently in good shape and you just had some bad luck with snapshot timing, basically creating the snapshot while some data was in flight, but LVM has some logic that usually prevents that from happening.
We have plenty of space on the VG (first thing we checked).
We ran fsck -fy on the volume (incus stop CONTAINERNAME; lvchange -ay -Ky default/containers_CONTAINERNAME; fsck -fy /dev/default/containers_CONTAINERNAME).
We get the same errors afterwards on retry.
(Actually, my first message here was posted after fsck-ing the LV of the snapshot.)
What we did, and what seems to be working so far without this problem, is the following:
I wonder if ext4 somehow introduced a property similar to XFS, where it effectively prevents two concurrent mounts of the same filesystem UUID even if they’re distinct superblocks, though if that were the case, you’d really expect a better error…
Can you try incus copy CONTAINERNAME/SNAPSHOT CONTAINERNAME-COPY and then incus export CONTAINERNAME-COPY? Basically, the goal is to see whether the snapshot becomes fine when it is actually copied into a new instance and can be written to.
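Spelled out against the names used earlier in the thread, the test would look something like this (the snapshot name "snapforbackup" and the /mnt/foo export path are taken from the original commands; the -COPY suffix is just a placeholder):

```shell
# Copy the existing snapshot into a brand-new standalone instance.
incus copy CONTAINERNAME/snapforbackup CONTAINERNAME-COPY

# Then try exporting the copy instead of the original container.
incus export CONTAINERNAME-COPY /mnt/foo/CONTAINERNAME-COPY.tar --compression=none

# Clean up the test instance afterwards if the export succeeds.
incus delete CONTAINERNAME-COPY
```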