Can't create VM instance on a thinpool (Main and backup partition tables differ!)

Hi,

I’ve encountered a strange problem with Incus 6.10.1 after an upgrade. I think it’s probably related to my setup, because I cannot find anything similar here, in the issues, or via search engines. Basically, I can’t create VM instances like this:

$ incus launch images:alpine/3.20/cloud testthin0 -s thin0 -c limits.memory=2GiB -c limits.cpu=8 --vm

I get this output:

Launching testthin0
Error: Failed instance creation: Failed creating instance from image: Failed to run: /usr/sbin/sgdisk --move-second-header /dev/vg0/images_6dc6e0522da81db5c254e14e038860636e1e26edd3e6748d4a5f1d32df7c16aa.block: exit status 4 (Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! One or more CRCs don't match. You should repair the disk!
Main header: OK
Backup header: OK
Main partition table: ERROR
Backup partition table: ERROR

Aborting write operation!)
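If anyone wants to reproduce the failing check without risking a write, I believe sgdisk's verify mode walks the same GPT/CRC validation that --move-second-header performs before writing (device path copied from the error message above):

```shell
# Read-only GPT check on the cached image volume; should report the
# same "partition tables differ" / CRC warnings as the error above.
sudo sgdisk --verify /dev/vg0/images_6dc6e0522da81db5c254e14e038860636e1e26edd3e6748d4a5f1d32df7c16aa.block
```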

I have four different storage pools:

  • thin0 - thinpool storage on 2xNVMe drives in RAID 1
  • thin1 - thinpool storage on 4xSATA SSD in RAID10
  • pool0 - btrfs storage on the NVMe drives
  • pool1 - btrfs storage on the SATA drives

I can create VMs just fine on both btrfs pools. The error is returned for both thinpools. I tested multiple images with the same result. Image storage is on pool0. I tested pool1 to check whether the drives are affected, but I get the same problem.
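In case it helps anyone compare, these are roughly the commands I'd use to diff the working and failing pools (pool names from the list above; vg0 is the volume group from the error message):

```shell
incus storage show thin0   # failing LVM thin pool
incus storage show pool0   # working btrfs pool
sudo lvs -a vg0            # LVM's own view of the thin pool and its volumes
```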

I have no idea how to debug this. It seems like the image is somehow broken, but I can’t see why.

The issue is on my home server running Fedora Server. I have a notebook with a similar setup (thinpool storage, same version of Incus, Fedora Desktop) and everything works just fine there. That leads me to the conclusion that the problem is in my setup on the home server.

Can I please ask you to help me debug this issue?

Might be worth checking if it’s a 512 vs 4096 sector size issue with the LVM pool maybe?

I don’t think it’s that. Both kinds of drives are formatted with 512-byte sectors.
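For anyone who wants to double-check their own drives, something like this should show the logical and physical sector sizes (the device name in the comment is just an example):

```shell
# Report logical and physical sector sizes for every block device:
lsblk -o NAME,LOG-SEC,PHY-SEC

# For a single drive, e.g.:
#   sudo blockdev --getss --getpbsz /dev/nvme0n1
```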


The thing is that everything was fine until a recent system update, which included Incus 6.10.1.

Just an update: I tested my notebook again, which has, as I mentioned, almost the same software setup. The VM instance can be created, but it won’t boot. I think the issue is related either to Fedora or to the latest kernel, 6.13.

Are you getting any errors in dmesg or journalctl?

Before answering, I tried going back to kernel 6.12.7, and the issue remained.

I also tried creating a new thinpool (in the remaining space of one of the RAID arrays), with the same result.

Then I tried setting image.os to windows to disable IOMMU (not sure it was even enabled); same result.

And one note: one of the existing VMs, which was perfectly fine before, didn’t boot up after the issue occurred. The same behaviour as on my notebook, just stuck during boot.

Some errors appear in the Incus log:

The errors from 18:27 occurred when I tried to run:

incus launch images:debian/bookworm/cloud testthin0 -s thin0 -c limits.memory=2GiB -c limits.cpu=8 --vm -c security.secureboot=false

Today, without any change to the config, hardware, or system, it started behaving the same as on my notebook. The VM instance is created but refuses to boot. The boot process is stuck here:

Isn’t it possible the images are broken in some way? The only change is the updated Debian image, downloaded this afternoon:

The same image is working fine on Debian 12 + Incus 6.9. I noticed a different logo during boot:

Isn’t it possible there is a different BIOS image in Fedora? Something incompatible with whatever Incus has in its images?

No errors in dmesg:

Fedora is going to have its own EDK2 (firmware) and QEMU build, so that may be different from the ones in the Zabbly repository.
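One way to see what's actually in play might be to compare the firmware/QEMU builds on both setups (package names below are my guess for Fedora; adjust for your distro):

```shell
# Installed firmware and QEMU packages on Fedora:
rpm -q edk2-ovmf qemu-kvm

# What Incus itself reports as its driver version:
incus info | grep -i driver
```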

So I did a few experiments and “fixed” it on both of my Fedora systems. I had Incus from the ganto/lxc4 repo:

https://copr.fedorainfracloud.org/coprs/ganto/lxc4/

I switched to Incus from Fedora’s main repo, which is version 6.8. It still didn’t work, but I noticed that if I delete all images from all projects and let Incus download them again, it starts working as before.
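For the record, the workaround was roughly this (the fingerprint is a placeholder — take it from your own image list, and repeat per project):

```shell
incus image list                  # list cached images (per project; switch with `incus project switch`)
incus image delete <fingerprint>  # delete each cached image
incus launch images:alpine/3.20/cloud testthin0 -s thin0 --vm   # re-download and test
```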

So I think Incus 6.10 treats images differently when importing them into local storage, and as a result those images are not compatible with whatever is specific to Fedora.

The deleted images and the newly downloaded images had the same fingerprints.

I would like to reopen this thread because the problem is back, and I think something is wrong with the VM images or the LVM thinpool driver.

If you use a thinpool as VM image storage, can you please try this command?

incus launch images:debian/12/cloud testvm -s thin0 -c limits.cpu=2 -c limits.memory=8GiB -d root,size=128GiB --vm

And check whether your VM boots up?

I can’t see this issue with Btrfs pool.

What I also noticed is that incus export successfully exported an instance, but when I imported it on another machine the disk image was unbootable and the data damaged. I would say it’s a hardware problem, but I experience one issue or the other on five different machines running Fedora and Debian. Both issues always occur on LVM thinpools.
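One way to narrow down where the corruption happens might be to checksum the export tarball on both ends before importing (instance name and paths below are just examples):

```shell
# On the source machine:
incus export myinstance /tmp/myinstance.tar.gz
sha256sum /tmp/myinstance.tar.gz

# ...copy the file over, then on the target machine:
sha256sum /tmp/myinstance.tar.gz   # should match the source
incus import /tmp/myinstance.tar.gz
```

If the checksums match but the imported disk is still damaged, that would point at the import/unpack step onto the thinpool rather than at the export or the transfer.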

One more thing I noticed: I tried running incus launch multiple times in a row and ended up with a running instance that was stuck in the boot sequence.

I think Incus somehow damages data in the disk image while importing it onto LVM thinpools. I have TBs of data on instances created in the past, and there is no issue with them. Data always gets damaged when Incus imports it into a fresh thinpool volume, whether via the import command or the launch command. I am pretty sure that if I install the system from an ISO it will be fine.

Update: I installed Debian on a new VM instance with no issues.

I think I found a solution. I created those thinpools manually, and it seems that zeroing needs to be enabled. I guess Incus enables it for the thinpools it creates itself.

I haven’t tested it much but this did the trick for my initial runs:

lvchange -Z y vgname/ThinPool
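For anyone who wants to check their own manually created pools, the zeroing flag should be visible via lvs — for thin pools, the 8th lv_attr character is 'z' when newly allocated blocks are zeroed (volume group and pool names below are placeholders):

```shell
# Check the zeroing flag: look for 'z' as the 8th lv_attr character,
# e.g. "twi-aotz--" means zeroing is on, "twi-aot---" means it is off.
sudo lvs -o lv_name,lv_attr vgname

# Or enable zeroing at creation time instead of via lvchange:
sudo lvcreate -L 100G -Z y --thinpool ThinPool vgname
```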