Virtual machine on LVM unable to boot

Hello,

I’m currently testing the LVM driver on my local homelab server. Containers work fine, but virtual machines do not work at all: the VM is unable to boot. I’m using the official images from the official image server.

On this server I also have a ZFS pool on another disk, and virtual machines on that pool boot without any issues.

$ incus launch images:ubuntu/22.04 v1 --vm --console -s lvmp01 -c security.secureboot=false
Launching v1
Retrieving image: Unpack: 100% (3.09GB/s)
BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

>>Start PXE over IPv4.

Same issue with Ubuntu 24.04, Fedora 40 or Debian 12 image.

Here is the command I use for creating the LVM pool:

incus storage create lvmp01 lvm \
  source=/dev/nvme0n1 \
  lvm.vg_name=incus-nvme0n1 \
  lvm.thinpool_name=lvmp01 \
  lvm.thinpool_metadata_size=1GiB \
  volume.block.filesystem=ext4 \
  volume.block.mount_options="noatime,discard"

Here is some information about my setup:

  • OS: Ubuntu 22.04.4
  • Incus version: 6.2 (package from Zabbly repository)
  • Kernel version: 6.5.0-41 (HWE kernel)

Am I missing something? Any idea where I should look?

Thanks!

I’m not seeing any obvious problem there. The most likely issue would be related to the LV resizing prior to the instance starting up but I’m not sure how that would lead to the EFI partition not being discoverable.

Maybe look for errors in dmesg, post the full output of incus config show --expanded v1 and of lvs, and try another image, maybe something tiny like images:alpine/edge, to see if that behaves the same?

Hello @stgraber,

Thanks for your message.

As for testing with Alpine, sadly I get exactly the same error. It seems the failure happens before the system is able to boot (in the QEMU EFI firmware?).

Here is the output from some tools:

# dmesg when launching a VM
[10461.814461] virbr0: port 1(tapdfc79799) entered disabled state
[10461.912896] EXT4-fs (dm-10): unmounting filesystem 9df8bd1b-de18-4548-9feb-497ade472c15.
[10462.287282] audit: type=1400 audit(1720294564.958:169): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="incus-consul_v1_</var/lib/incus>" pid=7844 comm="apparmor_parser"
[10469.616966] EXT4-fs (dm-9): mounted filesystem 9df8bd1b-de18-4548-9feb-497ade472c15 r/w with ordered data mode. Quota mode: none.
[10469.620277] EXT4-fs (dm-9): unmounting filesystem 9df8bd1b-de18-4548-9feb-497ade472c15.
[10470.115021] EXT4-fs (dm-9): mounted filesystem 9df8bd1b-de18-4548-9feb-497ade472c15 r/w with ordered data mode. Quota mode: none.
[10470.119131] EXT4-fs (dm-9): unmounting filesystem 9df8bd1b-de18-4548-9feb-497ade472c15.
[10470.655318] EXT4-fs (dm-10): mounted filesystem 9df8bd1b-de18-4548-9feb-497ade472c15 r/w with ordered data mode. Quota mode: none.
[10470.675579] virbr0: port 1(tap9ad7fc97) entered blocking state
[10470.675584] virbr0: port 1(tap9ad7fc97) entered disabled state
[10470.675599] tap9ad7fc97: entered allmulticast mode
[10470.675655] tap9ad7fc97: entered promiscuous mode
[10470.737059] audit: type=1400 audit(1720294573.406:170): apparmor="STATUS" operation="profile_load" profile="unconfined" name="incus-consul_v1_</var/lib/incus>" pid=7967 comm="apparmor_parser"
[10470.769620] audit: type=1400 audit(1720294573.442:171): apparmor="DENIED" operation="open" class="file" profile="incus-consul_v1_</var/lib/incus>" name="/proc/sys/vm/max_map_count" pid=7968 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[10470.844127] virbr0: port 1(tap9ad7fc97) entered blocking state
[10470.844133] virbr0: port 1(tap9ad7fc97) entered forwarding state

$ incus config show --expanded v1
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20240706_07:42)
  image.os: Ubuntu
  image.release: jammy
  image.serial: "20240706_07:42"
  image.type: disk-kvm.img
  image.variant: default
  limits.kernel.nofile: "65536"
  limits.processes: "4096"
  security.guestapi: "true"
  security.idmap.isolated: "true"
  security.secureboot: "false"
  security.syscalls.intercept.sysinfo: "true"
  volatile.base_image: 4c50a603e2f2cbc9ede5a4876b20c4962b0b8003b45c207ca4f51c902359e483
  volatile.cloud-init.instance-id: 83f748c5-6726-4883-aa25-cf4caf2a2802
  volatile.eth0.host_name: tap0e7e3cf1
  volatile.eth0.hwaddr: 00:16:3e:0b:77:4a
  volatile.last_state.power: RUNNING
  volatile.uuid: 99df7068-e326-420b-8721-39f90a317ac3
  volatile.uuid.generation: 99df7068-e326-420b-8721-39f90a317ac3
  volatile.vsock_id: "604752454"
devices:
  eth0:
    name: eth0
    network: virbr0
    type: nic
  root:
    path: /
    pool: lvmp01
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

$ sudo lvs -a
  LV                                                                            VG            Attr       LSize    Pool   Origin                                                                        Data%  Meta%  Move Log Cpy%Sync Convert
  images_2e3856614c34ee6bc4d7bbe73e4b6c449c5af830aad746b8db1c097f9c4e916d       incus-nvme0n1 Vwi---tz-k  500.00m lvmp01                                                                                                                      
  images_2e3856614c34ee6bc4d7bbe73e4b6c449c5af830aad746b8db1c097f9c4e916d.block incus-nvme0n1 Vwi---tz-k   10.00g lvmp01                                                                                                                      
  images_4c50a603e2f2cbc9ede5a4876b20c4962b0b8003b45c207ca4f51c902359e483       incus-nvme0n1 Vwi---tz-k  500.00m lvmp01                                                                                                                      
  images_4c50a603e2f2cbc9ede5a4876b20c4962b0b8003b45c207ca4f51c902359e483.block incus-nvme0n1 Vwi---tz-k   10.00g lvmp01                                                                                                                      
  images_8d12ed56467ff120c2a6f9bc286c67bf2a1fde6a2d51c50858ddcd5f5f679c9f       incus-nvme0n1 Vwi---tz-k  500.00m lvmp01                                                                                                                      
  images_8d12ed56467ff120c2a6f9bc286c67bf2a1fde6a2d51c50858ddcd5f5f679c9f.block incus-nvme0n1 Vwi---tz-k   10.00g lvmp01                                                                                                                      
  lvmp01                                                                        incus-nvme0n1 twi-aotz-- <951.87g                                                                                      1.28   2.26                            
  [lvmp01_tdata]                                                                incus-nvme0n1 Twi-ao---- <951.87g                                                                                                                             
  [lvmp01_tmeta]                                                                incus-nvme0n1 ewi-ao----    1.00g                                                                                                                             
  [lvol0_pmspare]                                                               incus-nvme0n1 ewi-------    1.00g                                                                                                                             
  virtual-machines_consul_v1                                                    incus-nvme0n1 Vwi-aotz-k  500.00m lvmp01 images_4c50a603e2f2cbc9ede5a4876b20c4962b0b8003b45c207ca4f51c902359e483       9.73                                   
  virtual-machines_consul_v1.block                                              incus-nvme0n1 Vwi-aotz-k   10.00g lvmp01 images_4c50a603e2f2cbc9ede5a4876b20c4962b0b8003b45c207ca4f51c902359e483.block 40.00                                  
  home                                                                          system        -wi-ao----    5.00g                                                                                                                             
  incus                                                                         system        -wi-ao----   50.00g                                                                                                                             
  logs                                                                          system        -wi-ao----   10.00g                                                                                                                             
  root                                                                          system        -wi-ao----   20.00g                                                                                                                             
  swap                                                                          system        -wi-ao----   16.00g 

The mapping in EFI:

Shell> map -r
Mapping table
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)

Thanks

Pretty weird, not seeing anything obviously wrong there, but then UEFI doesn’t see any partitions on the disk so that’s definitely a problem…

Maybe try running gdisk -l against /dev/incus-nvme0n1/virtual-machines_consul_v1.block once the VM is started (otherwise the LV won’t be active).

Here is the output from gdisk as requested, something looks wrong in the partition listing:

$ sudo gdisk -l /dev/incus-nvme0n1/virtual-machines_consul_v1.block
GPT fdisk (gdisk) version 1.0.8

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/incus-nvme0n1/virtual-machines_consul_v1.block: 2621440 sectors, 10.0 GiB
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): E4B5FD45-A6A5-4EEA-8AB1-937173F2E34E
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 2621434
Partitions will be aligned on 256-sector boundaries
Total free space is 2621429 sectors (10.0 GiB)

Number  Start (sector)    End (sector)  Size       Code  Name

stgraber@dakara:~$ incus storage create lvm lvm
Storage pool lvm created
stgraber@dakara:~$ incus launch images:alpine/edge a1 --vm --storage lvm -c security.secureboot=false
Launching a1
stgraber@dakara:~$ incus exec a1 sh       
~ # 

Can you try doing something like this on your system too to see if you run into the same issue as your more complex LVM setup?

Hello @stgraber,

It’s weird because when using the loop device as backend, it’s working as expected:

$ incus launch images:ubuntu/22.04 v1 --vm --console -s lvmtest -c security.secureboot=false
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)

Ubuntu 22.04.4 LTS v1 ttyS0

v1 login: 

And the gdisk output of the LV:

$ sudo gdisk -l /dev/lvmtest/virtual-machines_consul_v1.block 
GPT fdisk (gdisk) version 1.0.8

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/lvmtest/virtual-machines_consul_v1.block: 20971520 sectors, 10.0 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): C998F5D5-9BD5-4060-849A-12C2E5E5A7C2
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 20971486
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          206847   100.0 MiB   EF00  
   2          206848        20971486   9.9 GiB     8300  
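For comparison, the two gdisk listings describe the same 10 GiB volume, only counted in different sector sizes. The arithmetic below is a quick sanity check of the numbers gdisk printed; the function names are made up for illustration, and the 128-byte entry size is the usual GPT default rather than something shown in the output.

```shell
# Number of sectors in a volume of <size in GiB> for a given logical sector size.
sectors_for() {
    size_gib="$1"
    sector_bytes="$2"
    echo $(( size_gib * 1024 * 1024 * 1024 / sector_bytes ))
}

# Number of sectors occupied by the GPT partition entry array,
# assuming 128 entries of 128 bytes each (16384 bytes in total).
entry_array_sectors() {
    sector_bytes="$1"
    echo $(( 128 * 128 / sector_bytes ))
}

sectors_for 10 4096       # 2621440, as reported for the NVMe-backed LV
sectors_for 10 512        # 20971520, as reported for the loop-backed LV
entry_array_sectors 4096  # 4, i.e. sectors 2 to 5
entry_array_sectors 512   # 32, i.e. sectors 2 to 33
```

Both partition tables are therefore structurally consistent with their own sector size; the difference is that the one on the 4096-byte device lists no partitions.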

I even tried to create the storage pool with the minimum settings possible by just specifying the source disk and it failed:

$ incus storage create lvmp01 lvm source=/dev/nvme0n1
$ incus launch images:ubuntu/22.04 v1 --vm --console -s lvmp01 -c security.secureboot=false

BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

>>Start PXE over IPv4.

Very weird. Can you maybe try:

incus storage create lvm lvm source=/dev/nvme0n1

On the off chance that it’s the VG’s name that’s causing an issue somehow?

Nothing changes, the VM is still unable to boot :confused: I’m running out of ideas too…

I plan to upgrade this machine to Ubuntu 24.04. Maybe this is related to a bug in the LVM tooling with the HWE kernel version I’m running? I can try booting a regular Ubuntu 22.04 kernel and see what happens.

Even with the 5.15 generic kernel, it doesn’t work :frowning:

Just tried with Incus 6.0.1 LTS on the same system (after cleaning up everything), same issue. I will try with a completely clean environment on a fresh Ubuntu 24.04 installation and keep you posted here.

Thanks! This is really really odd as what PVs are in your VGs should really not matter and as far as I can tell, that’s the only difference we’re dealing with here…

This gives me an idea. Before blowing everything up, I will also try a SATA SSD as the PV instead of one of the two NVMe drives. You never know, Linux dark magic may be at work :sweat_smile:

It works correctly with a SATA SSD as PV backend:

$ sudo wipefs -a /dev/sdb
$ incus storage create lvmsata lvm source=/dev/sdb
$ incus launch images:ubuntu/22.04 v1 --vm --console -s lvmsata -c security.secureboot=false
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)

Ubuntu 22.04.4 LTS v1 ttyS0

v1 login:

What the f*ck is this ? :rofl:

So something is definitely wrong with the handling of NVMe drives, but where could it be? I love Linux dark magic :heart_eyes:
I don’t see how Incus could fail here; more likely the Linux tooling is failing somewhere. How does Incus extract the image into the “origin” LV on the storage pool?

If that makes a difference, the NVMe disks are Sabrent Rocket (SB-ROCKET-1TB) drives and the SATA one is a Samsung 870 EVO.

I think I found the culprit for the NVMe drives issue, even if I can’t explain why it doesn’t work.

As NVMe drives can report a different logical block address size, I wondered if changing the size would make a difference.

Here is the configuration I had before changing the LBA format:

$ sudo nvme id-ns -H /dev/nvme0n1
...
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good 
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better (in use)

Then, I changed the LBA format:

$ sudo nvme format --lbaf=0 /dev/nvme0n1
$ sudo nvme id-ns -H /dev/nvme0n1
...
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good (in use)
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better
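Note that nvme format erases the namespace, so this is only an option on a disk whose contents can be thrown away. To check which LBA size is active before and after, the "(in use)" line can be picked out of the id-ns output. The helper below is a small sketch: the name in_use_lba_size is hypothetical, and it assumes the human-readable output keeps the "Data Size: N bytes" layout shown above.

```shell
# Read `nvme id-ns -H` output on stdin and print the data size (in bytes)
# of the LBA format that is marked "(in use)".
in_use_lba_size() {
    awk '/LBA Format/ && /\(in use\)/ {
        for (i = 2; i < NF; i++)
            if ($(i - 1) == "Data" && $i == "Size:")
                print $(i + 1)
    }'
}
```

It would be used as: sudo nvme id-ns -H /dev/nvme0n1 | in_use_lba_size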

Finally I recreated the LVM pool and voilà:

$ incus storage create lvm lvm source=/dev/nvme0n1
$ incus launch images:ubuntu/24.04 v1 --vm --console -s lvm -c security.secureboot=false
Launching v1
To detach from the console, press: <ctrl>+a q
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)

Ubuntu 24.04 LTS v1 ttyS0

v1 login:

EDIT: I reproduced the issue easily with an LVM storage pool on top of a custom loop device with a block size of 4k:

$ sudo truncate -s 20G /path/to/disk.img
$ sudo losetup -fP --show -b 4096 /path/to/disk.img
$ incus storage create lvm2 lvm source=/dev/loop0
$ incus launch images:ubuntu/24.04 v2 --vm --console -s lvm2 -c security.secureboot=false
Launching v2
To detach from the console, press: <ctrl>+a q
BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

>>Start PXE over IPv4.

Something is definitely wrong with LVM on top of a block device with a 4k block size. By default, the loop device created by Incus has a block size of 512 bytes and therefore works fine, as does my SATA SSD (which also has a physical and logical block size of 512 bytes, from what I can see).
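For anyone wanting to check their own devices, the kernel exposes the logical and physical block size of every block device under sysfs. The function below is a minimal sketch that reads those two files; the name block_sizes and the SYSFS_ROOT override are made up here (the override exists only so the function can be tried against a fake directory tree).

```shell
# Print the logical and physical block size of a block device, read from sysfs.
# Usage: block_sizes nvme0n1   (kernel device name, without /dev/)
block_sizes() {
    dev="$1"
    # SYSFS_ROOT is a hypothetical override for testing; the default is the real sysfs.
    queue="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    if [ ! -r "$queue/logical_block_size" ]; then
        echo "no such block device: $dev" >&2
        return 1
    fi
    printf '%s logical=%s physical=%s\n' "$dev" \
        "$(cat "$queue/logical_block_size")" \
        "$(cat "$queue/physical_block_size")"
}
```

The same information is available from lsblk -o NAME,LOG-SEC,PHY-SEC, or from blockdev --getss and blockdev --getpbsz on a single device.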

Hmm, glad you figured it out, but yeah, it’s very very odd. You’d think LVM could handle a mix of 512 and 4096 blocks and provide a consistent working experience :slight_smile:

Yeah, I will “format” my NVMe disks with an LBA size of 512 bytes. That will be enough since it’s mostly a machine for testing.

I still haven’t found why it’s not possible to boot the VM from an LV with a 4k LBA size under the hood. My only theory is that the QCOW2 virtual disk of the root image uses a 512-byte LBA size and that, when it is converted into the origin LV, everything goes wrong. ZFS must handle this differently, which is why I didn’t have the issue there.
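To make that theory a bit more concrete: GPT stores its header at LBA 1 and addresses partitions in sectors, so the same structures sit at different byte offsets depending on the sector size the reader assumes. The sketch below only does that arithmetic. It assumes the image was laid out for 512-byte sectors, as the working listing suggests, and says nothing about what Incus actually does when it writes the image; lba_to_bytes is a made-up helper name.

```shell
# Byte offset of a given LBA for a given logical sector size.
lba_to_bytes() {
    lba="$1"
    sector_bytes="$2"
    echo $(( lba * sector_bytes ))
}

lba_to_bytes 1 512       # 512: where a 512-byte layout stores the GPT header
lba_to_bytes 1 4096      # 4096: where a reader using 4096-byte sectors looks for it
lba_to_bytes 2048 512    # 1048576: start of the EFI partition in the working listing
lba_to_bytes 2048 4096   # 8388608: where "sector 2048" lands with 4096-byte sectors
```

If an image built for 512-byte sectors were copied byte for byte onto a device that reports 4096-byte sectors, the firmware would look for the header and the partitions at the wrong offsets, which would be consistent with the "Not Found" error, though it does not prove that this is what happens here.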