Unable to boot VM after migration from KVM


Hello,

I’m running LXD 5.0.2 with a ZFS storage backend on an Ubuntu 22.04 server. I’m trying to migrate a VM from an old KVM host. I used the bin.linux.lxd-migrate tool as suggested in the LXD documentation, and the migration completes without issues. On the LXD host the VM is created, but it is unable to boot.
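For reference, on the KVM host the migration itself was roughly the following (the binary name is just what the download produced; the tool then asks for the target server, instance name and disk image interactively):

chmod +x bin.linux.lxd-migrate
sudo ./bin.linux.lxd-migrate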
I’ve tried to start it with:
lxc start --console ag6

and console output shows:

BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

The source VM (on the KVM host) is using UEFI firmware. I’ve also mounted the volume of the migrated VM and all the data and partitions seem to be there.

Any ideas what could be wrong?

It could be as simple as the image not having an EFI/bootx64.efi binary.
I’d suggest hitting the boot menu (ESC during the boot delay), then selecting the EFI shell.

From there, check that you have a fs0 or fs1 device, and explore it to find the applicable .efi binary for your distro.
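For example, something along these lines in the EFI shell (the filesystem mapping and path will differ on your system):

Shell> map -r
Shell> fs0:
FS0:\> ls \EFI\
FS0:\> \EFI\ubuntu\grubx64.efi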

Hmm, could be. My VM OS is Ubuntu 16.04 LTS and root is on LVM. I have a separate boot partition, and under the EFI folder I only have ubuntu\grubx64.efi, not bootx64.efi. The strange thing is that the VM boots without issues on the KVM host. Here is the output from efibootmgr:

sudo efibootmgr -v
BootCurrent: 0000
Timeout: 0 seconds
BootOrder: 0000,0005
Boot0000* ubuntu        HD(1,GPT,ce89234c-404a-446d-84cc-b4c0639a3cba,0x800,0xf3ffe)/File(\EFI\ubuntu\grubx64.efi)
Boot0005* EFI Internal Shell    MemoryMapped(11,0x900000,0x11fffff)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)

Here is the output from the EFI shell on the LXD host:

Shell> map -r
Mapping table
      FS0: Alias(s):HD0a1b:;BLK1:
          PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)/HD(1,GPT,CE89234C-404A-446D-84CC-B4C0639A3CBA,0x800,0xF3FFE)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
     BLK2: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)/HD(5,GPT,C00617FC-BAD8-4A84-AC69-AFAC1BBB8DB3,0xF4800,0xC70B000)

FS0:\> ls
Directory of: FS0:\
04/07/2023  11:36 <DIR>         8,192  EFI
          0 File(s)           0 bytes
          1 Dir(s)
FS0:\> cd EFI
FS0:\EFI\> ls
Directory of: FS0:\EFI\
04/07/2023  11:36 <DIR>         8,192  .
04/07/2023  11:36 <DIR>             0  ..
04/08/2023  11:09 <DIR>         8,192  ubuntu
          0 File(s)           0 bytes
          3 Dir(s)
FS0:\EFI\> cd ubuntu
FS0:\EFI\ubuntu\> ls
Directory of: FS0:\EFI\ubuntu\
04/07/2023  11:36 <DIR>         8,192  .
04/07/2023  11:36 <DIR>         8,192  ..
04/08/2023  11:33                 201  grub.cfg
04/08/2023  11:33           1,709,952  grubx64.efi
          2 File(s)   1,710,153 bytes
          2 Dir(s)
FS0:\EFI\ubuntu\> 
 

When I try to just boot grubx64.efi, the VM gets stuck on a black screen and nothing happens.
Should I somehow try to “re-build” the bootx64.efi file on the original VM and transfer it to LXD again?

It may be booting, just not visible on screen for some reason.

Note that the Ubuntu 16.04 kernel doesn’t work in LXD as it’s missing some virtio drivers, so you’d need to install the HWE kernel (basically the same as 18.04) prior to converting to LXD.
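If it helps, on 16.04 that’s normally just a package install on the source VM (package name from Ubuntu’s HWE documentation):

sudo apt install --install-recommends linux-generic-hwe-16.04
sudo reboot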

efibootmgr shows you the content of the NVRAM (normally a small storage chip on your motherboard), so its content won’t be carried over during a migration as it’s not stored on disk.
That’s why the boot entry disappeared. On Ubuntu 18.04 and later (I believe), Ubuntu makes a copy of grubx64.efi as bootx64.efi to deal with such situations; that way, bootx64.efi is run by the firmware and Ubuntu then automatically re-adds the Ubuntu boot entry.
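As a quick (unsigned) workaround, assuming the ESP is mounted at /boot/efi on the source VM, you could also create that fallback path by hand before migrating again:

sudo mkdir -p /boot/efi/EFI/BOOT
sudo cp /boot/efi/EFI/ubuntu/grubx64.efi /boot/efi/EFI/BOOT/bootx64.efi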

Also worth noting that grubx64.efi isn’t a signed binary, so if your VM doesn’t have security.secureboot set to false, that binary will not be able to execute. Ubuntu 16.04 should have secure boot support though; maybe you’re just missing the shim-signed and grub-efi-amd64-signed packages on the source system?
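If you’d rather go the signed route, a rough sketch on the source VM would be (the usual Ubuntu package names, worth double-checking for your release):

sudo apt install shim-signed grub-efi-amd64-signed
sudo grub-install --target=x86_64-efi --efi-directory=/boot/efi
sudo update-grub

and to check or change the secure boot setting on the LXD side:

lxc config get ag6 security.secureboot
lxc config set ag6 security.secureboot false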


I have security.secureboot set to false because I know that secure boot isn’t enabled on the source VM either. The missing virtio drivers could well be my issue; I will try to install the HWE kernel and will let you know if the VM boots successfully.

OK, I have installed the HWE kernel on the source VM and re-migrated it to LXD; unfortunately the result is exactly the same. I’ve also checked that the official Ubuntu VM images are not using the HWE kernel, so I guess that isn’t the problem in my case. Unfortunately I’m out of ideas… I would appreciate any further help or suggestions on this topic.

Have you tried mashing ESC right after starting grubx64.efi?
It’s quite possible that the grub config defaults to being quiet, which makes it impossible to see the grub menu. Either try to get to it by hitting ESC, or change grub.cfg to show you a menu (easiest is to alter /etc/default/grub on the source, then run update-grub before migrating again).
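For example, something like this in /etc/default/grub on the source (values are illustrative), followed by update-grub:

GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX_DEFAULT=""   # drop "quiet splash" so boot messages show up
GRUB_TERMINAL=console           # force a text-mode menu
# comment out GRUB_HIDDEN_TIMEOUT if it's set, so the menu isn't skipped

sudo update-grub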

Once you can get into grub, you’ll be able to change the kernel boot parameters to enable verbose output, which should let you figure out what’s going on.


You are absolutely right! The problem is that the LXD serial console, the one that I’m using, doesn’t display anything for that particular VM after manually executing grubx64.efi. Before re-configuring GRUB to run in text mode and show the menu, as you suggested, I tried starting the VM with the VGA console. The result was that, after executing grubx64.efi, the VM boots normally; it even gets the network up and running. And again you were absolutely right about the kernel (the generic one) and its modules too… Even though the VM booted and was accessible over the network (SSH), no IP address was shown in the lxc list output. I found that the lxd-agent service was unable to start because of a problem with the vsock kernel module. I can confirm that after installing the HWE kernel and booting with it, the lxd-agent service starts successfully and the IP address of the VM is visible in the lxc list output.
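For anyone hitting the same thing, the relevant bits were roughly (the VM name is from my setup):

lxc console ag6 --type=vga      # VGA console instead of the serial one
lxc list ag6                    # the IP only appears once lxd-agent is running

and inside the guest:

systemctl status lxd-agent
lsmod | grep vsock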
Thanks @stgraber for pointing me in the right direction!