Running virtual machines with LXD 4.0

Introduction

LXD 4.0 natively supports virtual machines, and thanks to a built-in agent, they can behave almost like containers.

Images

Community images (images:)

We are producing VM images daily for the following distributions:

  • Arch Linux
  • CentOS (7 and up)
  • Debian (8 and up)
  • Fedora
  • Gentoo
  • OpenSUSE
  • Ubuntu

For distributions whose container images have cloud variants, matching cloud variants are available for VMs too.

Those images are currently the preferred choice for all of those distributions: they are automatically tested daily and include support for the LXD agent out of the box.
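To browse what is available, the image list can be filtered by properties. A quick sketch (the exact filter keys assume a recent LXD 4.0 client):

```shell
# List community VM images (filters out container images)
lxc image list images: type=virtual-machine

# Narrow it down further, e.g. to Debian images
lxc image list images: type=virtual-machine os=Debian
```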

Official Ubuntu images (ubuntu:)

You can also use the official Ubuntu images, with availability matching that of containers.

Note that those images currently do not include a fully functional LXD agent, so some additional work is needed before they behave like the community images.

Creating a VM

Creating a VM is as simple as:

lxc launch images:ubuntu/focal ubuntu --vm

or

lxc launch images:centos/8 centos --vm

Extra steps for official Ubuntu images

For official Ubuntu images, cloud-init must be used along with a config drive to seed a default user into the VM and allow console access. This is done with:

lxc init ubuntu:18.04 ubuntu --vm
(
cat << EOF
#cloud-config
apt_mirror: http://us.archive.ubuntu.com/ubuntu/
ssh_pwauth: yes
users:
  - name: ubuntu
    passwd: "\$6\$s.wXDkoGmU5md\$d.vxMQSvtcs1I7wUG4SLgUhmarY7BR.5lusJq1D9U9EnHK2LJx18x90ipsg0g3Jcomfp0EoGAZYfgvT22qGFl/"
    lock_passwd: false
    groups: lxd
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
EOF
) | lxc config set ubuntu user.user-data -
lxc config device add ubuntu config disk source=cloud-init:config
lxc start ubuntu

Accessing the VM

You can see your VM boot with:

lxc console NAME

(detach with ctrl+a then q)

Once booted, VMs with the agent built-in will also respond to:

lxc exec NAME bash
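Once the agent is responding, the other agent-backed commands work too. A couple of examples (NAME stands for your VM's name):

```shell
# Run a one-off command inside the VM
lxc exec NAME -- uname -a

# Pull a file out of the VM through the agent
lxc file pull NAME/etc/os-release -
```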

Extra steps for official Ubuntu images

For the official Ubuntu images, the agent needs to be manually enabled by logging in through lxc console. The credentials in the example above are ubuntu/ubuntu.
Once logged in, run:

mount -t 9p config /mnt
cd /mnt
./install.sh
reboot

(Note that this step will slowly become unnecessary. The 20.04 images should already have this done, so lxc exec will just work after boot.)

What to do next

Virtual machines respect most of the usual instance settings as listed in the documentation.

You can attach nic and disk devices to them, grow their CPU & RAM, take snapshots, move them between hosts, use them in a cluster, publish them as images, …
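As a sketch of those operations (the instance name, device names, and sizes are made up; some of them, like sharing a host directory into a VM, may require a recent 4.0.x point release):

```shell
# Grow CPU and RAM (some changes may need a restart to apply)
lxc config set ubuntu limits.cpu 4
lxc config set ubuntu limits.memory 8GB

# Attach an extra NIC and a host directory as a disk
lxc config device add ubuntu eth1 nic nictype=bridged parent=lxdbr0
lxc config device add ubuntu data disk source=/srv/data path=/mnt/data

# Snapshot, then publish the snapshot as an image
lxc snapshot ubuntu before-upgrade
lxc publish ubuntu/before-upgrade --alias my-ubuntu-vm
```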

They should pretty much all behave as you would expect of an LXD instance, just a bit slower because you’re dealing with a full virtual machine.

Running Windows

It is possible to build Windows images for LXD, but this process is currently very manual and involves either a set of custom raw.qemu flags to get a temporary graphical console for the install, or using a separate QEMU process to prepare the image.

The basic steps are:

  • Grab a Windows ISO image from Microsoft
  • Grab the latest virtio drivers for Windows
  • Create an empty VM with beefier CPU/RAM and SecureBoot disabled:
    lxc init win10 --empty --vm -c security.secureboot=false -c limits.cpu=4 -c limits.memory=4GB
  • Grow its root disk to a reasonable size:
    lxc config device override win10 root size=20GB
  • Enable temporary graphical output and pass the install and drivers media:
    echo -n '-device virtio-vga -vnc :1 -drive file=/path/to/Win10_1909_English_x64.iso,index=0,media=cdrom,if=ide -drive file=/path/to/virtio-win-0.1.173.iso,index=1,media=cdrom,if=ide' | lxc config set win10 raw.qemu -
  • Start the VM: lxc start win10
  • Immediately attach using lxc console win10 and hit ESC until presented with a boot menu
  • Select Boot Manager and then the QM00001 drive. Then hit ENTER a few times to answer an invisible boot prompt.
  • Use a VNC client to connect to localhost:1, you’ll see the installer boot
  • During the installation, load the virtio-scsi driver to access the hard disk
  • Once installed and configured to allow RDP/SSH access, you can remove the various workarounds with:
    lxc config unset win10 raw.qemu
  • Boot the system to confirm it all works, install all the other drivers from the virtio drive, then run the Windows sysprep tool
  • Finally publish your VM as an image with:
    lxc publish win10 --alias win10
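Once published, new Windows VMs can be created from that image. A sketch, reusing the alias and settings from the example above (the instance name is made up):

```shell
# New instances need the same relaxed firmware settings as the template
lxc init win10 win10-test --vm \
    -c security.secureboot=false \
    -c limits.cpu=4 -c limits.memory=4GB
lxc config device override win10-test root size=20GB
lxc start win10-test
```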

Caveats

  • Community images are only available for x86_64 and aarch64
  • Some distributions do not support Secure Boot; this shows up as a boot failure with something along the lines of Access Denied. For those, Secure Boot must be disabled with lxc config set NAME security.secureboot false, then the VM started again.

I am trying to get Windows 10 running in a VM.
I downloaded the two ISOs (Windows 10 and virtio), and adapted the raw.qemu with the absolute paths accordingly.
When I lxc start win10 and at the same time in another console run lxc console win10, I get the following:


BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

>>Start PXE over IPv4.
  PXE-E21: Remote boot cancelled.
...

It then tries to boot PXE over HTTP and also fails. Finally, I get into the UEFI shell.

UEFI Interactive Shell v2.2
EDK II
UEFI v2.70 (EDK II, 0x00010000)
Mapping table
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x1)
Press ESC in 5 seconds to skip startup.nsh or any other key to continue.
Shell> 

I did not get a Boot Manager.


If you see any text, you’re too late, you need to really hammer ESC immediately after you start lxc console to catch it before it attempts to boot.


If you get into the EFI shell, you may be able to manually start the binary from there with something like:

  • fs0:
  • cd efi\boot
  • bootx64.efi

I managed to get the Boot Manager, by hammering ESC very quickly.

  1. Once you eventually get to the UEFI shell, type reset and press Enter. This will reboot the VM, giving you the chance to try again to get to the Boot Manager.
  2. The VM reboots, and lxc console will exit at once.
  3. You then need to quickly run lxc console win10 again (press Up Arrow, then Enter) and hit ESC immediately after. Don’t be too fast, or you will get an LXD error that the VM is not running; don’t be too slow, or you will miss the Boot Manager.
    With a couple of tries you will be able to gauge the timing.
  4. If it fails, try again by going back to Step 1.

Using

$ lxc start win10 ; lxc console win10

and immediately hammering ESC after pressing Enter makes the timing easy to deal with.


Hi @stgraber, in the next LXD release, will it be possible to use a graphical console without workarounds? (e.g. KVM style)

Yes that is the plan.


All is well for me (init, download, etc) until I start the VM (lxc start). I then get this error message:-

Error: Failed to run: modprobe vhost_vsock: modprobe ERROR: could not insert ‘vhost_vsock’

:disappointed:

Does it say why vhost_vsock won’t load?

It’s sometimes because of some vmware/virtualbox tools being already loaded.

It says ‘device or resource busy’.

It is a VM running Ubuntu with LXD 4.0.1, and yes, it has open-vm-tools running.

Thanks.

If you run lsmod | grep vsock it might show which existing vsock modules are loaded, you then need to rmmod them so that LXD can load the vhost_vsock module.
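Roughly, that looks like this (the module names here are just an example; the VMware vsock transport is a common culprit when open-vm-tools is running, and what you see may differ):

```shell
# See which vsock modules are currently loaded
lsmod | grep vsock

# Remove the conflicting transport (names vary by hypervisor tools)
sudo rmmod vmw_vsock_vmci_transport
sudo rmmod vsock

# Now LXD (or a manual modprobe) can load its own module
sudo modprobe vhost_vsock
```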

vsock doesn’t work with nested virtual machines at this time, which, unless the parent hypervisor doesn’t use vsock itself, effectively prevents running LXD virtual machines inside of an existing virtual machine.

It’s worth noting that nested virtualization, at least on Intel platforms, is also not always reliable. If you want to test LXD virtual machines, a bare metal host is strongly advised.

Just now I tried the same thing on a Digital Ocean Droplet (a fancy name for a VM) in the cloud. And it works!

I do not know which hypervisor Digital Ocean uses, but it clearly works nested with that one. Normally neofetch shows me which hypervisor is being used. For example, with AWS neofetch says “Host: HVM domU”, which I believe is Xen. And with Linode it says “Host: KVM/QEMU”. But with Digital Ocean it says “Host: Droplet” - I guess they are hiding it?

Interestingly, the VM running nested says “Host: KVM/QEMU” and not “Host: LXC/LXD”!

It is a shame that it will not work under VMware. I hope one day this is fixed.

By the way you mentioned that nested virtualization is not always reliable on Intel. So I assume you are saying that AMD is a better choice? Is there a specific technical reason for this?

Thanks.

I suspect they’re using QEMU but do not use vsock at all, leaving it free for the VM to use.

For nested virt, I’m not a CPU expert, but what I’ve been told is that on the Intel side, VMs are effectively tracked on a flat plane. So nested virt actually means a VM being able to create a VM parallel to itself. AMD instead has a tree-type structure where a child VM is tracked by the CPU as a child of its parent. I don’t expect this to impact performance, but it likely impacts stability and security.

I just tried on a Linode VM and it failed with this error message:-

Error: Failed to run: /snap/lxd/current/bin/lxd forklimits limit=memlock:unlimited:unlimited – /snap/lxd/14890/bin/qemu-system-x86_64 -S -name ubuvm -uuid e00f23fb-146c-4ddf-8662-be1690eb5cbb -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-reboot -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/ubuvm/qemu.conf -pidfile /var/snap/lxd/common/lxd/logs/ubuvm/qemu.pid -D /var/snap/lxd/common/lxd/logs/ubuvm/qemu.log -chroot /var/snap/lxd/common/lxd/virtual-machines/ubuvm -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd: : exit status 1
Try ‘lxc info --show-log ubuvm’ for more info
root@localhost:~# lxc info --show-log ubuvm
Name: ubuvm
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/04/28 22:01 UTC
Status: Stopped
Type: virtual-machine
Profiles: default

Log:

Could not access KVM kernel module: No such file or directory
qemu-system-x86_64: failed to initialize KVM: No such file or directory

The VM is running a fresh install of Ubuntu 20.04 LTS. It has 2 CPU & 4GB RAM.

That indicates you’re missing the kvm kernel modules.
On intel, that’s kvm and kvm_intel.
On AMD, that’s kvm and kvm_amd.

Depending on the kernel you’re running, those may not be available, which would explain this behavior.
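A quick way to check is something like the following (assuming an Intel CPU; substitute kvm_amd on AMD):

```shell
# Is the module available for this kernel at all?
modinfo kvm_intel

# Try loading it and confirm /dev/kvm appears
sudo modprobe kvm_intel
ls -l /dev/kvm

# Confirm the CPU even exposes VMX (Intel) or SVM (AMD)
grep -cE 'vmx|svm' /proc/cpuinfo
```

If /dev/kvm never appears, the kernel (or the underlying hypervisor, in a nested setup) simply doesn’t provide KVM and LXD VMs won’t be able to start.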

Because it is a snap package, does it not include all the dependencies like KVM? :thinking:

By the way, I am a complete beginner and new to Linux! So forgive me if these are dumb questions :blush:

Snaps include all the userspace bits they need, they however do not get to run their own kernels, so you still need your system’s kernel to support all the features we need.