How to get KVM working in an LXD container

Hi

I needed to test whether KVM will run in an LXD container so I can run some old non-UEFI images needed for production, but I'm not having much luck, so I could do with some help please. I'm trying this on my Ubuntu 20.04 AMD64 notebook, which is currently on LXD v4.22 and happily running GUI containers for old apps and a Windows VM. I can also run VirtualBox VMs fine on the host OS, but I've never installed KVM on this host.

Launched an Ubuntu 20.04 AMD64 container last night (01 Feb '22), with this config:

image.architecture: amd64
image.description: ubuntu 20.04 LTS amd64 (release) (20220131.1)
image.label: release
image.os: ubuntu
image.release: focal
image.serial: "20220131.1"
image.type: squashfs
image.version: "20.04"
security.nesting: "true"
volatile.base_image: 57263910d51e637a64d2d94f6a94832acbd886b2eda532ab0b522b4f9b85bd86
volatile.eth0.host_name: vetha2fc9a63
volatile.eth0.hwaddr: 00:16:3e:a2:07:14
volatile.idmap.base: "0"
volatile.idmap.current: '
volatile.idmap.next: '
volatile.last_state.idmap: '
volatile.last_state.power: RUNNING
volatile.uuid: 751e1470-646e-4b2b-b223-6aa4868d5e49
devices:
  kvm:
    gid: "108" # gid in container
    path: /dev/kvm
    type: unix-char

Updated and restarted the container, then installed KVM:

apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils

Sadly it can't find the kvm_intel module to start qemu-kvm, and nothing exists below /lib/modules, which, if memory serves me correctly, could be normal:

root@kvm:~# systemctl status qemu-kvm
● qemu-kvm.service - QEMU KVM preparation - module, ksm, hugepages
     Loaded: loaded (/lib/systemd/system/qemu-kvm.service; enabled; vendor preset: enabled)
     Active: active (exited) since Wed 2022-02-02 22:22:49 UTC; 1min 2s ago
    Process: 127 ExecStart=/usr/share/qemu/init/qemu-kvm-init start (code=exited, status=0/SUCCESS)
   Main PID: 127 (code=exited, status=0/SUCCESS)

Feb 02 22:22:49 kvm systemd[1]: Starting QEMU KVM preparation - module, ksm, hugepages…
Feb 02 22:22:49 kvm qemu-kvm-init[145]: modprobe: FATAL: Module kvm_intel not found in directory /lib/modules/5.13.0-28-generic
Feb 02 22:22:49 kvm qemu-kvm-init[152]: mknod: /dev/kvm: File exists
Feb 02 22:22:49 kvm systemd[1]: Finished QEMU KVM preparation - module, ksm, hugepages.

root@kvm:~#
root@kvm:~#
root@kvm:~# modprobe kvm_intel
modprobe: FATAL: Module kvm_intel not found in directory /lib/modules/5.13.0-28-generic

This is an Intel Core i7 machine, and lsmod on this host gives me this:

$ lsmod | grep -i kvm
kvm_intel 303104 0
kvm 864256 1 kvm_intel

Thanks

So I’m not sure that there’s an actual problem here (other than noise).

That systemd unit shows that it completed successfully, and the kernel module is already loaded through the host (you could set linux.kernel_modules to kvm_intel to have LXD ensure it's loaded before the container starts).
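For example, assuming the container is named kvm as in the output above, that would look something like this (a minimal sketch, not taken from the thread; the last command just confirms /dev/kvm is visible inside the container):

# on the host: have LXD load the module before the container starts
lxc config set kvm linux.kernel_modules kvm_intel
lxc restart kvm

# the module list lives on the host kernel, so inside the container
# just check that the /dev/kvm device node is present
lxc exec kvm -- ls -l /dev/kvm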

Does qemu or libvirt actually fail and if so, why?

Good morning

Thanks for your time… yes, you're correct… inexperience and other things going on made it appear that this was the issue.

So here is what seems to be a simple guide to getting KVM VMs running in an LXD Ubuntu 20.04 AMD64 unprivileged container (collected into a single sketch after the list):

  • create a standard ubuntu 20.04 AMD64 container (lxc launch ubuntu:20.04/amd64 {ContainerName})
  • allow virtualisation nesting ( lxc config set {ContainerName} security.nesting true)
  • open shell in the container (lxc exec {ContainerName} -- bash)
  • in the container shell, install qemu-kvm (apt update && apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils)
  • in the container shell, edit /etc/libvirt/qemu.conf and set remember_owner = 0 to get around the issue in this article for unprivileged containers
  • in the container shell, get the gid of the kvm group (getent group kvm) # not sure this is still needed
  • on the host shell, add the kvm device to the container's config with the gid of the kvm group discovered above (lxc config device add {ContainerName} kvm unix-char path=/dev/kvm gid=???)
  • restart the container (lxc restart {ContainerName})
  • start creating VMs in KVM.
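Collected together, those steps look roughly like this (a sketch only; c1 is a placeholder container name and the gid of 108 is just an example, use whatever getent reports in your own container):

# on the host
lxc launch ubuntu:20.04/amd64 c1
lxc config set c1 security.nesting true
lxc exec c1 -- bash

# inside the container
apt update && apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils
# edit /etc/libvirt/qemu.conf and set: remember_owner = 0
getent group kvm        # note the gid, e.g. kvm:x:108:
exit

# back on the host: pass /dev/kvm through with that gid, then restart
lxc config device add c1 kvm unix-char path=/dev/kvm gid=108
lxc restart c1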

The above steps seem to work with a 20.04 container, but not with a 22.04 container. When installing a VM with virt-install inside the container, or with Virt-Manager over SSH, I get this, where "is01a-omel02" is the Dell OpenManage Enterprise appliance and "qemu-9" is my 9th iteration of the Homer DOH process:

Unable to complete install: 'Unable to write to '/sys/fs/cgroup/devices/machine/qemu-9-is01a-omel02.libvirt-qemu/devices.deny': No such file or directory'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 72, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2008, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 695, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 637, in _create_guest
    domain = self.conn.createXML(initial_xml or final_xml, 0)
  File "/usr/lib/python3/dist-packages/libvirt.py", line 4400, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/devices/machine/qemu-9-is01a-omel02.libvirt-qemu/devices.deny': No such file or directory

There are no dirs or files at the path it's expecting:

~# ls -lh /sys/fs/cgroup/devices/machine/
total 0
-rw-r--r-- 1 root root    0 Jun 23 20:34 cgroup.clone_children
-rw-r--r-- 1 root root    0 Jun 23 20:34 cgroup.procs
--w------- 1 root root    0 Jun 23 20:34 devices.allow
--w------- 1 root root    0 Jun 23 20:34 devices.deny
-r--r--r-- 1 root root    0 Jun 23 20:34 devices.list
-rw-r--r-- 1 root root    0 Jun 23 20:34 notify_on_release
-rw-r--r-- 1 root root    0 Jun 23 20:34 tasks

Continuing the DOH process, I can try to get ahead of it by creating the next iteration's path and chmod 777 the .../qemu-10... dir, but the error remains the same for that iteration. It's also unable to write to that path even if I chown root:libvirt-qemu .../qemu-10... and sudo -u libvirt-qemu touch ...qemu-10.../test.file.

We added support and steps for running LXD VMs inside a container here, so it may help with what you're trying to do too:

Good morning and thanks for the quick response.

Not sure it does help… a little background will be more useful… I need to run some appliances that do not support UEFI, so a good suggestion was to run them in KVM in an LXD container, which is what I'm trying to do. One is Dell OpenManage Enterprise, which is internal only, but there is also a Citrix NetScaler VPX / vADC, which is FreeBSD based and will be out on the internet, which may not be a good idea in this setup.

Thanks.

Those steps show how to pass through the KVM and virtio devices for running VMs inside containers.

Thanks Thomas, I will give it a try in a mo and will update with the results.

To be a bit clearer with my comment… I was worried about your security warning, so with the NetScaler being publicly accessible, I don't think it's the right solution for my scenario. It's now looking like I need to get KVM working at the LXD host level, and hopefully they can co-exist because I don't have spare hosts to dedicate to KVM.

My understanding is that this just exposes non-namespaced entities (like KVM) to the container, so it's no more insecure than running it on the host directly. Certainly not for the VM guest itself.

OK, here is a new error with an Ubuntu 22.04 container when trying to start the VM:

Unable to complete install: 'Unable to set XATTR trusted.libvirt.security.dac on /var/lib/libvirt/qemu/domain-1-ius01a-omel02/master-key.aes: Operation not permitted'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 72, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2008, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 695, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 637, in _create_guest
    domain = self.conn.createXML(initial_xml or final_xml, 0)
  File "/usr/lib/python3/dist-packages/libvirt.py", line 4400, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: Unable to set XATTR trusted.libvirt.security.dac on /var/lib/libvirt/qemu/domain-1-ius01a-omel02/master-key.aes: Operation not permitted

The container config:

~$ lxc config show ius01a-kqvl01 --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20220624_07:42)
  image.os: Ubuntu
  image.release: jammy
  image.serial: "20220624_07:42"
  image.type: squashfs
  image.variant: default
  security.nesting: "true"
  volatile.base_image: 1fe0cc770ad15b62ef2a74de59d1eb9c2ad0bc1fc2866e03e52a4d96639aa465
  volatile.cloud-init.instance-id: 9ec42e60-b028-4862-b5dc-9ecb30b4ddf2
  volatile.eth0.host_name: mac83ed5cd0
  volatile.eth0.hwaddr: 00:16:3e:d6:81:11
  volatile.eth0.last_state.created: "false"
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 1094df5f-c423-4281-b57d-a5d072172cd6
devices:
  eth0:
    name: eth0
    nictype: macvlan
    parent: eno1
    type: nic
  kvm:
    source: /dev/kvm
    type: unix-char
  root:
    path: /
    pool: sp01
    type: disk
  vhost-net:
    source: /dev/vhost-net
    type: unix-char
  vhost-vsock:
    source: /dev/vhost-vsock
    type: unix-char
ephemeral: false
profiles:
- default
stateful: false
description: ""

Installed the packages:

apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils

This time I did not modify /etc/libvirt/qemu.conf to set remember_owner = 0, as that was not in your instructions. I then tried this, which worked… setting user, group and remember_owner:
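Presumably that change is along these lines in /etc/libvirt/qemu.conf inside the container (the exact values here are my assumption based on the "user, group and remember" description, not a quote from the post):

# /etc/libvirt/qemu.conf (assumed values)
user = "root"
group = "root"
remember_owner = 0

followed by a restart of libvirtd for it to take effect.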

Don’t know from a security perspective whether the above is safe to do?

The next issue I’m facing is how to get a VM in KVM to be on the LAN and get a DHCP address from the DHCP server, and be reachable on the LAN.

Default profile:

devices:
  eth0:
    name: eth0
    nictype: macvlan
    parent: eno1
    type: nic

So a container gets the default:

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
      dhcp-identifier: mac

If I follow either the basic or complex examples of creating a bridge from here:

netplan/examples at main · canonical/netplan · GitHub

and read the source of the above page to see the XML for the KVM network config (bridging to br0), I can see the VM trying to get an IP and the DHCP server responding, but the VM doesn't hear it:

dhcpd[4759]: DHCPDISCOVER from 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPOFFER on 10.0.0.172 to 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPDISCOVER from 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPOFFER on 10.0.0.172 to 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPDISCOVER from 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPOFFER on 10.0.0.172 to 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPDISCOVER from 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPOFFER on 10.0.0.172 to 52:54:00:92:a8:a4 (ius01a-omel01) via eth0
dhcpd[4759]: DHCPDISCOVER from 52:54:00:92:a8:a4 (ius01a-omel01) via eth0

I've also tried adding a second macvlan interface (eth1) into the container for the bridging, but the bridge interface has the same DHCP problem, as does the VM running on it. Here is the basic example, but attached to the second interface, eth1:

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
      dhcp-identifier: mac
    eth1:
      dhcp4: false
      dhcp-identifier: mac
  bridges:
    ius01abr0:
      dhcp4: true
      interfaces:
        - eth1

So really it's just looking like the bridge cannot hear the responses for some reason.

This is the closest thing I’ve found and sounds familiar, but it needs some translation:

Bridged network inside a container for VMs

Thanks