Regarding 'raw.qemu.conf' and the move to QMP

I took note of the raw.qemu.conf override being added last year as per:
https://github.com/lxc/lxd/pull/10512

I was excited to try using this to get macOS running in LXD successfully without using any big hammer approaches that aren’t acceptable for anything but a test environment.

I guess I waited too long, because as of LXD 5.10, and likely earlier versions, I can no longer use raw.qemu.conf to override the disks: the relevant sections (e.g. [drive "lxd_root"] and [device "dev-lxd_root"]) are no longer generated into the qemu.conf at:

/var/snap/lxd/common/lxd/logs/<instance_name>/qemu.conf

A quick look at the code seems to suggest that this is because the addition of block devices has been switched to use QMP instead. Is that right?

I do recall that being mentioned on my issue request:
https://github.com/lxc/lxd/issues/9766

I understand and agree that the move to QMP is, in general, a good one. I do wonder, though, whether overrides could still be supported in some form, since the raw.qemu.conf option loses power as more devices move to QMP-based configuration.

I asked @stgraber about this and we still would like to support disk overrides for this config setting.
But we need to figure out some syntax that makes sense so we can somewhat easily merge that with the QMP calls we make.

However, at least for now, this would need to be something that the community works on, not something we take on ourselves.

Ok, thanks for the update.

I have worked through how to successfully build LXD on my end, so I am close to a point where I can attempt a change myself.

I am currently thinking something along the lines of:

devices:
  root:
    path: /
    pool: zfs
    type: disk
    qemu:
      driver: "sata"

With a change to addDriveConfig in lxd/instance/drivers/driver_qemu.go adding a bit of extra logic around the following block at line 3312:

   if media == "disk" {
      device["driver"] = "scsi-hd"
   } else if media == "cdrom" {
      device["driver"] = "scsi-cd"
   }

Any feedback on what kind of syntax you guys were thinking about would be much appreciated. I haven’t programmed in Go before, so it is a slow process for me to understand the structure of the code. For example, I haven’t figured out yet whether or not the device config shown above is available to the addDriveConfig function.
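
A minimal standalone sketch of that extra logic might look like the following. It is not LXD code: the function name pickDriver and its override parameter are hypothetical, and it assumes the override is a literal QEMU driver name that has somehow been made available to the caller, which is exactly the open question above. With no override it keeps the defaults from the block quoted earlier.

```go
package main

import "fmt"

// pickDriver returns the QEMU device driver to use for a drive.
// media is "disk" or "cdrom", mirroring the existing if/else block.
// override is a HYPOTHETICAL user-supplied driver name; an empty
// string means "keep the current default".
func pickDriver(media string, override string) string {
	if override != "" {
		return override
	}

	switch media {
	case "disk":
		return "scsi-hd"
	case "cdrom":
		return "scsi-cd"
	}

	// Unknown media type: leave the driver unset, as the original block does.
	return ""
}

func main() {
	fmt.Println(pickDriver("disk", ""))       // default for disks
	fmt.Println(pickDriver("disk", "ide-hd")) // overridden
	fmt.Println(pickDriver("cdrom", ""))      // default for CD-ROMs
}
```

Keeping the default path untouched when no override is set means existing instances would behave exactly as before.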

I think we’d like to keep things under raw.qemu.XYZ. This is because it means we don’t have to add logic to the per-device code and we can also trivially flag any instance which uses raw.qemu.XYZ or raw.lxc as doing something that we can’t support.

Now for QEMU, we have raw.qemu which is straight up command line arguments to the daemon, raw.qemu.conf which is INI-style config that’s merged with the qemu.conf config. I think we’d probably do something like raw.qemu.qmp with some kind of override syntax for the QMP calls we’re making.

It may be something like:

raw.qemu.qmp: |
  disk.root.driver: scsi-hd

Or something along those lines. Though a lot more thought must be put into it to support:

  • Simple overrides of fields on a QMP object (that’s the easy one), supporting adding/changing/removing properties
  • Adding complete additional objects
  • Disabling/removing objects

The syntax for raw.qemu.conf already supports most of those concepts, just not in a way that aligns with QMP’s syntax. So we’d want to replicate the behavior.
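
As a rough illustration of the first bullet only (simple field overrides), here is a minimal Go sketch. Everything about it is an assumption made for discussion: the "kind.name.property: value" line format is taken from the example above, treating an empty value as "remove this property" is invented here, and parseOverrides and applyOverrides are hypothetical names, not existing LXD functions. Adding or removing whole objects is not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// override is one parsed line of the HYPOTHETICAL raw.qemu.qmp syntax:
// "<kind>.<name>.<property>: <value>".
type override struct {
	Kind     string
	Name     string
	Property string
	Value    string
}

// parseOverrides splits the raw multi-line value into overrides.
// Blank lines are skipped; malformed lines produce an error.
func parseOverrides(raw string) ([]override, error) {
	var out []override

	for _, line := range strings.Split(raw, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}

		key, value, found := strings.Cut(line, ":")
		if !found {
			return nil, fmt.Errorf("missing ':' in %q", line)
		}

		parts := strings.SplitN(strings.TrimSpace(key), ".", 3)
		if len(parts) != 3 {
			return nil, fmt.Errorf("expected kind.name.property in %q", line)
		}

		out = append(out, override{
			Kind:     parts[0],
			Name:     parts[1],
			Property: parts[2],
			Value:    strings.TrimSpace(value),
		})
	}

	return out, nil
}

// applyOverrides edits the device map in place for the given kind/name.
// A non-empty value adds or changes a property; an empty value removes it
// (an assumption of this sketch, not an agreed syntax).
func applyOverrides(kind string, name string, device map[string]string, ovs []override) {
	for _, o := range ovs {
		if o.Kind != kind || o.Name != name {
			continue
		}

		if o.Value == "" {
			delete(device, o.Property)
			continue
		}

		device[o.Property] = o.Value
	}
}

func main() {
	ovs, err := parseOverrides("disk.root.driver: ide-hd\ndisk.root.lun:\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	device := map[string]string{"driver": "scsi-hd", "lun": "1"}
	applyOverrides("disk", "root", device, ovs)
	fmt.Println(device)
}
```

Applying the overrides to the property map just before it is sent over QMP would keep the per-device code unaware of them, which seems to match the goal of keeping everything under raw.qemu.XYZ.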

We expect that over the next few years, most of our configuration will move to QMP, making raw.qemu.conf effectively obsolete as upstream is trying to push people away from using their config file option.

That makes sense, thanks for the clarification. I understand what you are saying from a design perspective.

I suppose a new lxd/instance/drivers/driver_qemu_qmp_override.go might make sense? Or would a more generic approach alongside drivers/qmp/commands.go be better?

I will start with trying to figure out an approach to overrides for single fields on block devices. Maybe I can get that working and see how well it lines up with the QMP syntax.

Let me explore this idea more and see what I can come up with. Getting support for all three bullet points above might be a bit much for my first Go project, but I will see how things feel after getting a simpler block device override implemented.

@stgraber I am still trying to get a build environment fully working. Due to not wanting to tinker with the snap LXD running on my main development box I am trying to build and test a custom LXD inside of a build-lxd container instead. Here is the build-lxd container config:

anderson@anderson-ryzen9:~$ lxc config show build-lxd
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 22.04 LTS amd64 (release) (20230107)
  image.label: release
  image.os: ubuntu
  image.release: jammy
  image.serial: "20230107"
  image.type: squashfs
  image.version: "22.04"
  limits.cpu: "8"
  limits.memory: 8GiB
  security.nesting: "true"
  volatile.base_image: ed7509d7e83f29104ff6caa207140619a8b235f66b5997f1ed6c5e462617fb71
  volatile.cloud-init.instance-id: ee99847e-5a0f-4305-95cb-3595b1ece0ab
  volatile.eth0.host_name: vethd5e1b351
  volatile.eth0.hwaddr: 00:16:3e:12:ac:d5
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: 676e2613-997a-432e-a93c-8730ebca92c0
devices:
  kvm:
    gid: "109"
    path: /dev/kvm
    type: unix-char
  root:
    path: /
    pool: zfs
    size: 120GiB
    type: disk
  vhost-vsock:
    path: /dev/vhost-vsock
    type: unix-char
  vsock:
    path: /dev/vsock
    type: unix-char
ephemeral: false
profiles:
- default
stateful: false
description: ""

I have an LXD daemon that I built running inside build-lxd, and it works well enough that I have successfully imported a VM using lxc import. However, when I try to start the VM, it gives the following error:

Error: Failed setting up device via monitor: Failed sending file descriptor of "/proc/self/fd/25" for disk device "open-core-boot": No file descriptor supplied via SCM_RIGHTS

I also see the following in dmesg on the host system:

[10090471.837895] audit: type=1400 audit(1674515323.158:5455): apparmor="DENIED" operation="file_receive" namespace="root//lxd-build-lxd_<var-snap-lxd-common-lxd>" profile="lxd-macOS-catalina_</var/snap/lxd/common/lxd>" name="/var/snap/lxd/common/lxd/storage-pools/zfs/custom/default_open-core-boot/root.img" pid=9139 comm="qemu-system-x86" requested_mask="wr" denied_mask="wr" fsuid=1000999 ouid=1000000

I guess I am wondering if I can do what I am attempting without setting security.privileged=true on build-lxd, or at all?

Ok, adding the following raw.apparmor allows the VM to start:

  raw.apparmor: |
    /var/snap/lxd/common/lxd/storage-pools/zfs/custom/default_open-core-boot/** rwk,

Is this intended behavior or should LXD be adding an apparmor rule for each custom volume that is attached to the instance? I think this might only be an issue when running LXD inside of an unprivileged container?

Ok, I am probably missing something obvious here, but when I run make in my local git repo it says ‘LXD built successfully’ and I see it detect the .go file I changed. Yet, the new LXD binary clearly lacks the changes I made because with the following change:

diff --git a/lxd/instance/drivers/driver_qemu.go b/lxd/instance/drivers/driver_qemu.go
index 8f7caaebe..8f71f1599 100644
--- a/lxd/instance/drivers/driver_qemu.go
+++ b/lxd/instance/drivers/driver_qemu.go
@@ -3360,9 +3360,9 @@ func (d *qemu) addDriveConfig(bootIndexes map[string]int, driveConf deviceConfig
        device := map[string]string{
                "id":      fmt.Sprintf("%s%s", qemuDeviceIDPrefix, escapedDeviceName),
                "drive":   blockDev["node-name"].(string),
-               "bus":     "qemu_scsi.0",
-               "channel": "0",
-               "lun":     "1",
+               "bus":     "sata.4",
+               //"channel": "0",
+               //"lun":     "1",
                "serial":  fmt.Sprintf("%s%s", qemuBlockDevIDPrefix, escapedDeviceName),
        }
 
@@ -3372,6 +3372,7 @@ func (d *qemu) addDriveConfig(bootIndexes map[string]int, driveConf deviceConfig
 
        if media == "disk" {
                device["driver"] = "scsi-hd"
+               device["driver"] = "ide-hd"
        } else if media == "cdrom" {
                device["driver"] = "scsi-cd"
        }

I see this error when I try to start the VM:

Error: Failed setting up device via monitor: Failed adding block device for disk device "root": Failed adding device: Bus 'qemu_scsi.0' not found

I did specifically remove the SCSI bus from the VM via raw.qemu.conf because I wanted to make sure the code wasn’t still trying to attach the root disk to the SCSI bus.

Also, I have tried adding a d.logger.Info("...") line around the same code area and I don’t see that log message printed when I run lxd -d or in the instance’s qemu.log. Am I looking in the right place to see a logger message?

Also, I followed the steps here to set up my build environment:

https://linuxcontainers.org/lxd/docs/latest/installing/

Have you followed the steps here?

https://linuxcontainers.org/lxd/docs/master/installing/#install-lxd-from-source

Yes, that is the guide I originally followed while setting up my build container.

So, if I make the following change:

diff --git a/lxd/daemon.go b/lxd/daemon.go
index dd04391ae..68a3d090e 100644
--- a/lxd/daemon.go
+++ b/lxd/daemon.go
@@ -834,7 +834,7 @@ func (d *Daemon) init() error {
                mode = "mock"
        }
 
-       logger.Info("LXD is starting", logger.Ctx{"version": version.Version, "mode": mode, "path": shared.VarPath("")})
+       logger.Info("LXD is startinggggggggg", logger.Ctx{"version": version.Version, "mode": mode, "path": shared.VarPath("")})
 
        /* List of sub-systems to trace */
        trace := d.config.Trace

I see it after running make and restarting the daemon:

INFO   [2023-01-24T19:19:12Z] LXD is startinggggggggg                       mode=normal path=/var/lib/lxd version=5.10

However, the following change:

diff --git a/lxd/instance/drivers/qmp/commands.go b/lxd/instance/drivers/qmp/commands.go
index d176c6077..72e54e535 100644
--- a/lxd/instance/drivers/qmp/commands.go
+++ b/lxd/instance/drivers/qmp/commands.go
@@ -454,7 +454,7 @@ func (m *Monitor) AddBlockDevice(blockDev map[string]any, device map[string]stri
 
        err := m.AddDevice(device)
        if err != nil {
-               return fmt.Errorf("Failed adding device: %w", err)
+               return fmt.Errorf("Failed adding deviceeeeeeeee: %w", err)
        }
 
        revert.Success()

Doesn’t do anything after running make and restarting the daemon. I have also tried make nocache without success:

anderson@build-lxd:~$ lxc start macOS-catalina --console=vga
Error: Failed setting up device via monitor: Failed adding block device for disk device "root": Failed adding device: Bus 'qemu_scsi.0' not found
Try `lxc info --show-log macOS-catalina` for more info

As per my comment above the ‘qemu_scsi.0’ was explicitly removed from the VM configuration because I wanted to make sure my code change was taking effect and somewhere else in the code wasn’t still trying to use ‘qemu_scsi.0’.

More specifically, after my change above, which removed that device bus value, I don’t see how the value can still appear in the binary or in the error message:

anderson@build-lxd:~/git/lxd$ grep -rin qemu_scsi.0
anderson@build-lxd:~/git/lxd$ grep -rin qemu_scsi
lxd/instance/drivers/driver_qemu_config_test.go:194:	t.Run("qemu_scsi", func(t *testing.T) {
lxd/instance/drivers/driver_qemu_config_test.go:201:			[device "qemu_scsi"]
lxd/instance/drivers/driver_qemu_config_test.go:209:			[device "qemu_scsi"]
lxd/instance/drivers/driver_qemu_templates.go:259:		name:    `device "qemu_scsi"`,

@stgraber @tomp It looks like the issue was confusion due to multiple LXD daemons running, including the snap version packaged with LXD images (both container and VM variants) and the one I was building from source and running manually.

I realized this after setting up the build environment from scratch in a Jammy VM only to encounter the same behavior. So, I figured something must be interfering and/or my built version of LXD wasn’t the one that I was interacting with using lxc commands. So, the simple solution for me was:

sudo snap stop lxd
sudo snap disable lxd

Maybe a note about this on the install from source guide would make sense to help others avoid this confusion if they try to develop a test LXD on the same machine where snap LXD is also running?

After doing that and re-importing the VM image I now see error messages that reflect my changes, but it seems one limitation to the QMP approach for adding disks is that SATA disks aren’t hot pluggable:

Error: Failed setting up device via monitor: Failed adding block device for disk device "root": Failed adding device: Bus 'sata.4' does not support hotplugging

So, I am not sure a raw.qemu.qmp value can work for changing the disk controller to SATA using QMP, because there is a catch-22: the QEMU process has to be running to add disks via QMP, but by then a SATA disk can’t be added because hotplugging isn’t supported.

Furthermore, adding the SATA disk via raw.qemu.conf isn’t a great option because then disk paths (e.g. /var/snap/lxd/common/lxd/storage-pools/<pool_name>/virtual-machines/<instance_name>/root.img) have to be hard-coded into the instance configuration, which is ugly.
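
To make the hard-coding problem concrete, here is a minimal sketch of what generating those config sections at start time, instead of typing the path into the instance config, could look like. It is only an illustration under stated assumptions: rootDiskPath and sataRootConf are hypothetical helpers, not LXD functions; the path layout is the one quoted above; the section names are the ones raw.qemu.conf used to expose; "ide-hd" and "sata.4" come from the test diff earlier in this thread; and format = "raw" is an assumption.

```go
package main

import (
	"fmt"
	"path"
)

// rootDiskPath builds the root disk path from its parts, following the
// layout quoted in this thread. lxdDir is e.g. /var/snap/lxd/common/lxd.
func rootDiskPath(lxdDir string, pool string, instance string) string {
	return path.Join(lxdDir, "storage-pools", pool, "virtual-machines", instance, "root.img")
}

// sataRootConf renders INI-style qemu.conf sections that attach the root
// disk as a SATA drive before QEMU starts, so no hotplug is involved.
// The "raw" format is an ASSUMPTION of this sketch.
func sataRootConf(lxdDir string, pool string, instance string) string {
	return fmt.Sprintf(`[drive "lxd_root"]
file = %q
format = "raw"
if = "none"

[device "dev-lxd_root"]
driver = "ide-hd"
bus = "sata.4"
drive = "lxd_root"
`, rootDiskPath(lxdDir, pool, instance))
}

func main() {
	fmt.Print(sataRootConf("/var/snap/lxd/common/lxd", "zfs", "macOS-catalina"))
}
```

Because the sections are rendered from the pool and instance names, nothing path-specific would need to live in the instance configuration.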

Any ideas? Maybe I am overlooking a way that SATA disks can be hotplugged in QEMU?

@stgraber, @amcduffee,

Were you able to resolve this issue?

Thanks.
Jason

Could this patch work for LXD?

I found it on the Kata Containers GitHub repo. They solved the hot-plug issue by adding a PCIe root port.

I know the preference is to have virtio SCSI devices, however I am trying to get the NXOS 9300v image running and I don’t have the ability to convert the image. This image must use SATA.

Of course I could be misinformed and maybe it is possible to convert a 3rd party image…

LXD VMs all already have a pcie-root-port.