VM using dedicated GPUs unable to boot after upgrade to 6.10 - QEMU now complains about having some group in multiple address spaces

Hello,

I have been running Incus on my current setup for the last 6 months without any hiccups, upgrading at every release since 6.3, and I appreciate the features that keep being added.

I have been using my RTX 3090 in a dedicated GPU setup where I can boot different sets of VMs depending on my current needs (1 VM on Windows for the occasional gaming, 1 VM on Debian running some LLMs, 1 VM used for a corporate Windows environment, etc.) without any issue, but the upgrade to 6.10 seems to prevent all of these VMs from booting, with the following info:

> incus info --show-log aiengine
Name: aiengine
Description: 
Status: STOPPED
Type: virtual-machine
Architecture: x86_64
Created: 2024/10/01 22:28 EDT
Last Used: 2025/02/23 04:05 EST

Log:

qemu-system-x86_64:/run/incus/aiengine/qemu.conf:299: vfio 0000:01:00.1: group 17 used in multiple address spaces

I presume that group 17 refers to an IOMMU group, namely the one that contains the GPU and its integrated audio device. I then went down the rabbit hole of double-checking whether I had missed something in my initial setup:

  • Kernel modules nouveau, nvidiafb and snd_hda_intel are properly blacklisted
  • The GRUB kernel command line contains intel_iommu=on and iommu=pt
  • Both device IDs are present in the options vfio-pci line
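
For anyone wanting to confirm which devices actually share an IOMMU group and which driver each one is bound to, here is a minimal sketch that reads the standard sysfs layout. The function name `iommu_group_members` and the `SYSFS_ROOT` override are my own, hypothetical names (the override only exists so the function can be pointed at a fake tree for testing); it is not something Incus provides.

```shell
#!/bin/sh
# Sketch: print the IOMMU group of a PCI device, then every device in that
# group with the kernel driver currently bound to it.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
iommu_group_members() {
    addr="$1"
    root="${SYSFS_ROOT:-/sys}"
    group_link="$root/bus/pci/devices/$addr/iommu_group"

    # Without an active IOMMU the iommu_group symlink does not exist.
    [ -e "$group_link" ] || { echo "no IOMMU group for $addr" >&2; return 1; }

    # Resolve the symlink to the real group directory, e.g. .../iommu_groups/17
    group_dir=$(cd "$group_link" && pwd -P)
    echo "group ${group_dir##*/}"

    for dev in "$group_dir"/devices/*; do
        [ -e "$dev" ] || continue
        name=${dev##*/}
        driver_link="$root/bus/pci/devices/$name/driver"
        if [ -e "$driver_link" ]; then
            # The driver symlink points at .../drivers/<driver-name>
            driver=$(basename "$(readlink "$driver_link")")
        else
            driver="(none)"
        fi
        echo "$name $driver"
    done
}
```

Calling `iommu_group_members 0000:01:00.0` on the host should list both the GPU and its audio function if they are in the same group. For passthrough, every device in the group would normally be expected to show vfio-pci as its driver.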

Here is the device configuration of the VM with this setup that I use the most:

# incus config device show aiengine
root:
  path: /
  pool: incus-nvme
  size: 256GiB
  type: disk
rtx3090:
  gputype: physical
  pci: "0000:01:00.0"
  type: gpu
vtpm:
  path: /dev/tpm0
  type: tpm

I also tried to add a pci device for the integrated audio, but that did not solve my current issue.

Since 6.10 adds IOMMU support in VMs, my config may simply be too unusual for that change to go unnoticed. I am a bit at a loss as to what my next troubleshooting step should be.

There are a couple of other threads reporting issues with the addition of IOMMU, so we’re in the process of moving it under a config flag: Move IOMMU handling under configuration option by stgraber · Pull Request #1715 · lxc/incus · GitHub

There are also some workarounds floating around using raw.qemu.conf to temporarily yank out the IOMMU controller to get things back working.

The raw.qemu.conf workaround I found worked, and upgrading to 6.10.1 fixed it as well.

Thanks!