VFIO-PCI for physical nictype and vgpu mdev

Howdy,

I have tried several ways to give Incus instances, both VMs and containers, full control over some of the hardware in my machine. TL;DR: Debian-based MX Linux 23 with Liquorix kernel 6.3 on a Xeon E3-1270 v6 and a Quadro P1000. The other pertinent hardware is two PCIe networking cards, one for Ethernet and one for wireless.

System:
  Kernel: 6.3.9-1-liquorix-amd64 [6.3-9~mx23+1] arch: x86_64 bits: 64 compiler: gcc v: 12.2.0 parameters: audit=0
    intel_pstate=disable hpet=disable rcupdate.rcu_expedited=1
    BOOT_IMAGE=/BOOT-VOL/misty-mx-bootfs@/vmlinuz-6.3.9-1-liquorix-amd64
    root=ZFS=/ROOT-VOL/misty-mx-rootfs ro root=ZFS=misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs
    quiet splash rd.driver.pre=vfio-pci intel_iommu=on init=/lib/systemd/systemd
  Desktop: KDE Plasma v: 5.27.5 wm: kwin_wayland vt: 2 dm: SDDM Distro: MX-23.2_KDE_x64 Libretto
    July 31 2023 base: Debian GNU/Linux 12 (bookworm)
Machine:
  Type: Desktop System: HP product: HP Z240 SFF Workstation v: N/A serial: <superuser required>
    Chassis: type: 4 serial: <superuser required>
  Mobo: HP model: 802E serial: <superuser required> UEFI: HP v: N51 Ver. 01.89 date: 07/28/2023
CPU:
  Info: model: Intel Xeon E3-1270 v6 bits: 64 type: MT MCP arch: Kaby Lake level: v3 note: check
    built: 2018 process: Intel 14nm family: 6 model-id: 0x9E (158) stepping: 9 microcode: 0xF4
  Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache: L1: 256 KiB
    desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB L3: 8 MiB desc: 1x8 MiB
  Speed (MHz): avg: 985 high: 1400 min/max: 800/3801 boost: enabled scaling: driver: acpi-cpufreq
    governor: ondemand cores: 1: 1400 2: 1285 3: 800 4: 800 5: 1200 6: 800 7: 800 8: 800
    bogomips: 60798
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: itlb_multihit status: KVM: Split huge pages
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: retbleed mitigation: IBRS
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: IBRS, IBPB: conditional, STIBP: conditional, RSB filling,
    PBRSB-eIBRS: Not affected
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort mitigation: TSX disabled
Graphics:
  Device-1: NVIDIA GP107GL [Quadro P1000] driver: nvidia v: 535.129.03
    alternate: nouveau,nvidia_drm,nvidia_vgpu_vfio non-free: 530.xx+ status: current (as of 2023-03)
    arch: Pascal code: GP10x process: TSMC 16nm built: 2016-21 pcie: gen: 3 speed: 8 GT/s lanes: 16
    ports: active: none off: DP-3 empty: DP-1,DP-2,DP-4 bus-ID: 01:00.0 chip-ID: 10de:1cb1
    class-ID: 0300
  Display: wayland server: X.org v: 1.21.1.7 with: Xwayland v: 22.1.9 compositor: kwin_wayland
    driver: X: loaded: nvidia gpu: nvidia display-ID: 0
  Monitor-1: DP-3 res: 1280x1024 size: N/A modes: N/A
  API: OpenGL v: 4.6.0 NVIDIA 535.129.03 renderer: Quadro P1000/PCIe/SSE2 direct-render: Yes
Audio:
  Device-1: Intel 100 Series/C230 Series Family HD Audio vendor: Hewlett-Packard
    driver: snd_hda_intel v: kernel alternate: snd_soc_avs bus-ID: 00:1f.3 chip-ID: 8086:a170
    class-ID: 0403
  Device-2: NVIDIA GP107GL High Definition Audio driver: snd_hda_intel v: kernel pcie: gen: 3
    speed: 8 GT/s lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:0fb9 class-ID: 0403
  API: ALSA v: k6.3.9-1-liquorix-amd64 status: kernel-api tools: alsamixer,amixer
  Server-1: PipeWire v: 1.0.0 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Ethernet I219-LM vendor: Hewlett-Packard driver: e1000e v: kernel port: N/A
    bus-ID: 00:1f.6 chip-ID: 8086:15b7 class-ID: 0200
  IF: eth1 state: down mac: <filter>
  Device-2: Realtek RTL8125 2.5GbE driver: r8169 v: kernel pcie: gen: 2 speed: 5 GT/s lanes: 1
    port: 3000 bus-ID: 03:00.0 chip-ID: 10ec:8125 class-ID: 0200
  IF: eth0 state: up speed: 100 Mbps duplex: full mac: <filter>
  Device-3: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter vendor: Dell
    driver: ath10k_pci v: kernel modules: wl pcie: gen: 1 speed: 2.5 GT/s lanes: 1 bus-ID: 04:00.0
    chip-ID: 168c:0042 class-ID: 0280
  IF: wlan0 state: down mac: <filter>
  IF-ID-1: anvilbr0 state: up speed: 10000 Mbps duplex: unknown mac: <filter>
  IF-ID-2: tap750e3856 state: up speed: 10000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Qualcomm Atheros type: USB driver: btusb v: 0.8 bus-ID: 1-12:3 chip-ID: 0cf3:e009
    class-ID: e001
  Report: hciconfig ID: hci0 rfk-id: 1 state: down bt-service: enabled,running rfk-block:
    hardware: no software: no address: <filter>
  Info: acl-mtu: 1024:8 sco-mtu: 50:8 link-policy: rswitch hold sniff
    link-mode: peripheral accept
RAID:
  Device-1: backup-bottom-pool type: zfs status: ONLINE level: linear raw: size: 29 GiB
    free: 29 GiB allocated: 1.9 MiB zfs-fs: size: 28.09 GiB free: 28.09 GiB
  Components: Online:
  1: sdd4 maj-min: 8:52 size: 29.3 GiB
  Device-2: lakitu-pool type: zfs status: ONLINE level: linear raw: size: 1.86 TiB free: 1.81 TiB
    allocated: 59.8 GiB zfs-fs: size: 1.8 TiB free: 1.74 TiB
  Components: Online:
  1: sda1 maj-min: 8:1 size: 1.86 TiB
  Device-3: misty-incus-backup-pool type: zfs status: ONLINE level: linear raw: size: 29 GiB
    free: 29 GiB allocated: 600 KiB zfs-fs: size: 28.09 GiB free: 28.09 GiB
  Components: Online:
  1: nvme1n1p6 maj-min: 259:6 size: 29.3 GiB
  Device-4: misty-mx-boot-pool type: zfs status: ONLINE level: linear raw: size: 960 MiB
    free: 542 MiB allocated: 418 MiB zfs-fs: size: 831.7 MiB free: 413.8 MiB
  Components: Online:
  1: nvme1n1p4 maj-min: 259:4 size: 1000 MiB
  Device-5: misty-mx-root-pool type: zfs status: ONLINE level: linear raw: size: 48.5 GiB
    free: 38.3 GiB allocated: 10.2 GiB zfs-fs: size: 47 GiB free: 36.81 GiB
  Components: Online:
  1: nvme1n1p5 maj-min: 259:5 size: 48.83 GiB
  Device-6: special type: zfs status: - level: mirror-1 zfs-fs: size: 1.8 TiB free: raw:
    size: 5.86 GiB free: 5.5 GiB allocated: 578 KiB
  Components: Online:
  1: nvme0n1p2 maj-min: 259:9 size: 5.86 GiB
  2: nvme1n1p2 maj-min: 259:2 size: 5.86 GiB
  Device-7: special type: zfs status: - level: mirror-2 zfs-fs: size: 94 GiB free: raw:
    size: 22.5 GiB free: 22 GiB allocated: 513 KiB
  Components: Online:
  1: nvme0n1p1 maj-min: 259:8 size: 22.46 GiB
  2: nvme1n1p1 maj-min: 259:1 size: 22.46 GiB
  Device-8: zippy-top-pool type: zfs status: ONLINE level: linear raw: size: 29 GiB
    free: 23.4 GiB allocated: 5.6 GiB zfs-fs: size: 28.09 GiB free: 22.5 GiB
  Components: Online:
  1: nvme0n1p3 maj-min: 259:10 size: 29.3 GiB
  Device-9: zkiff-pool type: zfs status: ONLINE level: linear raw: size: 119 GiB free: 119 GiB
    allocated: 558 KiB zfs-fs: size: 94 GiB free: 94 GiB
  Components: Online:
  1: sdb1 maj-min: 8:17 size: 48.83 GiB
  2: sdc1 maj-min: 8:33 size: 48.83 GiB
Drives:
  Local Storage: total: raw: 13.36 TiB usable: 13.32 TiB used: 69.45 GiB (0.5%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:7 vendor: SSSTC model: CL1-3D128-Q11 NVMe SSSTC 128GB
    size: 119.24 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD
    serial: <filter> rev: 22301116 temp: 36.9 C scheme: GPT
  ID-2: /dev/nvme1n1 maj-min: 259:0 vendor: Intel model: SSDPEKKW512G7 size: 476.94 GiB
    block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: PSF109C temp: 32.9 C scheme: GPT
  ID-3: /dev/sda maj-min: 8:0 vendor: TeamGroup model: T2532TB size: 1.86 TiB block-size:
    physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> rev: 8A0 scheme: GPT
  ID-4: /dev/sdb maj-min: 8:16 vendor: Seagate model: ST4000NM0033-9ZM170 size: 3.64 TiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 serial: <filter>
    rev: SN07 scheme: GPT
  ID-5: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST4000NM0033-9ZM170 size: 3.64 TiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 serial: <filter>
    rev: SN07 scheme: GPT
  ID-6: /dev/sdd maj-min: 8:48 vendor: Seagate model: ST4000LM024-2U817V size: 3.64 TiB
    block-size: physical: 4096 B logical: 512 B speed: 3.0 Gb/s type: HDD rpm: 5400 serial: <filter>
    rev: SPS5 scheme: GPT
Partition:
  ID-1: / raw-size: N/A size: 44.29 GiB used: 7.48 GiB (16.9%) fs: zfs
    logical: misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs
  ID-2: /boot raw-size: N/A size: 538.5 MiB used: 124.8 MiB (23.2%) fs: zfs
    logical: misty-mx-boot-pool/BOOT-VOL/misty-mx-bootfs
  ID-3: /boot/efi raw-size: 500 MiB size: 499 MiB (99.80%) used: 512 KiB (0.1%) fs: vfat
    dev: /dev/nvme1n1p3 maj-min: 259:3
  ID-4: /home raw-size: N/A size: 38.62 GiB used: 1.81 GiB (4.7%) fs: zfs
    logical: misty-mx-root-pool/HOME-VOL/home
  ID-5: /var/log raw-size: N/A size: 22.56 GiB used: 68.1 MiB (0.3%) fs: zfs
    logical: zippy-top-pool/CACHE-VOL/misty-mx-vlog
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 51.0 C pch: 87.0 C mobo: N/A
  Fan Speeds (RPM): N/A
Repos:
  Packages: pm: dpkg pkgs: 2429 libs: 1294 tools: apt,apt-get,aptitude,nala pm: rpm pkgs: 0
    pm: flatpak pkgs: 0
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list
    1: deb http://mirror.keystealth.org/debian bookworm-updates main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/debian.list
    1: deb http://mirror.keystealth.org/debian bookworm main contrib non-free non-free-firmware
    2: deb http://security.debian.org/debian-security bookworm-security main contrib non-free non-free-firmware
  Active apt repos in: /etc/apt/sources.list.d/librewolf.list
    1: deb [arch=amd64] http://deb.librewolf.net bullseye main
  Active apt repos in: /etc/apt/sources.list.d/mx.list
    1: deb http://la.mxrepo.com/mx/repo/ bookworm main non-free
    2: deb http://la.mxrepo.com/mx/repo/ bookworm ahs
  Active apt repos in: /etc/apt/sources.list.d/rdiffweb.list
    1: deb [arch=amd64 signed-by=/usr/share/keyrings/rdiffweb-keyring.gpg] https://nexus.ikus-soft.com/repository/apt-release-bookworm/ bookworm main
  Active apt repos in: /etc/apt/sources.list.d/zrepl.list
    1: deb [arch=amd64 signed-by=/usr/share/keyrings/zrepl.gpg] https://zrepl.cschwarz.com/apt/debian bookworm main
  Active apt repos in: /etc/apt/sources.list.d/zabbly-incus-stable.sources
    1: deb [arch=amd64] https://pkgs.zabbly.com/incus/stable bookworm main
Info:
  Processes: 646 Uptime: 19m wakeups: 0 Memory: 62.73 GiB used: 4.4 GiB (7.0%) Init: systemd v: 252
  default: graphical tool: systemctl Compilers: gcc: 12.2.0 alt: 12 Client: shell wrapper
  v: 5.2.15-release inxi: 3.3.26
Boot Mode: UEFI

I followed this guide to boot from ZFS, then this guide to install my GPU driver. The driver I’m using is ‘merged,’ meaning that it should simultaneously support desktop applications on the host and the KVM NVIDIA vGPU feature, which uses VFIO to share the card’s resources with virtual machines. I intend to use Incus to spin up a couple of VMs with just a little video RAM for AI, transcoding, and whatever. Eventually, it might also be nice to set up SR-IOV on the host GPU driver with an Arch container for gaming.
I would like a container to ‘own’ the wireless adapter and the onboard Ethernet adapter, as well as run the license server, then share the GPU with some VMs via mdev/vgpu. There are myriad snippets online for enabling vfio-pci (or vfio_pci; I can’t tell which anymore). I found similar posts for AMD GPUs on the MX Forum and with mkinitcpio on the Arch Forum, but I still don’t see the vfio-pci kernel driver in use for any of these devices on my machine! Perhaps Incus works some magic that doesn’t require this to be enabled? Every time I have tried a physical nictype, the VM/container has been unable to use the interface, or at least could not detect any wireless adapter. If it’s magic, my machine seems not to be under the spell.

$ lspci -nnvvk
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5918] (rev 05)
	Subsystem: Hewlett-Packard Company Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [103c:802e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	IOMMU group: 0
	Capabilities: <access denied>
	Kernel driver in use: skl_uncore
	Kernel modules: ie31200_edac

00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05) (prog-if 00 [Normal decode])
	Subsystem: Hewlett-Packard Company 6th-10th Gen Core Processor PCIe Controller (x16) [103c:802e]
	Physical Slot: 1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 121
	IOMMU group: 1
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 4000-4fff [size=4K] [16-bit]
	Memory behind bridge: d2000000-d30fffff [size=17M] [32-bit]
	Prefetchable memory behind bridge: c0000000-d1ffffff [size=288M] [32-bit]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA+ VGA16+ MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31) (prog-if 30 [XHCI])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [103c:802e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 146
	IOMMU group: 2
	Region 0: Memory at d3720000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family Thermal Subsystem [103c:802e]
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin C routed to IRQ 18
	IOMMU group: 2
	Region 0: Memory at d374a000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: intel_pch_thermal
	Kernel modules: intel_pch_thermal

00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family MEI Controller [103c:802e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 126
	IOMMU group: 3
	Region 0: Memory at d374b000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: mei_me

00:16.3 Serial controller [0700]: Intel Corporation 100 Series/C230 Series Chipset Family KT Redirection [8086:a13d] (rev 31) (prog-if 02 [16550])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family KT Redirection [103c:802e]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin D routed to IRQ 19
	IOMMU group: 3
	Region 0: I/O ports at 5040 [size=8]
	Region 1: Memory at d374f000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: serial

00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31) (prog-if 01 [AHCI 1.0])
	Subsystem: Hewlett-Packard Company Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [103c:802e]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 127
	IOMMU group: 4
	Region 0: Memory at d3748000 (32-bit, non-prefetchable) [size=8K]
	Region 1: Memory at d374e000 (32-bit, non-prefetchable) [size=256]
	Region 2: I/O ports at 5048 [size=8]
	Region 3: I/O ports at 5050 [size=4]
	Region 4: I/O ports at 5020 [size=32]
	Region 5: Memory at d374c000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <access denied>
	Kernel driver in use: ahci

00:1b.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #17 [8086:a167] (rev f1) (prog-if 00 [Normal decode])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family PCI Express Root Port [103c:802e]
	Physical Slot: 5
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 122
	IOMMU group: 5
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: [disabled] [16-bit]
	Memory behind bridge: d3600000-d36fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #6 [8086:a115] (rev f1) (prog-if 00 [Normal decode])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family PCI Express Root Port [103c:802e]
	Physical Slot: 31
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 123
	IOMMU group: 6
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 3000-3fff [size=4K] [16-bit]
	Memory behind bridge: d3500000-d35fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.6 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 [8086:a116] (rev f1) (prog-if 00 [Normal decode])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family PCI Express Root Port [103c:802e]
	Physical Slot: 31
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin C routed to IRQ 124
	IOMMU group: 7
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
	I/O behind bridge: [disabled] [16-bit]
	Memory behind bridge: d3200000-d33fffff [size=2M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1d.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 [8086:a118] (rev f1) (prog-if 00 [Normal decode])
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family PCI Express Root Port [103c:802e]
	Physical Slot: 4
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 125
	IOMMU group: 8
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: [disabled] [16-bit]
	Memory behind bridge: d3400000-d34fffff [size=1M] [32-bit]
	Prefetchable memory behind bridge: [disabled] [64-bit]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1f.0 ISA bridge [0601]: Intel Corporation C236 Chipset LPC/eSPI Controller [8086:a149] (rev 31)
	Subsystem: Hewlett-Packard Company C236 Chipset LPC/eSPI Controller [103c:802e]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	IOMMU group: 9

00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family Power Management Controller [103c:802e]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	IOMMU group: 9
	Region 0: Memory at d3744000 (32-bit, non-prefetchable) [disabled] [size=16K]

00:1f.3 Audio device [0403]: Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller [8086:a170] (rev 31)
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family HD Audio Controller [103c:802e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 149
	IOMMU group: 9
	Region 0: Memory at d3740000 (64-bit, non-prefetchable) [size=16K]
	Region 4: Memory at d3730000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel, snd_soc_avs

00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
	Subsystem: Hewlett-Packard Company 100 Series/C230 Series Chipset Family SMBus [103c:802e]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 16
	IOMMU group: 9
	Region 0: Memory at d374d000 (64-bit, non-prefetchable) [size=256]
	Region 4: I/O ports at efa0 [size=32]
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801

00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
	DeviceName: Onboard Lan
	Subsystem: Hewlett-Packard Company Ethernet Connection (2) I219-LM [103c:802e]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 147
	IOMMU group: 10
	Region 0: Memory at d3700000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P1000] [10de:1cb1] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation GP107GL [Quadro P1000] [10de:11bc]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 151
	IOMMU group: 1
	Region 0: Memory at d2000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at 4000 [size=128]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia_vgpu_vfio, nvidia

01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
	Subsystem: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:11bc]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 17
	IOMMU group: 1
	Region 0: Memory at d3000000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

02:00.0 Non-Volatile memory controller [0108]: Intel Corporation SSD 600P Series [8086:f1a5] (rev 03) (prog-if 02 [NVM Express])
	Subsystem: Intel Corporation SSDPEKKW256G7 256GB [8086:390a]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	IOMMU group: 11
	Region 0: Memory at d3600000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: nvme

03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
	Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:0123]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 17
	IOMMU group: 12
	Region 0: I/O ports at 3000 [size=256]
	Region 2: Memory at d3500000 (64-bit, non-prefetchable) [size=64K]
	Region 4: Memory at d3510000 (64-bit, non-prefetchable) [size=16K]
	Expansion ROM at d3520000 [disabled] [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: r8169
	Kernel modules: r8169

04:00.0 Network controller [0280]: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter [168c:0042] (rev 31)
	Subsystem: Dell QCA9377 802.11ac Wireless Network Adapter [1028:1810]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 150
	IOMMU group: 13
	Region 0: Memory at d3200000 (64-bit, non-prefetchable) [size=2M]
	Capabilities: <access denied>
	Kernel driver in use: ath10k_pci
	Kernel modules: ath10k_pci, wl

05:00.0 Non-Volatile memory controller [0108]: Solid State Storage Technology Corporation Device [1e95:9100] (rev 03) (prog-if 02 [NVM Express])
	Subsystem: Silicon Motion, Inc. Device [126f:2263]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	IOMMU group: 14
	Region 0: Memory at d3400000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: nvme
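
The "Kernel driver in use" and "IOMMU group" fields above can also be read per device from sysfs, which makes it quicker to re-check after each reboot. The following is a minimal sketch, assuming the usual sysfs layout under /sys/bus/pci/devices; pci_binding is a hypothetical helper name, and its optional second argument (an alternate sysfs root) exists only so the function can be exercised without real hardware.

```shell
# Print the bound kernel driver and IOMMU group for one PCI device.
# $1: full PCI address including the domain, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; hypothetical testing hook)
pci_binding() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    driver=none
    group=none
    # Both entries are symlinks; the last path component is the name.
    if [ -L "$dev/driver" ]; then
        driver="$(basename "$(readlink "$dev/driver")")"
    fi
    if [ -L "$dev/iommu_group" ]; then
        group="$(basename "$(readlink "$dev/iommu_group")")"
    fi
    printf '%s driver=%s iommu_group=%s\n' "$1" "$driver" "$group"
}

# Addresses of the onboard NIC, the GPU, and the wireless card from the
# lspci listing in this post, with the 0000 domain prefix added.
for addr in 0000:00:1f.6 0000:01:00.0 0000:04:00.0; do
    pci_binding "$addr"
done
```

A device that vfio-pci has claimed would be expected to report driver=vfio-pci here; a device with no driver bound reports driver=none.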

First, I enabled VT-d and VT-x in the UEFI firmware settings and added rd.driver.pre=vfio-pci intel_iommu=on to my GRUB kernel command line. I am confident this works because I can spin up VMs and IOMMU groups are visible in the lspci output above. I then added a modprobe configuration file:

$ cat /etc/modprobe.d/vfio.conf 
# create new : for [ids=***], specify [Vendor-ID : Device-ID]
#wifi,ethernet,gpu
options vfio-pci ids=1028:1810,103c:802e,10de:11bc
#options vfio-pci ids=168c:0042,8086:15b7,10de:1cb1
softdep nvidia pre: vfio-pci
options nvidia-drm modeset=1
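
For reference, the kernel documents the vfio-pci ids= option as a list of vendor:device pairs, each optionally followed by subsystem and class fields. In lspci -nn output the vendor:device pair is the bracketed ID on a device's header line, while the Subsystem: line carries a different pair. A minimal sketch for pulling the ID out of either kind of line follows; bracket_id is a hypothetical helper name, and the two sample lines are copied from the wireless adapter's entry in the lspci listing in this post.

```shell
# Print the last [xxxx:xxxx] ID found on each input line.
# The class code (e.g. [0280]) has no colon, so it is never matched.
bracket_id() {
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p'
}

# Sample lines copied from the lspci listing in this post.
header='04:00.0 Network controller [0280]: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter [168c:0042] (rev 31)'
subsys='Subsystem: Dell QCA9377 802.11ac Wireless Network Adapter [1028:1810]'

echo "vendor:device ID: $(printf '%s\n' "$header" | bracket_id)"
echo "subsystem ID:     $(printf '%s\n' "$subsys" | bracket_id)"
```

Comparing the two values against the active and commented-out options lines shows which kind of ID each line lists.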

Just in case, I ran sudo update-initramfs -u -k all; sudo update-grub, and rebooted (there were reboots all through this process). No matter what I do, though, I can’t get vfio-pci to take any of the cards. The commented-out line is where I tried other IDs. Commenting out the last two lines seems to have no effect on the kernel drivers used, except that I get no GUI without the final line. I’ve also tried using vfio_pci instead of vfio-pci in vfio.conf. The only other clue I can think of is dmesg:

$ sudo dmesg | grep vfio
[    0.000000] Command line: BOOT_IMAGE=/BOOT-VOL/misty-mx-bootfs@/vmlinuz-6.3.9-1-liquorix-amd64 root=ZFS=/ROOT-VOL/misty-mx-rootfs ro root=ZFS=misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs quiet splash rd.driver.pre=vfio-pci intel_iommu=on init=/lib/systemd/systemd
[    0.015725] Kernel command line: audit=0 intel_pstate=disable hpet=disable rcupdate.rcu_expedited=1  BOOT_IMAGE=/BOOT-VOL/misty-mx-bootfs@/vmlinuz-6.3.9-1-liquorix-amd64 root=ZFS=/ROOT-VOL/misty-mx-rootfs ro root=ZFS=misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs quiet splash rd.driver.pre=vfio-pci intel_iommu=on init=/lib/systemd/systemd
[    4.153782] vfio_pci: add [1028:1810[ffffffff:ffffffff]] class 0x000000/00000000
[    4.153788] vfio_pci: add [103c:802e[ffffffff:ffffffff]] class 0x000000/00000000
[    4.153793] vfio_pci: add [10de:11bc[ffffffff:ffffffff]] class 0x000000/00000000
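
One way to read the vfio_pci: add lines is to check each registered pair against the vendor:device IDs of the devices actually present, since the ffffffff:ffffffff part indicates that the subsystem fields were left as wildcards. The sketch below does that on sample text rather than live output. The helper names vfio_registered_ids and id_present are hypothetical, the dmesg sample is copied from above, and the device sample is an assumption: the three target devices from this post rewritten in the compact form that lspci -n prints.

```shell
# Extract the vendor:device pairs that vfio_pci registered (dmesg text on stdin).
vfio_registered_ids() {
    sed -n 's/.*vfio_pci: add \[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\[.*/\1/p'
}

# Succeed if $1 is the vendor:device field (third column) of any
# `lspci -n` style line on stdin.
id_present() {
    awk -v id="$1" '$3 == id { found = 1 } END { exit !found }'
}

# dmesg lines copied from this post.
sample_dmesg='[    4.153782] vfio_pci: add [1028:1810[ffffffff:ffffffff]] class 0x000000/00000000
[    4.153788] vfio_pci: add [103c:802e[ffffffff:ffffffff]] class 0x000000/00000000
[    4.153793] vfio_pci: add [10de:11bc[ffffffff:ffffffff]] class 0x000000/00000000'

# Assumed `lspci -n` rendering of the NIC, GPU, and wireless card.
sample_lspci='00:1f.6 0200: 8086:15b7 (rev 31)
01:00.0 0300: 10de:1cb1 (rev a1)
04:00.0 0280: 168c:0042 (rev 31)'

for id in $(printf '%s\n' "$sample_dmesg" | vfio_registered_ids); do
    if printf '%s\n' "$sample_lspci" | id_present "$id"; then
        echo "$id: a device with this vendor:device pair is present"
    else
        echo "$id: no device in the sample has this vendor:device pair"
    fi
done
```

On a live system the samples could be replaced with the output of sudo dmesg and lspci -n.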

I would like to avoid hacky solutions, like posts I’ve seen with scripts that run after boot to unbind and re-bind the drivers. The least of these evils I’ve seen is in this Reddit post. It just seems wonky to install another package to sort out the order of loading drivers/modules.
I appreciate any input you may have, and help, doubly.

Much obliged,

UpsetMOSFET

Can you show incus info --resources to see what Incus sees for that GPU?

Wow, you’re fast! I rearranged the output to put the GPU and NICs at the top:

$ incus info --resources
GPU:
  NUMA node: 0
  Vendor: NVIDIA Corporation (10de)
  Product: GP107GL [Quadro P1000] (1cb1)
  PCI address: 0000:01:00.0
  Driver: nvidia (535.129.03)
  DRM:
    ID: 0
    Card: card0 (226:0)
    Control: controlD64 (226:0)
    Render: renderD128 (226:128)
  NVIDIA information:
    Architecture: 6.1
    Brand: GeForce
    Model: Quadro P1000
    CUDA Version: 12.2
    NVRM Version: 535.129.03
    UUID: GPU-b30d2560-69ff-6298-1475-d42760ae0e28
  Mdev profiles:
    - nvidia-156 (GRID P40-2B) (12 available)
        num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=12
    - nvidia-215 (GRID P40-2B4) (12 available)
        num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=12
    - nvidia-241 (GRID P40-1B4) (24 available)
        num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=24
    - nvidia-283 (GRID P40-4C) (6 available)
        num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=4096x2400, max_instance=6
    - nvidia-284 (GRID P40-6C) (4 available)
        num_heads=1, frl_config=60, framebuffer=6144M, max_resolution=4096x2400, max_instance=4
    - nvidia-285 (GRID P40-8C) (3 available)
        num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=4096x2400, max_instance=3
    - nvidia-286 (GRID P40-12C) (2 available)
        num_heads=1, frl_config=60, framebuffer=12288M, max_resolution=4096x2400, max_instance=2
    - nvidia-287 (GRID P40-24C) (1 available)
        num_heads=1, frl_config=60, framebuffer=24576M, max_resolution=4096x2400, max_instance=1
    - nvidia-46 (GRID P40-1Q) (24 available)
        num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=24
    - nvidia-47 (GRID P40-2Q) (12 available)
        num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680x4320, max_instance=12
    - nvidia-48 (GRID P40-3Q) (8 available)
        num_heads=4, frl_config=60, framebuffer=3072M, max_resolution=7680x4320, max_instance=8
    - nvidia-49 (GRID P40-4Q) (6 available)
        num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=7680x4320, max_instance=6
    - nvidia-50 (GRID P40-6Q) (4 available)
        num_heads=4, frl_config=60, framebuffer=6144M, max_resolution=7680x4320, max_instance=4
    - nvidia-51 (GRID P40-8Q) (3 available)
        num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=7680x4320, max_instance=3
    - nvidia-52 (GRID P40-12Q) (2 available)
        num_heads=4, frl_config=60, framebuffer=12288M, max_resolution=7680x4320, max_instance=2
    - nvidia-53 (GRID P40-24Q) (1 available)
        num_heads=4, frl_config=60, framebuffer=24576M, max_resolution=7680x4320, max_instance=1
    - nvidia-54 (GRID P40-1A) (24 available)
        num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=24
    - nvidia-55 (GRID P40-2A) (12 available)
        num_heads=1, frl_config=60, framebuffer=2048M, max_resolution=1280x1024, max_instance=12
    - nvidia-56 (GRID P40-3A) (8 available)
        num_heads=1, frl_config=60, framebuffer=3072M, max_resolution=1280x1024, max_instance=8
    - nvidia-57 (GRID P40-4A) (6 available)
        num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=1280x1024, max_instance=6
    - nvidia-58 (GRID P40-6A) (4 available)
        num_heads=1, frl_config=60, framebuffer=6144M, max_resolution=1280x1024, max_instance=4
    - nvidia-59 (GRID P40-8A) (3 available)
        num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280x1024, max_instance=3
    - nvidia-60 (GRID P40-12A) (2 available)
        num_heads=1, frl_config=60, framebuffer=12288M, max_resolution=1280x1024, max_instance=2
    - nvidia-61 (GRID P40-24A) (1 available)
        num_heads=1, frl_config=60, framebuffer=24576M, max_resolution=1280x1024, max_instance=1
    - nvidia-62 (GRID P40-1B) (24 available)
        num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=24

NICs:
  Card 0:
    NUMA node: 0
    Vendor: Realtek Semiconductor Co., Ltd. (10ec)
    Product: RTL8125 2.5GbE Controller (8125)
    PCI address: 0000:03:00.0
    Driver: r8169 (6.3.9-1-liquorix-amd64)
    Ports:
      - Port 0 (ethernet)
        ID: eth0
        Address: a0:36:9f:43:ac:f3
        Supported modes: 10baseT/Half, 10baseT/Full, 100baseT/Half, 100baseT/Full, 1000baseT/Full, 2500baseT/Full
        Supported ports: twisted pair, media-independent
        Port type: twisted pair
        Transceiver type: external
        Auto negotiation: true
        Link detected: true
        Link speed: 100Mbit/s (full duplex)
  Card 1:
    NUMA node: 0
    Vendor: Intel Corporation (8086)
    Product: Ethernet Connection (2) I219-LM (15b7)
    PCI address: 0000:00:1f.6
    Driver: e1000e (6.3.9-1-liquorix-amd64)
    Ports:
      - Port 0 (ethernet)
        ID: eth1
        Address: 3c:52:82:5d:13:44
        Supported modes: 10baseT/Half, 10baseT/Full, 100baseT/Half, 100baseT/Full, 1000baseT/Full
        Supported ports: twisted pair
        Port type: twisted pair
        Transceiver type: internal
        Auto negotiation: true
        Link detected: false
  Card 2:
    NUMA node: 0
    Vendor: Qualcomm Atheros (168c)
    Product: QCA9377 802.11ac Wireless Network Adapter (0042)
    PCI address: 0000:04:00.0
    Driver: ath10k_pci (6.3.9-1-liquorix-amd64)
    Ports:
      - Port 0 (ethernet)
        ID: wlan0
        Address: 8c:c8:4b:f4:57:df
        Auto negotiation: false
        Link detected: false

CPU (x86_64):
  Vendor: GenuineIntel
  Name: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
  Caches:
    - Level 1 (type: Data): 32KiB
    - Level 1 (type: Instruction): 32KiB
    - Level 2 (type: Unified): 256KiB
    - Level 3 (type: Unified): 8MiB
  Cores:
    - Core 0
      Frequency: 4002Mhz
      Threads:
        - 0 (id: 0, online: true, NUMA node: 0)
        - 1 (id: 4, online: true, NUMA node: 0)
    - Core 1
      Frequency: 4000Mhz
      Threads:
        - 0 (id: 1, online: true, NUMA node: 0)
        - 1 (id: 5, online: true, NUMA node: 0)
    - Core 2
      Frequency: 3999Mhz
      Threads:
        - 0 (id: 2, online: true, NUMA node: 0)
        - 1 (id: 6, online: true, NUMA node: 0)
    - Core 3
      Frequency: 4002Mhz
      Threads:
        - 0 (id: 3, online: true, NUMA node: 0)
        - 1 (id: 7, online: true, NUMA node: 0)
  Frequency: 4000Mhz (min: 800Mhz, max: 3801Mhz)

Memory:
  Free: 55.30GiB
  Used: 7.42GiB
  Total: 62.73GiB

Disks:
  Disk 0:
    NUMA node: 0
    ID: nvme0n1
    Device: 259:0
    Model: INTEL SSDPEKKW512G7
    Type: nvme
    Size: 476.94GiB
    WWN: eui.0000000001000000e4d25ce9d3974d01
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: nvme0n1p1
        Device: 259:1
        Read-Only: false
        Size: 22.46GiB
      - Partition 2
        ID: nvme0n1p2
        Device: 259:2
        Read-Only: false
        Size: 5.86GiB
      - Partition 3
        ID: nvme0n1p3
        Device: 259:3
        Read-Only: false
        Size: 500.00MiB
      - Partition 4
        ID: nvme0n1p4
        Device: 259:4
        Read-Only: false
        Size: 1000.00MiB
      - Partition 5
        ID: nvme0n1p5
        Device: 259:5
        Read-Only: false
        Size: 48.83GiB
      - Partition 6
        ID: nvme0n1p6
        Device: 259:6
        Read-Only: false
        Size: 29.30GiB
  Disk 1:
    NUMA node: 0
    ID: nvme1n1
    Device: 259:7
    Model: CL1-3D128-Q11 NVMe SSSTC 128GB
    Type: nvme
    Size: 119.24GiB
    WWN: eui.38f601563154d5a4
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: nvme1n1p1
        Device: 259:8
        Read-Only: false
        Size: 22.46GiB
      - Partition 2
        ID: nvme1n1p2
        Device: 259:9
        Read-Only: false
        Size: 5.86GiB
      - Partition 3
        ID: nvme1n1p3
        Device: 259:10
        Read-Only: false
        Size: 29.30GiB
  Disk 2:
    NUMA node: 0
    ID: sda
    Device: 8:0
    Model: TEAM T2532TB
    Type: sata
    Size: 1.86TiB
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: sda1
        Device: 8:1
        Read-Only: false
        Size: 1.86TiB
  Disk 3:
    NUMA node: 0
    ID: sdb
    Device: 8:16
    Model: ST4000NM0033-9ZM170
    Type: sata
    Size: 3.64TiB
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: sdb1
        Device: 8:17
        Read-Only: false
        Size: 48.83GiB
  Disk 4:
    NUMA node: 0
    ID: sdc
    Device: 8:32
    Model: ST4000NM0033-9ZM170
    Type: sata
    Size: 3.64TiB
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: sdc1
        Device: 8:33
        Read-Only: false
        Size: 48.83GiB
  Disk 5:
    NUMA node: 0
    ID: sdd
    Device: 8:48
    Model: ST4000LM024-2U817V
    Type: sata
    Size: 3.64TiB
    Read-Only: false
    Removable: false
    Partitions:
      - Partition 1
        ID: sdd1
        Device: 8:49
        Read-Only: false
        Size: 500.00MiB
      - Partition 2
        ID: sdd2
        Device: 8:50
        Read-Only: false
        Size: 29.30GiB
      - Partition 3
        ID: sdd3
        Device: 8:51
        Read-Only: false
        Size: 1.80TiB
      - Partition 4
        ID: sdd4
        Device: 8:52
        Read-Only: false
        Size: 29.30GiB

Okay, so it looks like the mdev stuff was detected.

Maybe just try something like:

incus create images:ubuntu/22.04 v1 --vm
incus config device add v1 p1000 gpu gputype=mdev mdev=nvidia-156 pci=0000:01:00.0
incus start v1

I forgot! This error is why I started digging into vfio-pci, in the first place…

$ incus create images:ubuntu/22.04 v1 --vm
Creating v1
                                              
The instance you are starting doesn't have any network attached to it.
  To create a new network, use: incus network create
  To attach a network to an instance, use: incus network attach

$ incus config device add v1 p1000 gpu gputype=mdev mdev=nvidia-156 pci=0000:01:00.0
Device p1000 added to v1

$ incus start v1
Error: Failed to run: forklimits limit=memlock:unlimited:unlimited fd=3 fd=4 -- /opt/incus/bin/qemu-system-x86_64 -S -name v1 -uuid 80f0667d-1c70-4ce0-9b93-852df9e80ab4 -daemonize -cpu host,hv_passthrough -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=allow,resourcecontrol=deny -readconfig /run/incus/anvil-project_v1/qemu.conf -spice unix=on,disable-ticketing=on,addr=/run/incus/anvil-project_v1/qemu.spice -pidfile /run/incus/anvil-project_v1/qemu.pid -D /var/log/incus/anvil-project_v1/qemu.log -smbios type=2,manufacturer=LinuxContainers,product=Incus -runas incus: : exit status 1
Try `incus info --show-log v1` for more info

$ incus info --show-log v1
Name: v1
Status: STOPPED
Type: virtual-machine
Architecture: x86_64
Created: 2024/03/05 11:39 MST
Last Used: 1969/12/31 17:00 MST

Log:

qemu-system-x86_64:/run/incus/anvil-project_v1/qemu.conf:265: vfio 87cdc5f8-1013-4e4c-b17e-c9f8d2351660: error getting device from group 15: Input/output error
Verify all devices in group 15 are bound to vfio-<bus> or pci-stub and not already in use

Ah, that’s an odd QEMU error… Particularly unusual to get that on mdev given it’s not really moving a true PCIe device into the VM (like it would with SR-IOV).

Could the network card issues be a clue? I’m not sure how to diagnose those, either. I can see network interfaces in the instance. The wireless card’s interface usually shows up as eth2 or wlsXX, but neither the VM nor the container recognizes it as a wireless adapter, even after installing the appropriate drivers. This Reddit post encouraged me to look into vfio-pci more.

Try ensuring all devices within the iommu_group are bound to their vfio bus driver.
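
One way to check this is to walk the group's directory in sysfs and print the driver each device is bound to. This is a minimal sketch, assuming the group number 15 from the QEMU error above; iommu_group_drivers and its optional sysfs-root argument are hypothetical names used only for this example.

```shell
#!/bin/sh
# List every device in an IOMMU group with the driver it is bound to.
# Usage: iommu_group_drivers GROUP [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys; it is a parameter only so the function can be
# pointed at a test tree (an assumption of this sketch).
iommu_group_drivers() {
    group=$1
    sysfs=${2:-/sys}

    for dev in "$sysfs/kernel/iommu_groups/$group/devices"/*; do
        # An unmatched glob stays literal; skip it so an empty or missing
        # group simply prints nothing.
        [ -e "$dev" ] || continue

        if [ -e "$dev/driver" ]; then
            # "driver" is a symlink to the driver directory; its last
            # path component is the driver name.
            drv=$(basename "$(readlink -f "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

Running iommu_group_drivers 15 prints one line per device. Any entry whose driver is not a vfio driver (or pci-stub) is the kind of device the QEMU message is complaining about. With mdev the group may only exist while the VM is being started, so an empty result does not necessarily mean the group number is wrong.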

I just tried adding options vfio_iommu_type1 allow_unsafe_interrupts=1, which fixed a user’s problem in this thread on the Gentoo forums. I get the same error as above when I try to start the VM.

$ sudo dmesg | grep vfio
[    0.000000] Command line: BOOT_IMAGE=/BOOT-VOL/misty-mx-bootfs@/vmlinuz-6.3.9-1-liquorix-amd64 root=ZFS=/ROOT-VOL/misty-mx-rootfs ro root=ZFS=misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs quiet splash rd.driver.pre=vfio-pci intel_iommu=on init=/lib/systemd/systemd
[    0.015791] Kernel command line: audit=0 intel_pstate=disable hpet=disable rcupdate.rcu_expedited=1  BOOT_IMAGE=/BOOT-VOL/misty-mx-bootfs@/vmlinuz-6.3.9-1-liquorix-amd64 root=ZFS=/ROOT-VOL/misty-mx-rootfs ro root=ZFS=misty-mx-root-pool/ROOT-VOL/misty-mx-rootfs quiet splash rd.driver.pre=vfio-pci intel_iommu=on init=/lib/systemd/systemd
[    4.079607] vfio_pci: add [1028:1810[ffffffff:ffffffff]] class 0x000000/00000000
[    4.079611] vfio_pci: add [103c:802e[ffffffff:ffffffff]] class 0x000000/00000000
[    4.079614] vfio_pci: add [10de:11bc[ffffffff:ffffffff]] class 0x000000/00000000
[  136.729047] nvidia-vgpu-vfio 5134f6c1-94fe-4279-a017-f3964846514d: Adding to iommu group 15
[  136.838197] [nvidia-vgpu-vfio] 5134f6c1-94fe-4279-a017-f3964846514d: start failed. status: 0x1 
[  136.858603] nvidia-vgpu-vfio 5134f6c1-94fe-4279-a017-f3964846514d: Removing from iommu group 15