How to pass Nvidia MIG devices using LXC config

Hi there!

I have been trying for several months to pass a single MIG device through to an LXC container, but I have hit a wall and do not know how to get past it. The issue seems to be mostly on the NVIDIA side, but there also seems to be something LXC-specific needed to get this working that I am missing.

Firstly, I am not using the standard LXC installation but the customized Proxmox version (though the differences are small). LXD is not supported there, so I need to work with manual LXC configs.

The issue:

When creating a new container with the NVIDIA_VISIBLE_DEVICES environment variable, I can only set the value all; every other value throws an unknown device error. I have pinned the issue down to the following: nvidia-container-cli list does not show the MIG devices (although it probably should, as for example in this post in the NVIDIA forum):

/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvidia-modeset
/dev/nvidia0
/usr/lib/nvidia/current/nvidia-smi
/usr/lib/nvidia/current/nvidia-debugdump
/usr/bin/nvidia-persistenced
/usr/bin/nv-fabricmanager
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-cfg.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-opencl.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ptxjitcompiler.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvcuvid.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libGLX_nvidia.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libEGL_nvidia.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libGLESv2_nvidia.so.530.30.02
/usr/lib/x86_64-linux-gnu/nvidia/current/libGLESv1_CM_nvidia.so.530.30.02
/usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.530.30.02
/run/nvidia-persistenced/socket
/lib/firmware/nvidia/530.30.02/gsp_ga10x.bin
/lib/firmware/nvidia/530.30.02/gsp_tu10x.bin

Whereas nvidia-smi -L shows this:

GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-595a802e-9268-e1f8-cad5-9a69202e4cd5)
  MIG 2g.20gb     Device  0: (UUID: MIG-5a62f918-fd78-5a32-9614-008a1471bf61)
  MIG 1g.20gb     Device  1: (UUID: MIG-4759842f-f141-5b89-8f18-a8fc14133926)
  MIG 1g.10gb     Device  2: (UUID: MIG-c28f1ba1-5c54-5f04-a3ce-8b4eabb6d542)
  MIG 1g.10gb     Device  3: (UUID: MIG-566c026a-0826-5fd6-847a-b32e59131bdb)
  MIG 1g.10gb     Device  4: (UUID: MIG-c64a168b-dba3-5bc5-b5d4-0be7af011017)
  MIG 1g.10gb     Device  5: (UUID: MIG-ea6e7ab3-3a7e-5a6b-a133-2ed92d29bd97)
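
For reference, the MIG UUIDs can be pulled out of that listing with a few lines of Python. This is only a minimal sketch: the function name parse_mig_devices and the regular expression are my own assumptions about the nvidia-smi -L text format shown above, not part of any NVIDIA tooling.

```python
import re

# Sample text copied from the nvidia-smi -L output above (shortened to two MIG lines).
SAMPLE = """\
GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-595a802e-9268-e1f8-cad5-9a69202e4cd5)
  MIG 2g.20gb     Device  0: (UUID: MIG-5a62f918-fd78-5a32-9614-008a1471bf61)
  MIG 1g.20gb     Device  1: (UUID: MIG-4759842f-f141-5b89-8f18-a8fc14133926)
"""

# Assumed line shape: "MIG <profile> Device <index>: (UUID: MIG-<uuid>)"
MIG_LINE = re.compile(
    r"^\s*MIG\s+(\S+)\s+Device\s+(\d+):\s+\(UUID:\s+(MIG-[0-9a-f-]+)\)"
)


def parse_mig_devices(text):
    """Return (profile, index, uuid) tuples for every MIG line in the text."""
    devices = []
    for line in text.splitlines():
        match = MIG_LINE.match(line)
        if match:
            profile, index, uuid = match.groups()
            devices.append((profile, int(index), uuid))
    return devices


if __name__ == "__main__":
    for profile, index, uuid in parse_mig_devices(SAMPLE):
        print(index, profile, uuid)
```

These MIG-... UUIDs are the values I would expect to be able to put into NVIDIA_VISIBLE_DEVICES instead of all.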

The command that LXD apparently uses to find the MIG devices, nvidia-container-cli list --csv, also shows only the main GPU:

NVRM version,CUDA version
530.30.02,12.1

Device Index,Device Minor,Model,Brand,GPU UUID,Bus Location,Architecture
0,0,NVIDIA A100 80GB PCIe,Nvidia,GPU-595a802e-9268-e1f8-cad5-9a69202e4cd5,00000000:ca:00.0,8.0
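
To make the gap explicit, the CSV output can be checked for MIG entries programmatically. The following is a rough sketch under the assumption that the output always consists of CSV tables separated by a blank line, as in the sample above; the helper names parse_csv_tables and listed_uuids are hypothetical and only for illustration.

```python
import csv
import io

# Sample text copied verbatim from the CSV output above.
SAMPLE = """\
NVRM version,CUDA version
530.30.02,12.1

Device Index,Device Minor,Model,Brand,GPU UUID,Bus Location,Architecture
0,0,NVIDIA A100 80GB PCIe,Nvidia,GPU-595a802e-9268-e1f8-cad5-9a69202e4cd5,00000000:ca:00.0,8.0
"""


def parse_csv_tables(text):
    """Split the text on blank lines and parse each chunk as a CSV table with a header row."""
    tables = []
    for chunk in text.strip().split("\n\n"):
        tables.append(list(csv.DictReader(io.StringIO(chunk))))
    return tables


def listed_uuids(text):
    """Collect the 'GPU UUID' column from every table that has one."""
    return [
        row["GPU UUID"]
        for table in parse_csv_tables(text)
        for row in table
        if "GPU UUID" in row
    ]


if __name__ == "__main__":
    uuids = listed_uuids(SAMPLE)
    print("listed UUIDs:", uuids)
    print("any MIG UUID listed:", any(u.startswith("MIG-") for u in uuids))
```

With the sample above, only the GPU-... UUID of the physical card is found, and none of the MIG-... UUIDs that nvidia-smi -L reports.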

Funnily enough, however, the nvidia-ctk cdi generate command outputs all MIG devices correctly.

I believe I have missed something in the setup of nvidia-container-toolkit or libnvidia-container that prevents it from accessing the MIG devices.

As a side note, I noticed that nvidia-container-cli changes the user, so I tried passing root there, but that did not change anything.

It would be really helpful if somebody here had an idea of what I am missing!

Additionally, some (maybe) helpful logs:

Versions:

root@gpu1:~# uname -a
Linux gpu1 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) x86_64 GNU/Linux
root@gpu1:~# apt list --installed | grep nvidia
glx-alternative-nvidia/stable,now 1.2.1~deb11u1 amd64 [installed,automatic]
libegl-nvidia0/unknown,now 530.30.02-1 amd64 [installed,automatic]
libgl1-nvidia-glvnd-glx/unknown,now 530.30.02-1 amd64 [installed,automatic]
libgles-nvidia1/unknown,now 530.30.02-1 amd64 [installed,automatic]
libgles-nvidia2/unknown,now 530.30.02-1 amd64 [installed,automatic]
libglx-nvidia0/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-cfg1/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-compiler/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-container-tools/buster,now 1.13.1-1 amd64 [installed]
libnvidia-container1/buster,now 1.13.1-1 amd64 [installed]
libnvidia-eglcore/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-glcore/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-glvkspirv/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-ml-dev/stable,now 11.2.152~11.2.2-3+deb11u3 amd64 [installed,automatic]
libnvidia-ml1/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-nvvm4/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-ptxjitcompiler1/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-rtcore/unknown,now 530.30.02-1 amd64 [installed,automatic]
libnvidia-wayland-client/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-alternative/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-container-toolkit-base/buster,now 1.13.1-1 amd64 [installed,automatic]
nvidia-container-toolkit/buster,now 1.13.1-1 amd64 [installed]
nvidia-cuda-dev/stable,now 11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-cuda-gdb/stable,now 11.2.152~11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-cuda-mps/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-cuda-toolkit-doc/stable,now 11.2.2-3+deb11u3 all [installed,automatic]
nvidia-cuda-toolkit/stable,now 11.2.2-3+deb11u3 amd64 [installed]
nvidia-driver-bin/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-driver-libs/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-driver/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-egl-common/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-egl-icd/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-fabricmanager-530/unknown,now 530.30.02-1 amd64 [installed]
nvidia-installer-cleanup/stable,now 20151021+13 amd64 [installed,automatic]
nvidia-kernel-common/stable,now 20151021+13 amd64 [installed,automatic]
nvidia-kernel-dkms/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-kernel-support/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-legacy-check/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-modprobe/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-opencl-common/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-opencl-dev/stable,now 11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-opencl-icd/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-openjdk-8-jre/stable,now 9.+8u332-ga-1~~deb9u1~11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-persistenced/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-profiler/stable,now 11.2.152~11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-settings/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-smi/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-support/stable,now 20151021+13 amd64 [installed,automatic]
nvidia-vdpau-driver/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-visual-profiler/stable,now 11.2.152~11.2.2-3+deb11u3 amd64 [installed,automatic]
nvidia-vulkan-common/unknown,now 530.30.02-1 amd64 [installed,automatic]
nvidia-vulkan-icd/unknown,now 530.30.02-1 amd64 [installed,automatic]
xserver-xorg-video-nvidia/unknown,now 530.30.02-1 amd64 [installed,automatic]
root@gpu1:~# apt list --installed | grep lxc
lxc-pve/stable,now 5.0.2-2 amd64 [installed]
lxcfs/stable,now 5.0.3-pve1 amd64 [installed]
pve-lxc-syscalld/stable,now 1.2.2-1 amd64 [installed]

Log of starting container:

INFO     confile - ../src/lxc/confile.c:set_config_idmaps:2273 - Read uid map: type u nsid 0 hostid 100000 range 65536
INFO     confile - ../src/lxc/confile.c:set_config_idmaps:2273 - Read uid map: type g nsid 0 hostid 100000 range 65536
INFO     lsm - ../src/lxc/lsm/lsm.c:lsm_init_static:38 - Initialized LSM security driver AppArmor
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "103", config section "lxc"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:unpriv_systemd_create_scope:1227 - Running privileged, not using a systemd unit
DEBUG    seccomp - ../src/lxc/seccomp.c:parse_config_v2:656 - Host native arch is [3221225534]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "reject_force_umount  # comment this to allow umount -f;  not recommended"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "[all]"
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "kexec_load errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[246:kexec_load] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "open_by_handle_at errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[304:open_by_handle_at] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "init_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[175:init_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "finit_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[313:finit_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "delete_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[176:delete_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "ioctl errno 1 [1,0x9400,SCMP_CMP_MASKED_EQ,0xff00]"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[16:ioctl] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[16:ioctl] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[16:ioctl] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "keyctl errno 38"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[250:keyctl] action[327718:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[250:keyctl] action[327718:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[250:keyctl] action[327718:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:1017 - Merging compat seccomp contexts into main context
INFO     start - ../src/lxc/start.c:lxc_init:881 - Container "103" is initialized
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_create:1391 - The monitor process uses "lxc.monitor/103" as cgroup
DEBUG    storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
DEBUG    storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_create:1499 - The container process uses "lxc/103/ns" as inner and "lxc/103" as limit cgroup
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWUSER
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWNS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWPID
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWUTS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWIPC
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWCGROUP
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved user namespace via fd 17 and stashed path as user:/proc/34873/fd/17
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved mnt namespace via fd 18 and stashed path as mnt:/proc/34873/fd/18
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved pid namespace via fd 19 and stashed path as pid:/proc/34873/fd/19
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved uts namespace via fd 20 and stashed path as uts:/proc/34873/fd/20
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved ipc namespace via fd 21 and stashed path as ipc:/proc/34873/fd/21
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved cgroup namespace via fd 22 and stashed path as cgroup:/proc/34873/fd/22
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newuidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newgidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:lxc_map_ids:3634 - Functional newuidmap and newgidmap binary found
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits:3251 - Limits for the unified cgroup hierarchy have been setup
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newuidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newgidmap" does have the setuid bit set
INFO     conf - ../src/lxc/conf.c:lxc_map_ids:3632 - Caller maps host root. Writing mapping directly
NOTICE   utils - ../src/lxc/utils.c:lxc_drop_groups:1367 - Dropped supplimentary groups
INFO     start - ../src/lxc/start.c:do_start:1104 - Unshared CLONE_NEWNET
NOTICE   utils - ../src/lxc/utils.c:lxc_drop_groups:1367 - Dropped supplimentary groups
NOTICE   utils - ../src/lxc/utils.c:lxc_switch_uid_gid:1343 - Switched to gid 0
NOTICE   utils - ../src/lxc/utils.c:lxc_switch_uid_gid:1352 - Switched to uid 0
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved net namespace via fd 5 and stashed path as net:/proc/34873/fd/5
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/lxcnetaddbr" for container "103", config section "net"
DEBUG    network - ../src/lxc/network.c:netdev_configure_server_veth:852 - Instantiated veth tunnel "veth103i0 <--> vethLRq7Ho"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_rootfs:1437 - Mounted rootfs "/var/lib/lxc/103/rootfs" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
INFO     conf - ../src/lxc/conf.c:setup_utsname:876 - Set hostname to "gpu-test"
DEBUG    network - ../src/lxc/network.c:setup_hw_addr:3821 - Mac address "BE:E3:B2:A5:1C:78" on "eth0" has been setup
DEBUG    network - ../src/lxc/network.c:lxc_network_setup_in_child_namespaces_common:3962 - Network device "eth0" has been setup
INFO     network - ../src/lxc/network.c:lxc_setup_network_in_child_namespaces:4019 - Finished setting up network devices with caller assigned names
INFO     conf - ../src/lxc/conf.c:mount_autodev:1220 - Preparing "/dev"
INFO     conf - ../src/lxc/conf.c:mount_autodev:1281 - Prepared "/dev"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:736 - Invalid argument - Tried to ensure procfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:759 - Invalid argument - Tried to ensure sysfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/kernel/debug" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/kernel/security" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/fs/pstore" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "mqueue" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/mqueue" with filesystem type "mqueue"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/firmware/efi/efivars" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/firmware/efi/efivars" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/firmware/efi/efivars" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/firmware/efi/efivars" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/firmware/efi/efivars" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/proc/sys/fs/binfmt_misc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/sys/fs/binfmt_misc" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/proc/sys/fs/binfmt_misc" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/proc/sys/fs/binfmt_misc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/sys/fs/binfmt_misc" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "proc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/proc" with filesystem type "proc"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "sys" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/sys" with filesystem type "sysfs"
DEBUG    cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgroupfs_mount:1909 - Mounted cgroup filesystem cgroup2 onto 19((null))
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/nvidia" for container "103", config section "lxc"
DEBUG    conf - ../src/lxc/conf.c:run_buffer:311 - Script exec /usr/share/lxc/hooks/nvidia 103 lxc mount produced output: INFO: Writing nvidia-container-cli log at /var/lib/lxc/103/rootfs/nvidia.log.

DEBUG    conf - ../src/lxc/conf.c:run_buffer:311 - Script exec /usr/share/lxc/hooks/nvidia 103 lxc mount produced output: + exec nvidia-container-cli --debug=/var/lib/lxc/103/rootfs/nvidia.log --user configure --no-cgroups --ldconfig=@/usr/sbin/ldconfig --device=all --compute --utility /usr/lib/x86_64-linux-gnu/lxc/rootfs

INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxcfs/lxc.mount.hook" for container "103", config section "lxc"
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/lxc-pve-autodev-hook" for container "103", config section "lxc"
INFO     conf - ../src/lxc/conf.c:lxc_fill_autodev:1318 - Populating "/dev"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/full) to 18(full)
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/null) to 18(null)
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/random) to 18(random)
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/tty) to 18(tty)
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/urandom) to 18(urandom)
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1402 - Bind mounted host device 16(dev/zero) to 18(zero)
INFO     conf - ../src/lxc/conf.c:lxc_fill_autodev:1406 - Populated "/dev"
INFO     conf - ../src/lxc/conf.c:lxc_transient_proc:3804 - Caller's PID is 1; /proc/self points to 1
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1780 - Attached detached devpts mount 20 to 18/pts
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1866 - Created "/dev/ptmx" file as bind mount target
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1873 - Bind mounted "/dev/pts/ptmx" to "/dev/ptmx"
DEBUG    conf - ../src/lxc/conf.c:lxc_allocate_ttys:1105 - Created tty with ptx fd 22 and pty fd 23 and index 1
DEBUG    conf - ../src/lxc/conf.c:lxc_allocate_ttys:1105 - Created tty with ptx fd 24 and pty fd 25 and index 2
INFO     conf - ../src/lxc/conf.c:lxc_allocate_ttys:1110 - Finished creating 2 tty devices
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_ttys:1066 - Bind mounted "pts/1" onto "tty1"
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_ttys:1066 - Bind mounted "pts/2" onto "tty2"
INFO     conf - ../src/lxc/conf.c:lxc_setup_ttys:1073 - Finished setting up 2 /dev/tty<N> device(s)
INFO     conf - ../src/lxc/conf.c:setup_personality:1946 - Set personality to "0lx0"
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3232 - Capabilities have been setup
NOTICE   conf - ../src/lxc/conf.c:lxc_setup:4511 - The container "103" is set up
INFO     apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set_at:1189 - Set AppArmor label to "lxc-103_</var/lib/lxc>//&:lxc-103_<-var-lib-lxc>:"
INFO     apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set:1234 - Changed AppArmor profile to lxc-103_</var/lib/lxc>//&:lxc-103_<-var-lib-lxc>:
DEBUG    terminal - ../src/lxc/terminal.c:lxc_terminal_peer_default:696 - No such device - The process does not have a controlling terminal
NOTICE   start - ../src/lxc/start.c:start:2194 - Exec'ing "/sbin/init"
NOTICE   start - ../src/lxc/start.c:post_start:2205 - Started "/sbin/init" with pid "34893"
NOTICE   start - ../src/lxc/start.c:signal_handler:446 - Received 17 from pid 34889 instead of container init 34893
TASK OK

nvidia.log of starting container:

-- WARNING, the following logs are for debugging purposes only --

I0515 14:37:36.813860 2 nvc.c:376] initializing library context (version=1.13.1, build=6f4aea0fca16aaff01bab2567adb34ec30847a0e)
I0515 14:37:36.813897 2 nvc.c:350] using root /
I0515 14:37:36.813901 2 nvc.c:351] using ldcache /etc/ld.so.cache
I0515 14:37:36.813905 2 nvc.c:352] using unprivileged user 0:0
I0515 14:37:36.813922 2 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0515 14:37:36.814059 2 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
I0515 14:37:36.814211 18 rpc.c:71] starting driver rpc service
I0515 14:37:36.821351 19 rpc.c:71] starting nvcgo rpc service
I0515 14:37:36.822185 2 nvc_container.c:240] configuring container with 'no-cgroups compute utility standalone'
I0515 14:37:36.822428 2 nvc_container.c:262] setting pid to 1
I0515 14:37:36.822435 2 nvc_container.c:263] setting rootfs to /usr/lib/x86_64-linux-gnu/lxc/rootfs
I0515 14:37:36.822439 2 nvc_container.c:264] setting owner to 0:0
I0515 14:37:36.822442 2 nvc_container.c:265] setting bins directory to /usr/bin
I0515 14:37:36.822446 2 nvc_container.c:266] setting libs directory to /usr/lib/x86_64-linux-gnu
I0515 14:37:36.822450 2 nvc_container.c:267] setting libs32 directory to /usr/lib/i386-linux-gnu
I0515 14:37:36.822454 2 nvc_container.c:268] setting cudart directory to /usr/local/cuda
I0515 14:37:36.822457 2 nvc_container.c:269] setting ldconfig to @/usr/sbin/ldconfig (host relative)
I0515 14:37:36.822461 2 nvc_container.c:270] setting mount namespace to /usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/1/ns/mnt
I0515 14:37:36.822468 2 nvc_info.c:796] requesting driver information with ''
I0515 14:37:36.823690 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.530.30.02
I0515 14:37:36.823726 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.530.30.02
I0515 14:37:36.823772 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ptxjitcompiler.so.530.30.02
I0515 14:37:36.823813 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-opencl.so.530.30.02
I0515 14:37:36.823854 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.530.30.02
I0515 14:37:36.823911 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.530.30.02
I0515 14:37:36.823955 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.530.30.02
I0515 14:37:36.823981 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.530.30.02
I0515 14:37:36.824006 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.530.30.02
I0515 14:37:36.824042 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.530.30.02
I0515 14:37:36.824070 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.530.30.02
I0515 14:37:36.824097 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.530.30.02
I0515 14:37:36.824153 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-cfg.so.530.30.02
I0515 14:37:36.824200 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvcuvid.so.530.30.02
I0515 14:37:36.824368 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.530.30.02
I0515 14:37:36.824477 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libGLX_nvidia.so.530.30.02
I0515 14:37:36.824522 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libGLESv2_nvidia.so.530.30.02
I0515 14:37:36.824566 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libGLESv1_CM_nvidia.so.530.30.02
I0515 14:37:36.824606 2 nvc_info.c:174] selecting /usr/lib/x86_64-linux-gnu/nvidia/current/libEGL_nvidia.so.530.30.02
W0515 14:37:36.824631 2 nvc_info.c:400] missing library libnvidia-nscq.so
W0515 14:37:36.824636 2 nvc_info.c:400] missing library libcudadebugger.so
W0515 14:37:36.824640 2 nvc_info.c:400] missing library libnvidia-fatbinaryloader.so
W0515 14:37:36.824643 2 nvc_info.c:400] missing library libnvidia-allocator.so
W0515 14:37:36.824647 2 nvc_info.c:400] missing library libnvidia-pkcs11.so
W0515 14:37:36.824651 2 nvc_info.c:400] missing library libvdpau_nvidia.so
W0515 14:37:36.824654 2 nvc_info.c:400] missing library libnvidia-encode.so
W0515 14:37:36.824658 2 nvc_info.c:400] missing library libnvidia-opticalflow.so
W0515 14:37:36.824662 2 nvc_info.c:400] missing library libnvidia-fbc.so
W0515 14:37:36.824665 2 nvc_info.c:400] missing library libnvidia-ifr.so
W0515 14:37:36.824669 2 nvc_info.c:400] missing library libnvoptix.so
W0515 14:37:36.824672 2 nvc_info.c:400] missing library libnvidia-cbl.so
W0515 14:37:36.824676 2 nvc_info.c:404] missing compat32 library libnvidia-ml.so
W0515 14:37:36.824680 2 nvc_info.c:404] missing compat32 library libnvidia-cfg.so
W0515 14:37:36.824684 2 nvc_info.c:404] missing compat32 library libnvidia-nscq.so
W0515 14:37:36.824687 2 nvc_info.c:404] missing compat32 library libcuda.so
W0515 14:37:36.824691 2 nvc_info.c:404] missing compat32 library libcudadebugger.so
W0515 14:37:36.824694 2 nvc_info.c:404] missing compat32 library libnvidia-opencl.so
W0515 14:37:36.824698 2 nvc_info.c:404] missing compat32 library libnvidia-ptxjitcompiler.so
W0515 14:37:36.824702 2 nvc_info.c:404] missing compat32 library libnvidia-fatbinaryloader.so
W0515 14:37:36.824705 2 nvc_info.c:404] missing compat32 library libnvidia-allocator.so
W0515 14:37:36.824709 2 nvc_info.c:404] missing compat32 library libnvidia-compiler.so
W0515 14:37:36.824712 2 nvc_info.c:404] missing compat32 library libnvidia-pkcs11.so
W0515 14:37:36.824716 2 nvc_info.c:404] missing compat32 library libnvidia-nvvm.so
W0515 14:37:36.824720 2 nvc_info.c:404] missing compat32 library libnvidia-ngx.so
W0515 14:37:36.824723 2 nvc_info.c:404] missing compat32 library libvdpau_nvidia.so
W0515 14:37:36.824727 2 nvc_info.c:404] missing compat32 library libnvidia-encode.so
W0515 14:37:36.824730 2 nvc_info.c:404] missing compat32 library libnvidia-opticalflow.so
W0515 14:37:36.824734 2 nvc_info.c:404] missing compat32 library libnvcuvid.so
W0515 14:37:36.824738 2 nvc_info.c:404] missing compat32 library libnvidia-eglcore.so
W0515 14:37:36.824741 2 nvc_info.c:404] missing compat32 library libnvidia-glcore.so
W0515 14:37:36.824745 2 nvc_info.c:404] missing compat32 library libnvidia-tls.so
W0515 14:37:36.824748 2 nvc_info.c:404] missing compat32 library libnvidia-glsi.so
W0515 14:37:36.824752 2 nvc_info.c:404] missing compat32 library libnvidia-fbc.so
W0515 14:37:36.824756 2 nvc_info.c:404] missing compat32 library libnvidia-ifr.so
W0515 14:37:36.824759 2 nvc_info.c:404] missing compat32 library libnvidia-rtcore.so
W0515 14:37:36.824763 2 nvc_info.c:404] missing compat32 library libnvoptix.so
W0515 14:37:36.824767 2 nvc_info.c:404] missing compat32 library libGLX_nvidia.so
W0515 14:37:36.824770 2 nvc_info.c:404] missing compat32 library libEGL_nvidia.so
W0515 14:37:36.824774 2 nvc_info.c:404] missing compat32 library libGLESv2_nvidia.so
W0515 14:37:36.824777 2 nvc_info.c:404] missing compat32 library libGLESv1_CM_nvidia.so
W0515 14:37:36.824781 2 nvc_info.c:404] missing compat32 library libnvidia-glvkspirv.so
W0515 14:37:36.824785 2 nvc_info.c:404] missing compat32 library libnvidia-cbl.so
I0515 14:37:36.824968 2 nvc_info.c:300] selecting /usr/lib/nvidia/current/nvidia-smi
I0515 14:37:36.824996 2 nvc_info.c:300] selecting /usr/lib/nvidia/current/nvidia-debugdump
I0515 14:37:36.825009 2 nvc_info.c:300] selecting /usr/bin/nvidia-persistenced
I0515 14:37:36.825021 2 nvc_info.c:300] selecting /usr/bin/nv-fabricmanager
I0515 14:37:36.825033 2 nvc_info.c:300] selecting /usr/bin/nvidia-cuda-mps-control
I0515 14:37:36.825045 2 nvc_info.c:300] selecting /usr/bin/nvidia-cuda-mps-server
I0515 14:37:36.825090 2 nvc_info.c:486] listing firmware path /lib/firmware/nvidia/530.30.02/gsp_ga10x.bin
I0515 14:37:36.825098 2 nvc_info.c:486] listing firmware path /lib/firmware/nvidia/530.30.02/gsp_tu10x.bin
I0515 14:37:36.825116 2 nvc_info.c:559] listing device /dev/nvidiactl
I0515 14:37:36.825120 2 nvc_info.c:559] listing device /dev/nvidia-uvm
I0515 14:37:36.825123 2 nvc_info.c:559] listing device /dev/nvidia-uvm-tools
I0515 14:37:36.825127 2 nvc_info.c:559] listing device /dev/nvidia-modeset
I0515 14:37:36.825148 2 nvc_info.c:344] listing ipc path /run/nvidia-persistenced/socket
W0515 14:37:36.825166 2 nvc_info.c:350] missing ipc path /var/run/nvidia-fabricmanager/socket
W0515 14:37:36.825177 2 nvc_info.c:350] missing ipc path /tmp/nvidia-mps
I0515 14:37:36.825181 2 nvc_info.c:852] requesting device information with ''
I0515 14:37:36.842628 2 nvc_info.c:743] listing device /dev/nvidia0 (GPU-595a802e-9268-e1f8-cad5-9a69202e4cd5 at 00000000:ca:00.0)
I0515 14:37:36.842699 2 nvc_mount.c:366] mounting tmpfs at /usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/driver/nvidia
E0515 14:37:36.842858 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin alreay exists with the required mode; skipping create
E0515 14:37:36.843111 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-smi alreay exists with the required mode; skipping create
I0515 14:37:36.843117 2 nvc_mount.c:134] mounting /usr/lib/nvidia/current/nvidia-smi at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-smi
E0515 14:37:36.843185 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-debugdump alreay exists with the required mode; skipping create
I0515 14:37:36.843189 2 nvc_mount.c:134] mounting /usr/lib/nvidia/current/nvidia-debugdump at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-debugdump
E0515 14:37:36.843251 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-persistenced alreay exists with the required mode; skipping create
I0515 14:37:36.843255 2 nvc_mount.c:134] mounting /usr/bin/nvidia-persistenced at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-persistenced
I0515 14:37:36.844362 2 nvc_mount.c:134] mounting /usr/bin/nv-fabricmanager at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nv-fabricmanager
E0515 14:37:36.844434 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-cuda-mps-control alreay exists with the required mode; skipping create
I0515 14:37:36.844438 2 nvc_mount.c:134] mounting /usr/bin/nvidia-cuda-mps-control at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-cuda-mps-control
E0515 14:37:36.844455 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-cuda-mps-server alreay exists with the required mode; skipping create
I0515 14:37:36.844459 2 nvc_mount.c:134] mounting /usr/bin/nvidia-cuda-mps-server at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/bin/nvidia-cuda-mps-server
E0515 14:37:36.844767 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu alreay exists with the required mode; skipping create
E0515 14:37:36.844920 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.844925 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.530.30.02
E0515 14:37:36.844969 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.844973 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-cfg.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.530.30.02
E0515 14:37:36.844988 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libcuda.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.844992 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libcuda.so.530.30.02
E0515 14:37:36.845120 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.845124 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-opencl.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.530.30.02
E0515 14:37:36.845166 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.845170 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ptxjitcompiler.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.530.30.02
E0515 14:37:36.845186 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.845190 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.530.30.02
E0515 14:37:36.845205 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.530.30.02 alreay exists with the required mode; skipping create
I0515 14:37:36.845209 2 nvc_mount.c:134] mounting /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.530.30.02 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.530.30.02
I0515 14:37:36.845223 2 nvc_mount.c:527] creating symlink /usr/lib/x86_64-linux-gnu/lxc/rootfs/usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
E0515 14:37:36.845596 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/lib/firmware/nvidia/530.30.02/gsp_ga10x.bin alreay exists with the required mode; skipping create
I0515 14:37:36.845601 2 nvc_mount.c:85] mounting /usr/lib/firmware/nvidia/530.30.02/gsp_ga10x.bin at /usr/lib/x86_64-linux-gnu/lxc/rootfs/lib/firmware/nvidia/530.30.02/gsp_ga10x.bin with flags 0x7
E0515 14:37:36.845629 2 utils.c:529] The path /usr/lib/x86_64-linux-gnu/lxc/rootfs/lib/firmware/nvidia/530.30.02/gsp_tu10x.bin alreay exists with the required mode; skipping create
I0515 14:37:36.845633 2 nvc_mount.c:85] mounting /usr/lib/firmware/nvidia/530.30.02/gsp_tu10x.bin at /usr/lib/x86_64-linux-gnu/lxc/rootfs/lib/firmware/nvidia/530.30.02/gsp_tu10x.bin with flags 0x7
I0515 14:37:36.846135 2 nvc_mount.c:261] mounting /run/nvidia-persistenced/socket at /usr/lib/x86_64-linux-gnu/lxc/rootfs/run/nvidia-persistenced/socket
I0515 14:37:36.846169 2 nvc_mount.c:230] mounting /dev/nvidiactl at /usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidiactl
I0515 14:37:36.846201 2 nvc_mount.c:230] mounting /dev/nvidia-uvm at /usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidia-uvm
I0515 14:37:36.846225 2 nvc_mount.c:230] mounting /dev/nvidia-uvm-tools at /usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidia-uvm-tools
I0515 14:37:36.846262 2 nvc_mount.c:230] mounting /dev/nvidia0 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidia0
I0515 14:37:36.846311 2 nvc_mount.c:440] mounting /proc/driver/nvidia/gpus/0000:ca:00.0 at /usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/driver/nvidia/gpus/0000:ca:00.0
I0515 14:37:36.846329 2 nvc_ldcache.c:380] executing /usr/sbin/ldconfig from host at /usr/lib/x86_64-linux-gnu/lxc/rootfs
W0515 14:37:36.854486 2 utils.c:121] /usr/sbin/ldconfig: File /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1 is empty, not checked.
W0515 14:37:36.867756 2 utils.c:121] /usr/sbin/ldconfig: File /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.530.30.02 is empty, not checked.
I0515 14:37:36.888002 2 nvc.c:434] shutting down library context
I0515 14:37:36.888084 19 rpc.c:95] terminating nvcgo rpc service
I0515 14:37:36.888656 2 rpc.c:135] nvcgo rpc service terminated successfully
I0515 14:37:36.891619 18 rpc.c:95] terminating driver rpc service
I0515 14:37:36.891780 2 rpc.c:135] driver rpc service terminated successfully

Proxmox LXC config of the container that starts successfully:

arch: amd64
cores: 1
debug: 1
features: nesting=1
hostname: gpu-test
memory: 1024
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=BE:E3:B2:A5:1C:78,ip=dhcp,type=veth
ostype: ubuntu
rootfs: sme.disks:103/vm-103-disk-0.raw,size=8G
swap: 512
unprivileged: 1
lxc.environment: NVIDIA_DRIVER_CAPABILITIES=compute,utility
lxc.environment: NVIDIA_VISIBLE_DEVICES=all

Container start logs of the container with a specific MIG device:

run_buffer: 322 Script exited with status 1
lxc_setup: 4437 Failed to run mount hooks
do_start: 1272 Failed to setup container "103"
sync_wait: 34 An error occurred in another process (expected sequence number 4)
__lxc_start: 2107 Failed to spawn container "103"
tart 103 20230515151041.988 INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "103", config section "lxc"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:unpriv_systemd_create_scope:1227 - Running privileged, not using a systemd unit
DEBUG    seccomp - ../src/lxc/seccomp.c:parse_config_v2:656 - Host native arch is [3221225534]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "reject_force_umount  # comment this to allow umount -f;  not recommended"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "[all]"
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "kexec_load errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[246:kexec_load] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "open_by_handle_at errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[304:open_by_handle_at] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "init_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[175:init_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "finit_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[313:finit_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "delete_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[176:delete_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "ioctl errno 1 [1,0x9400,SCMP_CMP_MASKED_EQ,0xff00]"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[16:ioctl] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[16:ioctl] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:547 - arg_cmp[0]: SCMP_CMP(1, 7, 65280, 37888)
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[16:ioctl] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "keyctl errno 38"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[250:keyctl] action[327718:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[250:keyctl] action[327718:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[250:keyctl] action[327718:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:1017 - Merging compat seccomp contexts into main context
INFO     start - ../src/lxc/start.c:lxc_init:881 - Container "103" is initialized
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_create:1391 - The monitor process uses "lxc.monitor/103" as cgroup
DEBUG    storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
DEBUG    storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_create:1499 - The container process uses "lxc/103/ns" as inner and "lxc/103" as limit cgroup
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWUSER
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWNS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWPID
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWUTS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWIPC
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWCGROUP
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved user namespace via fd 17 and stashed path as user:/proc/45111/fd/17
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved mnt namespace via fd 18 and stashed path as mnt:/proc/45111/fd/18
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved pid namespace via fd 19 and stashed path as pid:/proc/45111/fd/19
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved uts namespace via fd 20 and stashed path as uts:/proc/45111/fd/20
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved ipc namespace via fd 21 and stashed path as ipc:/proc/45111/fd/21
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved cgroup namespace via fd 22 and stashed path as cgroup:/proc/45111/fd/22
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newuidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newgidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:lxc_map_ids:3634 - Functional newuidmap and newgidmap binary found
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits:3251 - Limits for the unified cgroup hierarchy have been setup
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newuidmap" does have the setuid bit set
DEBUG    conf - ../src/lxc/conf.c:idmaptool_on_path_and_privileged:3549 - The binary "/usr/bin/newgidmap" does have the setuid bit set
INFO     conf - ../src/lxc/conf.c:lxc_map_ids:3632 - Caller maps host root. Writing mapping directly
NOTICE   utils - ../src/lxc/utils.c:lxc_drop_groups:1367 - Dropped supplimentary groups
INFO     start - ../src/lxc/start.c:do_start:1104 - Unshared CLONE_NEWNET
NOTICE   utils - ../src/lxc/utils.c:lxc_drop_groups:1367 - Dropped supplimentary groups
NOTICE   utils - ../src/lxc/utils.c:lxc_switch_uid_gid:1343 - Switched to gid 0
NOTICE   utils - ../src/lxc/utils.c:lxc_switch_uid_gid:1352 - Switched to uid 0
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved net namespace via fd 5 and stashed path as net:/proc/45111/fd/5
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/lxcnetaddbr" for container "103", config section "net"
DEBUG    network - ../src/lxc/network.c:netdev_configure_server_veth:852 - Instantiated veth tunnel "veth103i0 <--> veththxrKl"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_rootfs:1437 - Mounted rootfs "/var/lib/lxc/103/rootfs" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
INFO     conf - ../src/lxc/conf.c:setup_utsname:876 - Set hostname to "gpu-test"
DEBUG    network - ../src/lxc/network.c:setup_hw_addr:3821 - Mac address "BE:E3:B2:A5:1C:78" on "eth0" has been setup
DEBUG    network - ../src/lxc/network.c:lxc_network_setup_in_child_namespaces_common:3962 - Network device "eth0" has been setup
INFO     network - ../src/lxc/network.c:lxc_setup_network_in_child_namespaces:4019 - Finished setting up network devices with caller assigned names
INFO     conf - ../src/lxc/conf.c:mount_autodev:1220 - Preparing "/dev"
INFO     conf - ../src/lxc/conf.c:mount_autodev:1281 - Prepared "/dev"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:736 - Invalid argument - Tried to ensure procfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:759 - Invalid argument - Tried to ensure sysfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/kernel/debug" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/kernel/security" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/fs/pstore" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "mqueue" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/mqueue" with filesystem type "mqueue"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/firmware/efi/efivars" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/firmware/efi/efivars" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/firmware/efi/efivars" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/firmware/efi/efivars" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/firmware/efi/efivars" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/proc/sys/fs/binfmt_misc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/sys/fs/binfmt_misc" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/proc/sys/fs/binfmt_misc" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/proc/sys/fs/binfmt_misc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/sys/fs/binfmt_misc" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "proc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/proc" with filesystem type "proc"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "sys" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/sys" with filesystem type "sysfs"
DEBUG    cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgroupfs_mount:1909 - Mounted cgroup filesystem cgroup2 onto 19((null))
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/nvidia" for container "103", config section "lxc"
DEBUG    conf - ../src/lxc/conf.c:run_buffer:311 - Script exec /usr/share/lxc/hooks/nvidia 103 lxc mount produced output: INFO: Writing nvidia-container-cli log at /var/lib/lxc/103/rootfs/nvidia.log.

DEBUG    conf - ../src/lxc/conf.c:run_buffer:311 - Script exec /usr/share/lxc/hooks/nvidia 103 lxc mount produced output: + exec nvidia-container-cli --debug=/var/lib/lxc/103/rootfs/nvidia.log --user configure --no-cgroups --ldconfig=@/usr/sbin/ldconfig --device=MIG-5a62f918-fd78-5a32-9614-008a1471bf61 --compute --utility /usr/lib/x86_64-linux-gnu/lxc/rootfs

DEBUG    conf - ../src/lxc/conf.c:run_buffer:311 - Script exec /usr/share/lxc/hooks/nvidia 103 lxc mount produced output: nvidia-container-cli: device error: MIG-5a62f918-fd78-5a32-9614-008a1471bf61: unknown device

ERROR    conf - ../src/lxc/conf.c:run_buffer:322 - Script exited with status 1
ERROR    conf - ../src/lxc/conf.c:lxc_setup:4437 - Failed to run mount hooks
ERROR    start - ../src/lxc/start.c:do_start:1272 - Failed to setup container "103"
ERROR    sync - ../src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
DEBUG    network - ../src/lxc/network.c:lxc_delete_network:4173 - Deleted network devices
ERROR    start - ../src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "103"
WARN     start - ../src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 16 for process 45131
TASK ERROR: startup for container '103' failed

For some reason, nvidia.log is the same as above.

Config with the specific MIG device:

arch: amd64
cores: 1
debug: 1
features: nesting=1
hostname: gpu-test
memory: 1024
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=BE:E3:B2:A5:1C:78,ip=dhcp,type=veth
ostype: ubuntu
rootfs: sme.disks:103/vm-103-disk-0.raw,size=8G
swap: 512
unprivileged: 1
lxc.environment: NVIDIA_VISIBLE_DEVICES=MIG-5a62f918-fd78-5a32-9614-008a1471bf61
lxc.environment: NVIDIA_DRIVER_CAPABILITIES=compute,utility

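A quick sanity check before blaming the container side is to confirm that the UUID passed in NVIDIA_VISIBLE_DEVICES is one the driver actually reports on the host. The sketch below filters the output of `nvidia-smi -L` for MIG UUIDs. The helper names `list_mig_uuids` and `has_mig_uuid` are made up for illustration, and the pattern assumes UUIDs in the `MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` form used in the config above.

```shell
#!/bin/sh
# Print every MIG UUID found on stdin, one per line.
# Assumption: UUIDs have the form MIG-8-4-4-4-12 hex digits, as in the config above.
list_mig_uuids() {
    grep -oE 'MIG-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
}

# Succeed (exit status 0) if the UUID given as $1 appears in the listing on stdin.
has_mig_uuid() {
    list_mig_uuids | grep -qxF "$1"
}
```

On the host this could be used as `nvidia-smi -L | has_mig_uuid MIG-5a62f918-fd78-5a32-9614-008a1471bf61 && echo "driver reports this MIG device"`. If the driver lists the UUID but the container hook still answers "unknown device", the problem is more likely in how the device nodes were set up than in the LXC config.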
I found my issue; see the GitHub issue. Following another guide somewhere on the internet, I had added the following udev rule:

KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia* && /usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"

After removing that rule and restarting, it worked!

So the takeaway is: do not fiddle with the NVIDIA device nodes, and don't run nvidia-modprobe yourself.
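For anyone who wants to check whether their host still carries a rule like the one above, a minimal sketch along these lines may help. It only searches udev rule files for calls to `nvidia-modprobe` or a `chmod` on `/dev/nvidia*`. The function name `find_nvidia_udev_rules` and the default directory list are assumptions for illustration, not part of any NVIDIA or Proxmox tooling.

```shell
#!/bin/sh
# Print the udev rule files that call nvidia-modprobe or chmod /dev/nvidia*.
# Arguments: directories to search. With no arguments, a list of common
# udev rule directories is used (an assumption; adjust for your distribution).
find_nvidia_udev_rules() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/udev/rules.d /run/udev/rules.d /usr/lib/udev/rules.d
    fi
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -l prints only the names of matching files; errors from an
        # unmatched glob are discarded.
        grep -lE 'nvidia-modprobe|chmod[^;&|]*/dev/nvidia' "$dir"/*.rules 2>/dev/null || true
    done
    return 0
}
```

Calling `find_nvidia_udev_rules` with no arguments prints any matching rule files. Whether a listed rule should really be removed is a judgment call, so read it before deleting it.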