Great. The GPU works fine after following the steps mentioned previously.
libnvidia-container hardcodes an expectation that nvidia-smi is in /usr/bin, which is not a valid assumption on NixOS. There have been some tweaks to our libnvidia-container package recently, but I'm not sure yet whether they'll fix the problem of libnvidia-container failing to find the binaries. I'll check back in when I can confirm either way, but the changes are unlikely to be backported to stable 24.11.
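For anyone who wants to see the mismatch on their own host, here is a minimal diagnostic sketch. It only compares the path discussed above (/usr/bin/nvidia-smi) with wherever nvidia-smi resolves on PATH. The helper name check_nvidia_smi is hypothetical and it does not call libnvidia-container.

```shell
#!/bin/sh
# check_nvidia_smi [EXPECTED_PATH]
# Hypothetical helper: reports whether nvidia-smi exists at the path the
# library is said to expect, and where it actually resolves on PATH.
check_nvidia_smi() {
    expected_path="${1:-/usr/bin/nvidia-smi}"

    if [ -x "$expected_path" ]; then
        echo "found at expected path: $expected_path"
    else
        echo "missing at expected path: $expected_path"
    fi

    # command -v prints the resolved location if nvidia-smi is on PATH.
    actual_path="$(command -v nvidia-smi 2>/dev/null || true)"
    if [ -n "$actual_path" ]; then
        echo "resolved on PATH: $actual_path"
    else
        echo "not on PATH"
    fi
}

check_nvidia_smi
```

If the first line reports "missing" while the second shows a resolved path, the host matches the situation described here.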
@stgraber one thing I noticed when debugging this is that the nvidia hook was failing to create /var/lib/incus/storage-pools/default/containers/noble-molly/hook. When I created it and made the permissions wide open, I noticed that the hook runs as the container's root UID and not the host's. This prevents libnvidia-container from writing its log file.
Have you seen this before?
I suspect that's normal. The hook was written for LXC, so it expects a path like /var/lib/lxc/NAME where it has write access.
Under Incus we've tightened permissions a fair bit more, so that's causing this issue.
Is that fatal, though, or does it just prevent logging?
It only prevents logging, from what I've seen. The logging was helpful for some of the troubleshooting I'm doing, but I can just mkdir/chown while doing that.
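A rough sketch of that mkdir/chown workaround, for troubleshooting only. The helper name prepare_hook_dir is hypothetical, and the owner has to be whatever host UID the container's root maps to on your system. The 1000000 in the example comment is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# prepare_hook_dir DIR OWNER
# Hypothetical helper: create DIR and hand it to OWNER (uid or uid:gid)
# so that a hook running as that user can write its log there.
prepare_hook_dir() {
    hook_dir="$1"
    owner="$2"

    mkdir -p "$hook_dir" || return 1
    chown "$owner" "$hook_dir" || return 1
    # Restrict access to the owner rather than opening permissions wide.
    chmod 0700 "$hook_dir"
}

# Example, run as root on the host (the UID/GID 1000000 is an assumed
# idmap base; check the container's actual mapping first):
# prepare_hook_dir /var/lib/incus/storage-pools/default/containers/noble-molly/hook 1000000:1000000
```

Restricting the directory to its owner avoids leaving world-writable permissions behind once debugging is finished.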
@stgraber @adamcstephens I am trying to set up a container on another host. If I specify nvidia.runtime: "true", the container doesn't start.
$incus start dockerblr
Error: Failed to run: /nix/store/2ypj6mwrs14wzwf18avqx0nm5n8r41vg-incus-6.11.0/bin/incusd forkstart dockerblr /var/lib/incus/containers /run/incus/dockerblr/lxc.conf: exit status 1
Try `incus info --show-log dockerblr` for more info
$incus info --show-log dockerblr
Error: Invalid PID 'ļæ½'
My Incus is set up as follows:
#incus
virtualisation.incus.package = pkgs.incus;
virtualisation.incus.enable = true;
systemd.services.incus.environment.INCUS_LXC_HOOK =
"${config.virtualisation.incus.lxcPackage}/share/lxc/hooks";
Once I remove nvidia.runtime, the container starts up fine.
Sorry, I don't have the bandwidth to look into this further right now. I don't use this feature, and it's difficult or impossible for us to write NixOS tests for it given the hardware requirement.
I'd invite you to file an issue on the nixpkgs repo to track the problem, preferably with any additional detail you can provide. Unfortunately, unless you're willing and able to do the deep investigation yourself, I suspect little progress will be made.