I have LXD installed on an Arch Linux host with a container running Ubuntu 22.04. The container is set up to run Docker/Portainer, and I want to use a GPU that is installed on the host.
$ docker run --gpus all nvidia/cuda:11.4.0-devel-ubuntu20.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown.
ERRO[0000] error waiting for container: context canceled
The NVIDIA packages installed in the container are:
# apt search nvidia|grep installed
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libnvidia-container-tools/bionic,now 1.10.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.10.0-1 amd64 [installed,automatic]
nvidia-container-toolkit/bionic,now 1.10.0-1 amd64 [installed]
Is it possible to nest GPU access from host > LXD > Docker?
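For context, a host GPU is typically exposed to an LXD container through a `gpu` device, and running Docker inside the container needs nesting enabled. A sketch of that setup (the container name `u22` is a placeholder, run on the host):

```shell
# Pass the host GPU into the LXD container (name "u22" is a placeholder)
lxc config device add u22 mygpu gpu

# Allow nested container runtimes (Docker) inside the LXD container
lxc config set u22 security.nesting=true
```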
Hmm, the error suggests a devices-cgroup issue. I wonder whether setting security.syscalls.intercept.bpf=true and security.syscalls.intercept.bpf.devices=true on the container might help here.
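Concretely, those keys would be set on the host like this (the container name `u22` is a placeholder; a restart is needed for the settings to take effect):

```shell
# Let the container manage its own device cgroup via intercepted bpf() syscalls,
# which is what nvidia-container-cli needs for BPF_CGROUP_DEVICE
lxc config set u22 security.syscalls.intercept.bpf=true
lxc config set u22 security.syscalls.intercept.bpf.devices=true

# Restart the container so the new settings apply
lxc restart u22
```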