I have a Docker image for a GPU application, and I want to run it with GPU support inside an LXC container, but I have not been able to get it working.
These are the commands I run:
lxc launch ubuntu plex -c nvidia.runtime=true -c nvidia.driver.capabilities=compute,utility
lxc config set plex security.nesting true
lxc exec plex sudo apt-get update
lxc exec plex sudo apt-get install docker.io
lxc exec plex docker run ubuntu ls
lxc config device add plex gpu gpu id=0
lxc exec plex docker run --gpus all ubuntu ls
The last command, which runs a Docker container with the --gpus flag, fails with:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled
I am able to see GPUs when I run nvidia-smi inside the LXC container.
I then installed nvidia-container-toolkit inside the LXC container and ran the container with --gpus again. This time I get the following error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: write error: /sys/fs/cgroup/devices/docker/41155b577716bc9b26bf64e1f930f40c5a81dbdd4eeb8d831f42c0b8728b00a5/devices.allow: operation not permitted\\\\n\\\"\"": unknown.
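For context on the message above: the prestart hook fails when nvidia-container-cli tries to write to the devices cgroup, which an unprivileged LXC container is typically not allowed to modify. A workaround often suggested for this kind of error, untested in this setup and offered only as an assumption, is to enable the no-cgroups option in the NVIDIA container runtime config. The sketch below wraps that edit in a hypothetical helper, set_no_cgroups, so it can be tried on a copy of the file first; the helper name and the usual config path /etc/nvidia-container-runtime/config.toml are assumptions, not taken from the commands above.

```shell
# Sketch only: set_no_cgroups is a hypothetical helper, not part of any NVIDIA tool.
# It rewrites the "no-cgroups" line (commented out or not) to "no-cgroups = true"
# in the config file passed as the first argument. Requires GNU sed (-i).
set_no_cgroups() {
    conf="$1"
    sed -i 's/^#\{0,1\} *no-cgroups *=.*/no-cgroups = true/' "$conf"
}
```

Inside the LXC container this would be called as set_no_cgroups /etc/nvidia-container-runtime/config.toml (assuming that is where the toolkit placed its config), followed by systemctl restart docker before retrying docker run --gpus all ubuntu ls.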
I also tried setting security.privileged=true on the LXC container itself. After that change the container no longer restarts, and the restart attempt returns the following error:
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart plex /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/plex/lxc.conf:
I am out of ideas and do not know which permissions are needed for this to work. Any pointers would be appreciated.
Cheers!