Hi,
I am tinkering with running Docker containers inside LXD containers (based on Stephane’s excellent video on the LXD channel: https://youtu.be/_fCSSEyiGro).
The “typical” Docker containers work perfectly fine for me, but I wanted to play with more complex ones, such as the CUDA/PyTorch containers from the NVIDIA NGC registry (nvcr.io).
One issue is well known: security.privileged=true and nvidia.runtime=true still do not work well together. The workaround of installing the NVIDIA drivers inside the LXD container works fine, though.
What I am wondering is whether, instead of the heavy-handed security.privileged=true option, some more fine-grained security settings (syscalls, caps, …) would be enough to get those CUDA containers running. This would be similar to Stephane’s instructions for running “non-nvidia” Docker containers inside an unprivileged LXD container with security.syscalls.intercept.mknod=true and security.syscalls.intercept.setxattr=true.
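For reference, the unprivileged setup in question could be sketched roughly as below. This is only an illustration of the configuration being asked about, not a fix. The container name docker-gpu, the APPLY switch and the run helper are hypothetical, and security.nesting=true is an assumption (it is not mentioned above, but nested Docker normally needs it). By default the script only prints the lxc commands; set APPLY=1 to execute them.

```shell
#!/bin/sh
# Hypothetical container name; replace it with your own.
CONTAINER="${CONTAINER:-docker-gpu}"
# Hypothetical switch: print the commands by default, run them when APPLY=1.
APPLY="${APPLY:-0}"

# Either execute the given command or just print it.
run() {
    if [ "$APPLY" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

configure_unprivileged() {
    # Drop the privileged flag so the container is unprivileged again.
    run lxc config unset "$CONTAINER" security.privileged
    # Assumption: nesting is required for Docker inside LXD.
    run lxc config set "$CONTAINER" security.nesting true
    # The two syscall interception settings mentioned above.
    run lxc config set "$CONTAINER" security.syscalls.intercept.mknod true
    run lxc config set "$CONTAINER" security.syscalls.intercept.setxattr true
    # Restart so the new security settings take effect.
    run lxc restart "$CONTAINER"
}

configure_unprivileged
```

Running it with the defaults simply lists the five lxc commands, which makes it easy to review them before applying anything.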
The “CUDA” Docker containers work perfectly fine in my LXD container as long as it is configured as privileged. Once I unset the security.privileged option, though, running for example:
docker run --rm --gpus all --ipc=host nvidia/cuda:11.4.1-base-ubuntu20.04 nvidia-smi
gives:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: write error: /sys/fs/cgroup/devices/docker/098ad8bf1fdcf4ab72091864933fbc8b67a8f0b30746681ba6ef4082c23245b9/devices.allow: operation not permitted: unknown.
Is there a way to avoid this, or is security.privileged=true the only way to go?
Thanks,
Waldek