Background: we are running SLURM (a cluster scheduler) and NVIDIA Docker containers inside an unprivileged LXD container.
Setup:
Host: Ubuntu Server 20.04
LXD: 4.0.7
LXD container: Ubuntu 20.04
Configuration:
security.nesting: true
security.privileged: false
nvidia.runtime: true
Symptoms:
- SLURM inside the container cannot use cgroups to limit a job's resource consumption. It reports errors like “cannot write /sys/fs/cgroup/freezer/xxxx”.
- We cannot pass GPUs through from the LXD container to NVIDIA Docker containers unless we configure nvidia-container-runtime inside the LXD container with “no-cgroups=true”. The default for nvidia-container-runtime is “no-cgroups=false”. With “no-cgroups=true”, GPU passthrough from LXD to Docker works, which suggests that cgroups are not working properly inside the LXD container.
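To narrow down the symptoms above, something like the following Python sketch could be run as root inside the LXD container to see which top-level cgroup directories look writable. It assumes the cgroup hierarchy is mounted at /sys/fs/cgroup with one directory per controller (freezer, memory, ...), and the function name cgroup_write_access is made up for illustration. os.access only reports what the permission bits allow, so a "writable" result does not guarantee that SLURM's actual write will succeed.

```python
import os


def cgroup_write_access(root="/sys/fs/cgroup"):
    """Map each top-level directory under root to True if it looks writable.

    Assumption: a cgroup v1 style layout where each controller has its own
    directory under root. Returns an empty dict if root does not exist.
    """
    if not os.path.isdir(root):
        return {}
    result = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            # Creating a child cgroup needs write and search permission
            # on the parent directory.
            result[entry] = os.access(path, os.W_OK | os.X_OK)
    return result


if __name__ == "__main__":
    for name, ok in cgroup_write_access().items():
        print(f"{name}: {'writable' if ok else 'NOT writable'}")
```

If the freezer directory shows up as not writable here, that would be consistent with the SLURM error quoted above.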
Questions:
- Is it possible for an unprivileged container to use cgroups inside?
- If the answer is yes, then how?
Trials to fix the problem:
We tried keeping the LXD container unprivileged but mapping IDs 0-65535 in the container to 0-65535 on the host. The cgroup problem remains. As a side note, this idmap hack does allow root inside the container to change file ownership.
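For anyone who wants to confirm that the ID mapping described above is really in effect, here is a small Python sketch that reads /proc/self/uid_map and /proc/self/gid_map inside the container. Each line of those files has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. The helper names parse_id_map and is_identity_mapped are hypothetical, and the script only reports what the kernel exposes. It says nothing about how the mapping was configured in LXD.

```python
import os


def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(f) for f in fields))
    return ranges


def is_identity_mapped(ranges, first=0, last=65535):
    """Return True if every ID in [first, last] maps to the same ID outside."""
    covered = first  # lowest ID not yet known to be identity mapped
    for inside, outside, length in sorted(ranges):
        if inside != outside:
            continue  # this range shifts IDs, so it is not an identity range
        if inside <= covered < inside + length:
            covered = inside + length
    return covered > last


if __name__ == "__main__":
    for name in ("uid_map", "gid_map"):
        path = os.path.join("/proc/self", name)
        if not os.path.exists(path):
            continue
        with open(path) as fh:
            ranges = parse_id_map(fh.read())
        print(name, ranges, "identity 0-65535:", is_identity_mapped(ranges))
```

A default unprivileged container would normally show a shifted range rather than an identity range, so this gives a quick way to tell the two setups apart.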