I’ve been trying to run a Docker container inside an LXC container with GPU support, so that inside the LXC container I can do this:
docker run --gpus all -d image
and get the GPUs allocated to my LXC container inside the Docker container too. But this doesn’t work, and I’m not sure why.
I ran this command:
lxc launch ubuntu:18.04 ml -c nvidia.runtime=true -c nvidia.driver.capabilities=compute,utility -c security.nesting=true -c security.privileged=true
This is the log I got:
lxc ml 20201123204717.700 WARN cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.ml"
lxc ml 20201123204717.704 WARN cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.ml"
lxc ml 20201123204717.867 ERROR conf - conf.c:run_buffer:324 - Script exited with status 1
lxc ml 20201123204717.867 ERROR conf - conf.c:lxc_setup:3374 - Failed to run mount hooks
lxc ml 20201123204717.867 ERROR start - start.c:do_start:1218 - Failed to setup container "ml"
lxc ml 20201123204717.867 ERROR sync - sync.c:__sync_wait:36 - An error occurred in another process (expected sequence number 5)
lxc ml 20201123204717.872 WARN network - network.c:lxc_delete_network_priv:3185 - Failed to rename interface with index 0 from "eth0" to its initial name "vethf9b76b2d"
lxc ml 20201123204717.872 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:860 - Received container state "ABORTING" instead of "RUNNING"
lxc ml 20201123204717.872 ERROR start - start.c:__lxc_start:1999 - Failed to spawn container "ml"
lxc ml 20201123204717.873 WARN start - start.c:lxc_abort:1013 - No such process - Failed to send SIGKILL via pidfd 31 for process 18797
lxc 20201123204718.498 WARN commands - commands.c:lxc_cmd_rsp_recv:126 - Connection reset by peer - Failed to receive response for command "get_state"
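The log mixes WARN and ERROR entries. The first ERROR entries are the mount hook failure ("Script exited with status 1", "Failed to run mount hooks"), and the ones after it look like consequences of that. Here is a small sketch for pulling out just the ERROR entries, assuming the log has been saved one entry per line to a file. The function name lxc_errors and the file name ml.log are made up for illustration.

```shell
# Print only the ERROR entries of an LXC log read on stdin.
# Entries look like:
#   lxc ml 20201123204717.867 ERROR conf - conf.c:run_buffer:324 - ...
# The container name ("ml") is absent on some entries, so it is optional here.
lxc_errors() {
    grep -E '^[[:space:]]*lxc( [^ ]+)? [0-9]+\.[0-9]+ ERROR '
}
```

Usage, with the hypothetical file from above: `lxc_errors < ml.log | head -n 1` shows the earliest error, which is usually the most useful one to look up.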
Running a container with security.nesting and security.privileged works on this machine. Similarly, running a container with the NVIDIA container runtime enabled works. But combining both on the same container doesn’t work.
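To narrow down which combination of options breaks the start, one approach is to create the container with lxc init, apply the options one at a time, and only then start it. This is a rough sketch, not a fix. The helper name start_with and the LXC/NAME variables are my own. Setting LXC=echo previews the commands without running anything.

```shell
# Sketch: create the container, apply the given options, then start it.
# LXC defaults to the real client; set LXC=echo for a dry run.
LXC="${LXC:-lxc}"
NAME="${NAME:-ml}"

start_with() {
    # Arguments: config options in key=value form, applied before starting.
    "$LXC" init ubuntu:18.04 "$NAME" || return 1
    for kv in "$@"; do
        # Split "key=value" at the first "=".
        "$LXC" config set "$NAME" "${kv%%=*}" "${kv#*=}" || return 1
    done
    # If the start fails, print the container log right away.
    "$LXC" start "$NAME" || "$LXC" info "$NAME" --show-log
}
```

For example, `start_with nvidia.runtime=true security.nesting=true` tries two of the options together. After `lxc delete ml`, the same call with `security.privileged=true` added shows whether that option is the one that triggers the failure.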
Looking forward to a solution on this…