Hi Team!
I have an 8-GPU server and want to assign 4 GPUs to each of two separate LXD containers, Container A and Container B.
I also have security.nesting and security.privileged set to true on both containers.
So far so good: each container shows only the GPUs I added to it.
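To make the setup concrete, a configuration like the one described would typically look roughly as follows in the output of lxc config show for Container A. This is only a sketch: the device names gpu0 to gpu3 and the ids 0 to 3 are placeholders chosen for illustration, not values copied from the actual server, and Container B would use the remaining four ids.

```yaml
# Sketch of Container A's relevant config (names and ids are assumed placeholders)
config:
  security.nesting: "true"      # allow running containers (Docker) inside the LXD container
  security.privileged: "true"   # run the container in privileged mode
devices:
  gpu0:
    type: gpu
    id: "0"                     # card id of the first GPU passed through
  gpu1:
    type: gpu
    id: "1"
  gpu2:
    type: gpu
    id: "2"
  gpu3:
    type: gpu
    id: "3"
```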
Now I want to run Kubernetes in Container A. I tried several online guides and settled on k3d as a simple setup for scheduling jobs across multiple GPUs in that container.
The issue I am facing is this:
To run k3d, which uses Docker containers, inside the LXD container, I had to add the following raw.lxc settings to the container's config:
raw.lxc: |-
  lxc.cgroup.devices.allow = a
  lxc.cap.drop =
After that I rebooted and the k3d cluster started working. But this created a new problem:
All 8 GPUs are now visible in Container A, while my use case requires only 4 of the 8.
How can I restrict Container A to the required 4 GPUs while still granting enough permissions for the k3d cluster to work properly?
Please suggest what I should do. I would appreciate any response. Thanks!