I ran into an issue with permissions while adding a character device to a running container.
The device is configured in a profile. When I add the profile to a running container, accessing the device only works partially. Depending on how I invoke a command that uses the device file, either the command can access the device, or opening the device file results in error EPERM.
For example, lxc exec <container> -- sudo -u <user> <command> works, while ssh <user>@<container-ip> -- <command> fails with EPERM.
If I add the profile before starting the container (or restart the container after adding it), accessing the device always works as expected.
Am I expected to add the device before starting the container, or did I miss something in the configuration? Any hints are highly appreciated.
Can you give a more complete reproducer as well as tell us what version of LXD you’re using?
Adding it or changing it in the profile should work fine, but we may have an issue with updating cgroup configuration causing the EPERM, more details would be appreciated.
The device in question is /dev/EtherCAT0. It is created by this kernel module. So reproducibility is somewhat limited.
I tried to reproduce it with a cloned /dev/zero (mknod /dev/foo c 1 5) with the same access rights as the device in question, but that worked as expected with no issue.
I think the problem is that the devices cgroup only propagates “deny” changes down the hierarchy, not “allow” changes (and even the deny propagation only ensures that a cgroup lower in the hierarchy never has more permissions than its parent). Since the container is already running, {system,user}.slice already exist, and both lack the permission for the device.
I assume that lxc exec runs inside the container's root cgroup, while ssh runs inside one of the cgroups that do not have this permission. In the restart case, the container's root cgroup has the device permission set from the start, and all cgroups created later inherit the entries.
I just checked devices.list in /sys/fs/cgroup/devices/ and /sys/fs/cgroup/devices/{system,user}.slice/. And indeed, the device is listed in /sys/fs/cgroup/devices/devices.list but not in {system,user}.slice/devices.list. After restarting the container, it is listed in all of them.
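In case it helps anyone repeat that check, here is a rough Python sketch of it. It assumes the cgroup v1 devices.list format of one "type major:minor access" entry per line (for example "c 1:5 rwm" or "a *:* rwm"). The helper names device_key, is_allowed and report are made up for illustration, and the major number 240 in the usage note is a placeholder, not the real number of /dev/EtherCAT0.

```python
import os
import stat


def device_key(path):
    """Return (type, major, minor) for a device node, e.g. a char device gives ('c', major, minor)."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        dev_type = "c"
    elif stat.S_ISBLK(st.st_mode):
        dev_type = "b"
    else:
        raise ValueError(f"{path} is not a device node")
    return dev_type, os.major(st.st_rdev), os.minor(st.st_rdev)


def is_allowed(devices_list_text, dev_type, major, minor, access="rw"):
    """Check whether any entry in a devices.list dump covers the given device and access flags."""
    for line in devices_list_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        entry_type, numbers, perms = parts
        entry_major, _, entry_minor = numbers.partition(":")
        if entry_type not in ("a", dev_type):
            continue  # "a" matches every device type
        if entry_major not in ("*", str(major)):
            continue
        if entry_minor not in ("*", str(minor)):
            continue
        if all(flag in perms for flag in access):
            return True
    return False


def report(dev_path, list_paths):
    """Print, for each devices.list file, whether the device node is allowed there."""
    key = device_key(dev_path)
    for list_path in list_paths:
        with open(list_path) as fh:
            print(list_path, is_allowed(fh.read(), *key))
```

Inside the container, something like report("/dev/EtherCAT0", ["/sys/fs/cgroup/devices/devices.list", "/sys/fs/cgroup/devices/system.slice/devices.list", "/sys/fs/cgroup/devices/user.slice/devices.list"]) should then print True for the first path and False for the slices when the device was hot-added, assuming the cgroup v1 layout mentioned above.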
Would that confirm @eraserix's explanation?
Yeah, it would… this lack of propagation is a bit annoying…
It’s always kinda hard to know what the user actually expects in those scenarios; what you want here (propagation to all child cgroups) may well be considered a security issue by others…
@brauner anything we can do to make this suck less?
This is the cgroup inside the container? Not really, I’d say, from LXC’s perspective. If it doesn’t propagate to already created device cgroups, then that’s likely a kernel thing. But deny settings do propagate? Hm, that sounds intentional, so that you don’t accidentally propagate additional device permissions to unprivileged users by default.
device cgroups maintain hierarchy by making sure a cgroup never has more
access permissions than its parent. Every time an entry is written to
a cgroup’s devices.deny file, all its children will have that entry removed
from their whitelist and all the locally set whitelist entries will be
re-evaluated.
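To make the asymmetry concrete, here is a toy Python model of the behaviour discussed in this thread. It is a simplification and not the kernel's implementation: a whitelist is a plain set of strings, a new cgroup copies its parent's whitelist, an allow is not pushed to existing children, and a deny is. The class name DeviceCgroup, the cgroup name late.slice and the entry "c 240:0 rwm" are hypothetical.

```python
class DeviceCgroup:
    """Toy model of a cgroup v1 devices whitelist (assumed semantics, heavily simplified)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        # A newly created cgroup starts with a copy of its parent's whitelist.
        self.allowed = set(parent.allowed) if parent else set()
        if parent:
            parent.children.append(self)

    def allow(self, entry):
        # A cgroup may never hold more permissions than its parent.
        if self.parent is not None and entry not in self.parent.allowed:
            raise PermissionError(f"{self.parent.name} does not allow {entry}")
        # Note: the entry is NOT pushed down to children that already exist.
        self.allowed.add(entry)

    def deny(self, entry):
        # A deny is removed here and from every descendant.
        self.allowed.discard(entry)
        for child in self.children:
            child.deny(entry)


def hotplug_scenario(entry="c 240:0 rwm"):
    """Add a device to a 'running' container whose system.slice already exists."""
    root = DeviceCgroup("container-root")
    system = DeviceCgroup("system.slice", root)  # created before the hot-add
    root.allow(entry)                            # device added at runtime
    late = DeviceCgroup("late.slice", root)      # created after the hot-add
    return root, system, late
```

In this model, after hotplug_scenario() the entry is present in the root and in late.slice but missing from system.slice, which matches the devices.list observation above. Calling root.deny(entry) afterwards removes it from late.slice as well.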