Hi, I am upgrading from lxc 3.0.2 to the master branch (preparing for lxc 4.0). Please advise how to set system-wide cgroup limits using lxc.cgroup.pattern.
System Details:
yocto morty, linux v4.9, cgroup v1, lxc 3.2.1+master-bf04c8508dce0c68ea9a98ee6ac1fa01dc2f56a2
/etc/lxc/lxc.conf based on lxc 3.0.2
// global system-wide cgroup limits are applied on mylxc
lxc.cgroup.pattern = mylxc/%n
// l3mdev cgroup does not allow nesting, so it is excluded.
lxc.cgroup.use = blkio,cpu,cpuacct,debug,devices,freezer,net_cls,memory,name=systemd
Issue:
lxc-execute ub -- bash
On the host I see that the root lxc.cgroup.pattern is not being honored:
/sys/fs/cgroup/memory/lxc.payload.ub/
I expected it to be /sys/fs/cgroup/memory/mylxc/lxc.payload.ub
Cgroups are also not visible inside the container:
root@ub:/# ls -l /sys/fs/cgroup/
total 0
default.conf is per container. Among the lxc man pages, the lxc.system.conf page explains lxc.cgroup.pattern. We want to enforce a system-wide limit across all containers, e.g. all containers combined must not consume more than 4 GB of memory, so we need a top-level cgroup to enforce this limit, i.e. /sys/fs/cgroup/memory/containers/lxc.payload.ub.
The global limit for all containers would then be set in containers/memory.limit_in_bytes. For this we need lxc.cgroup.pattern = containers/lxc/%n.
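To make the intent concrete, here is a minimal sketch of how the parent limit could be applied on a cgroup v1 host. The helper name set_parent_memory_limit is hypothetical, the parent name "containers" is taken from the pattern above, and it assumes the memory controller is mounted at /sys/fs/cgroup/memory and that hierarchical accounting (memory.use_hierarchy) is in effect so the parent limit covers its children.

```shell
# Hypothetical helper (not part of LXC): cap the combined memory of
# everything below one parent cgroup on a cgroup v1 memory hierarchy.
set_parent_memory_limit() {
    cgroup_root=$1   # e.g. /sys/fs/cgroup/memory
    parent=$2        # e.g. containers (first component of lxc.cgroup.pattern)
    limit_bytes=$3   # e.g. 4294967296 for 4 GiB

    # On a real cgroup filesystem, mkdir creates the cgroup itself.
    mkdir -p "$cgroup_root/$parent" || return 1

    # Writing the byte count sets the limit for this cgroup.
    printf '%s\n' "$limit_bytes" > "$cgroup_root/$parent/memory.limit_in_bytes"
}
```

It would be run as root before any container starts, for example: set_parent_memory_limit /sys/fs/cgroup/memory containers 4294967296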
Did you configure lxc.mount.auto to set up a cgroup mount for you?
In most cases we prefer the containers to handle that through their init system, as that’s what a normal Linux system does, but there should still be options to force liblxc to set up the mounts for you, I think.
As @stgraber said, we usually leave that for the init system. If your init system doesn’t mount them automatically but you want them to be mounted anyway you need to set:
Not yet, but once we release 4.0 we might add a section explaining the logic behind that. In short, this was made necessary by the new cgroup filesystem version, aka cgroup2. It has a completely different delegation/ownership model than the legacy cgroup filesystem. One of the most daunting - yet understandable - restrictions is that only leaf nodes of a cgroup tree can contain live processes. That means you can’t have a hierarchy where the [lxc monitor] process sits above the container’s cgroup and supervises it, because that would violate the no-live-processes-in-non-leaf-nodes restriction. It means the [lxc monitor] and the container need to be on the same level in the cgroup tree. This way they can both have live processes in them or even have subtrees. So lxc.monitor.<container-name> is the monitor’s cgroup and lxc.payload.<container-name> is the container’s cgroup.
In any case, while it is tempting and understandable to make assumptions about the on-disk cgroup layout of a container, I would strongly recommend against this. We can’t make any promises as to what the on-disk layout needs to look like. That is dictated more by the cgroup filesystem itself than by LXC. We really have no choice but to do it this way. The good news is that this flattens the cgroup tree quite a bit, which is great because scheduling, moving processes around, and creating new subcgroups are all way more expensive the further you go down a cgroup hierarchy, since a global semaphore needs to be taken whenever something like this happens.
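One way to avoid hard-coding the layout is to ask the kernel where a process actually lives by reading its /proc/<pid>/cgroup file. The following is only a sketch under that assumption. The function name cgroup_of is made up for illustration, and it expects the usual "id:controllers:path" line format of that file.

```shell
# Hypothetical helper: print the cgroup path recorded for one controller
# in a /proc/<pid>/cgroup style file ("id:controller[,controller]:path").
cgroup_of() {
    controller=$1    # e.g. memory
    cgroup_file=$2   # e.g. /proc/1234/cgroup

    awk -F: -v want="$controller" '
        {
            # Field 2 may list several co-mounted controllers, e.g. cpu,cpuacct.
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == want) {
                    # The path is everything after the second colon.
                    path = $3
                    for (j = 4; j <= NF; j++) path = path ":" $j
                    print path
                    exit
                }
            }
        }' "$cgroup_file"
}
```

On the host this could be combined with the container's init PID, for example: cgroup_of memory "/proc/$(lxc-info -n ub -p -H)/cgroup"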
The lxc.cgroup.pattern patch works as expected, and lxc.mount.auto = cgroup:rw:force mounts cgroups, but lxc.mount.auto = cgroup:mixed, which is inherited by default through /usr/share/lxc/config/common.conf, does not mount cgroups. The issue is the same even if I set it directly in the container’s config. Is this expected? AFAIK lxc.mount.auto = cgroup:mixed does mount cgroups in lxc 3.0.2.
When cgroup:mixed is set and LXC detects that cgroup namespaces are supported (which they are on a 4.9 kernel), it will set up cgroups for the container as requested by the user, but it will not mount them and leaves this up to the init system of the container. So if your init system doesn’t mount cgroups by default, then you won’t have them mounted. Setting the force option is a way to tell LXC to always mount them.
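To see which case applies, a quick check from inside the container can help. This is a rough sketch that only inspects a mounts table such as /proc/self/mounts for cgroup or cgroup2 filesystems. The function name has_cgroup_mount is hypothetical.

```shell
# Hypothetical helper: succeed if the given mounts table (same layout as
# /proc/self/mounts) lists at least one cgroup or cgroup2 filesystem.
has_cgroup_mount() {
    mounts_file=$1   # e.g. /proc/self/mounts

    # Column 3 of each line is the filesystem type.
    awk '$3 == "cgroup" || $3 == "cgroup2" { found = 1 }
         END { exit !found }' "$mounts_file"
}
```

Inside the container it could be used as: has_cgroup_mount /proc/self/mounts && echo "cgroups mounted" || echo "no cgroup mounts"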