Cgroup lxcfs emulation

Hi,

I am having an issue with lxcfs on CentOS 7 that causes problems when shutting down containers.

I’ve tracked the issue to the lxcfs hook script that mounts cgroups into the container in order to emulate them.

The emulation of /proc doesn't seem to cause a problem, but the cgroups emulation does.

However, I don't really understand what the cgroups emulation feature actually does or offers. If I modify the hook script to quit before it sets up the emulated cgroups, everything seems to work fine.

See https://github.com/lxc/lxcfs/issues/252#issuecomment-419459548

On kernels that don’t support cgroup namespacing, which may be what you’re running here, the cgroup tree that’s emulated by lxcfs is necessary so that systemd will start in the container.
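A quick way to check whether a kernel supports cgroup namespacing is to look for the cgroup entry under /proc/self/ns, which kernels with that feature expose. The snippet below is only a minimal sketch: the function name has_cgroup_ns and its optional directory argument are made up for illustration and are not part of lxcfs or LXC.

```shell
#!/bin/sh
# Report whether the running kernel exposes cgroup namespaces.
# has_cgroup_ns is a hypothetical helper name; the optional argument only
# exists so the check can be pointed at another directory when testing.
has_cgroup_ns() {
    ns_dir="${1:-/proc/self/ns}"
    # The entry is a symlink on kernels that support cgroup namespaces.
    [ -e "$ns_dir/cgroup" ] || [ -L "$ns_dir/cgroup" ]
}

if has_cgroup_ns; then
    echo "cgroup namespaces available"
else
    echo "no cgroup namespaces on this kernel"
fi
```

Run on the host, the second message would suggest that the lxcfs cgroup tree is still in play for systemd containers.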

Thanks Stéphane.

The strange thing is that both CentOS 7 and Debian 9 containers (both use systemd) start fine when I disable the cgroup emulation on a CentOS 7 host.

However I can see a difference in the mounted cgroups.

This is what it looks like when cgroup emulation is enabled:

[root@el7build01 ~]# mount | grep cgroup
none on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
lxcfs on /sys/fs/cgroup/blkio type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/cpuacct,cpu type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/cpuset type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/devices type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/freezer type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/hugetlb type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/memory type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/systemd type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/net_prio,net_cls type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/perf_event type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/fs/cgroup/pids type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

And this is what it looks like without it:

[root@el7build01 ~]# mount | grep cgroup
none on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
none on /sys/fs/cgroup/systemd type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/systemd/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
none on /sys/fs/cgroup/pids type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/pids/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
none on /sys/fs/cgroup/cpu,cpuacct type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/cpu,cpuacct/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
none on /sys/fs/cgroup/perf_event type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/perf_event/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
none on /sys/fs/cgroup/net_cls,net_prio type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/net_cls,net_prio/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
none on /sys/fs/cgroup/blkio type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/blkio/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
none on /sys/fs/cgroup/devices type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/devices/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
none on /sys/fs/cgroup/cpuset type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/cpuset/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
none on /sys/fs/cgroup/memory type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/memory/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
none on /sys/fs/cgroup/hugetlb type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/hugetlb/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
none on /sys/fs/cgroup/freezer type tmpfs (ro,nosuid,nodev,noexec,relatime,size=10240k,mode=755,uid=1258512,gid=1258512)
cgroup on /sys/fs/cgroup/freezer/lxc/el7build01 type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)

Clearly there are differences there, but I can’t see any functional difference (except that it means the containers stop cleanly rather than intermittently hanging).
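To make the comparison between the two listings easier, the mounts under /sys/fs/cgroup can be counted by filesystem type. This is a rough sketch that assumes the usual "DEVICE on PATH type FSTYPE (OPTIONS)" layout printed by mount; the function name summarize_cgroup_mounts is hypothetical.

```shell
#!/bin/sh
# Count mounts below /sys/fs/cgroup, grouped by filesystem type.
# Reads `mount`-style lines on stdin: DEVICE on PATH type FSTYPE (OPTIONS)
# summarize_cgroup_mounts is an illustrative name, not an existing tool.
summarize_cgroup_mounts() {
    awk 'index($3, "/sys/fs/cgroup") == 1 { n[$5]++ }
         END { for (t in n) print t, n[t] }' | LC_ALL=C sort
}

# Only call mount when it is available in this environment.
if command -v mount >/dev/null 2>&1; then
    mount | summarize_cgroup_mounts
fi
```

Inside a container with the emulation enabled, this should list mostly fuse.lxcfs entries; with it disabled, the entries should be tmpfs and cgroup instead.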

CentOS 7 does seem to be quite happy to start systemd containers, even Debian ones, with lxcfs cgroup emulation disabled.

However, the lxcfs cgroup emulation feature is causing stability issues and is frequently crashing.

This seems to be avoided by exiting the lxcfs hook script after the proc mounts have been made, but before the cgroup emulation has been done.

Everything seems to work fine then, and I can’t see what benefit cgroup emulation gives on a CentOS 7 kernel. In fact, the stability issues mean that lxcfs is effectively unusable with cgroup emulation enabled.

I thought I would write a follow-up to this issue. I have not been able to solve the immediate issue of hangs when shutting down containers while running the official CentOS 7 kernel. However, moving to a later kernel with cgroup namespaces has solved the issue for me. I’ve written about my experiences here: https://www.tomp.uk/blog/view/40