Cgroup2 CPU limit no longer working after LXC container upgraded to Debian 12

I noticed a problem after an LXC container was upgraded to Debian 12 (Bookworm): the CPU limit, set via lxc.cgroup2.cpuset.cpus in the container’s config file, no longer seems to work. All physical cores are now visible inside the LXC container.
Memory limits still work though.

Host: Debian 11 (Bullseye), LXC 4.0.6 (package from Debian repos)
Guest: Debian 12 (Bookworm)

Relevant part of the container’s config file:

# Limits
lxc.cgroup2.cpuset.cpus = 12-13
lxc.cgroup2.cpu.weight = 100
lxc.cgroup2.memory.max = 10G
lxc.cgroup2.memory.high = 10G

Any hints are appreciated, thanks.

It would be interesting to see whether the issue is the limits not being applied or the limits not being rendered properly.

You can check the current values through /sys/fs/cgroup inside the container.
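
For example (a sketch: paths assume the unified cgroup2 hierarchy, the prompt is illustrative, and the values are what your config above should produce):

root@container:~# cat /sys/fs/cgroup/cpuset.cpus.effective
12-13
root@container:~# cat /sys/fs/cgroup/memory.max
10737418240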

As this happened on a production machine, I had to quickly revert to a backup and keep the container on Bullseye. CPU limits work fine there.

However, in my test lab I can also reproduce the same behaviour on an older LXC host (Debian 10 Buster) with LXC 3.0.3 (installed from the Debian repos).

root@lab ~ # grep cgroup /var/lib/lxc/bookworm/config
/var/lib/lxc/bookworm/config:lxc.cgroup.cpuset.cpus = 6-7
/var/lib/lxc/bookworm/config:lxc.cgroup.cpu.shares = 1024
/var/lib/lxc/bookworm/config:lxc.cgroup.memory.limit_in_bytes = 4G
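
Note that since the Buster host still runs cgroup v1, this config uses the legacy lxc.cgroup.* keys. The rough cgroup2 equivalents (as in the first post) would be:

lxc.cgroup2.cpuset.cpus = 6-7
lxc.cgroup2.cpu.weight = 100
lxc.cgroup2.memory.max = 4G

(cpu.shares and cpu.weight use different scales: shares default to 1024 on a 2-262144 range, weight defaults to 100 on a 1-10000 range, so those two values don’t map 1:1.)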

After starting the Bookworm container, the memory limit (4G) is applied correctly:

root@bookworm:~# free
               total        used        free      shared  buff/cache   available
Mem:         4194304      237008     3720092        3036      237204     3957296
Swap:        7811068           0     7811068

root@bookworm:~# cat /sys/fs/cgroup/memory/memory.limit_in_bytes 
4294967296

But the CPU limit is “ignored”:

root@bookworm:~# grep cores /proc/cpuinfo 
cpu cores	: 8
cpu cores	: 8

However, the cgroup limit inside the container seems to be correct:

root@bookworm:~# cat /sys/fs/cgroup/cpuset/cpuset.cpus
6-7

Hmmmmm :thinking:

Differences in /sys/fs/cgroup/:

Debian 11:

root@bullseye:~# ll /sys/fs/cgroup/
total 0
-r--r--r--  1 root root 0 Nov 20 17:00 cgroup.controllers
-r--r--r--  1 root root 0 Nov 20 17:00 cgroup.events
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.freeze
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.max.depth
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.max.descendants
-rw-r--r--  1 root root 0 Nov 20 16:59 cgroup.procs
-r--r--r--  1 root root 0 Nov 20 17:00 cgroup.stat
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.subtree_control
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.threads
-rw-r--r--  1 root root 0 Nov 20 17:00 cgroup.type
-rw-r--r--  1 root root 0 Nov 20 17:00 cpu.pressure
-r--r--r--  1 root root 0 Nov 20 17:00 cpu.stat
drwxr-xr-x  2 root root 0 Oct  5 21:44 dev-mqueue.mount
drwxr-xr-x  2 root root 0 Nov 20 17:00 init.scope
-rw-r--r--  1 root root 0 Nov 20 17:00 io.pressure
-rw-r--r--  1 root root 0 Nov 20 17:00 memory.pressure
drwxr-xr-x 20 root root 0 Nov 20 16:39 system.slice
drwxr-xr-x  2 root root 0 Oct  5 21:44 user.slice

Bookworm:

root@bookworm:~# ll /sys/fs/cgroup/
total 0
drwxr-xr-x 2 root root  0 Nov 20 16:48 blkio
lrwxrwxrwx 1 root root 11 Nov 20 16:48 cpu -> cpu,cpuacct
lrwxrwxrwx 1 root root 11 Nov 20 16:48 cpuacct -> cpu,cpuacct
drwxr-xr-x 2 root root  0 Nov 20 16:48 cpu,cpuacct
drwxr-xr-x 2 root root  0 Nov 20 16:48 cpuset
drwxr-xr-x 6 root root  0 Nov 20 16:48 devices
drwxr-xr-x 2 root root  0 Nov 20 16:48 freezer
drwxr-xr-x 2 root root  0 Nov 20 16:48 hugetlb
drwxr-xr-x 6 root root  0 Nov 20 16:48 memory
lrwxrwxrwx 1 root root 16 Nov 20 16:48 net_cls -> net_cls,net_prio
drwxr-xr-x 2 root root  0 Nov 20 16:48 net_cls,net_prio
lrwxrwxrwx 1 root root 16 Nov 20 16:48 net_prio -> net_cls,net_prio
drwxr-xr-x 2 root root  0 Nov 20 16:48 perf_event
drwxr-xr-x 6 root root  0 Nov 20 16:48 pids
drwxr-xr-x 2 root root  0 Nov 20 16:48 rdma
drwxr-xr-x 6 root root  0 Nov 20 16:48 systemd
drwxr-xr-x 6 root root  0 Nov 20 16:48 unified

In my lab environment I assume this could be related to the old Buster host, which still uses cgroup v1. But it’s interesting that the CPU limit misbehaves the same way as on a Bullseye host.
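
For reference, a quick way to tell which hierarchy a container sees is the filesystem type mounted at /sys/fs/cgroup: cgroup2fs for the unified cgroup2 layout, tmpfs for the legacy layout with per-controller directories. That matches the listings above:

root@bullseye:~# stat -fc %T /sys/fs/cgroup/
cgroup2fs
root@bookworm:~# stat -fc %T /sys/fs/cgroup/
tmpfs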

How many processor entries do you get in /proc/cpuinfo?

Your output above suggests that you’re getting two processors, which is exactly what you’d expect on a working system.

LXCFS, which renders /proc/cpuinfo in containers, cannot alter the content of a CPU definition, so the physical characteristics of your CPU, like the fact that a socket has 8 cores, will still show through. The number of processor entries, however, should reflect the cgroup limit, and that count is ultimately what software parsing /proc/cpuinfo will rely on.
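
For example, counting the processor entries rather than the per-socket “cpu cores” field should return the cgroup-limited number:

root@bookworm:~# grep -c ^processor /proc/cpuinfo
2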

I see two processors, the same in both the Bullseye and the Bookworm container.

You know what… could this be a bug in htop? The reason I opened this topic in the first place was that I saw the full number of physical cores in htop’s output.

CPUs 0 and 1 are in use; all the others are not moving at all. That would indicate that there are indeed only two CPUs and that the limit is correctly applied, BUT htop (falsely) shows all cores.

htop on Bullseye: [screenshot]
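
As an extra sanity check (a sketch; the pid in the output is illustrative), the kernel-side affinity mask can be queried with taskset. The kernel doesn’t renumber CPUs, so on the lab host with cpuset.cpus = 6-7 I’d expect the real IDs:

root@bookworm:~# taskset -cp $$
pid 4242's current affinity list: 6,7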

It’s certainly possible that htop is using another data source for the number of CPUs.

There are around four different ways to get the CPU count, and LXCFS can only really cover two of those (the /proc and /sys text files). If a piece of software relies on syscalls like sysinfo or sched_getaffinity, then there’s nothing that LXCFS can do about it.
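
With a cpuset limit specifically, the affinity syscall happens to agree with the LXCFS-rendered files anyway, because the kernel enforces the CPU mask itself (illustrative sketch):

root@bookworm:~# nproc   # coreutils, uses sched_getaffinity(2)
2

A tool that derives the count from yet another source can still disagree.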

For sysinfo, LXD and Incus have a mechanism to intercept that system call and return the correct value, but doing that with pure LXC is overly complex.
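
In Incus, for example, that interception is an opt-in per-instance option (a sketch; host prompt illustrative, container name from the lab above):

root@host ~ # incus config set bookworm security.syscalls.intercept.sysinfo true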


It’s definitely an htop problem. Bookworm ships with htop 3.2.2; I compiled htop 3.2.1, launched it, and it shows 2 CPUs (the correct number defined by the cgroup limit) in the output.
Details here: htop on Debian 12 (Bookworm) showing all physical CPU cores - even with cgroup limits set


What a pleasant experience reading your diagnostic steps, Claudio. Excellent discovery!

Thanks for the detailed article and the issue you opened as a follow-up.
