I have 3 LXC containers on my Debian 11 (lxc version 4.0.6).
When the system boots, the first container (the first one listed under the /etc/lxc/auto directory) sometimes fails to start. At other times it starts fine.
I have enabled logging in the config, so I was able to capture these lines after a failed start:
lxc vm-mysql ... ERROR cgroup2_devices - cgroups/cgroup2_devices.c:bpf_program_load_kernel:348 - Operation not permitted - Failed to load bpf program: (null)
lxc vm-mysql ... ERROR cgroup2_devices - cgroups/cgroup2_devices.c:bpf_program_cgroup_attach:382 - Unknown error -1 - Failed to load bpf program
lxc vm-mysql ... ERROR cgfsng - cgroups/cgfsng.c:cgfsng_devices_activate:3024 - Cannot allocate memory - Failed to attach bpf program
lxc vm-mysql ... ERROR start - start.c:lxc_spawn:1834 - Failed to setup cgroup2 device controller limits
lxc vm-mysql ... ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:859 - Received container state "ABORTING" instead of "RUNNING"
lxc vm-mysql ... ERROR start - start.c:__lxc_start:1999 - Failed to spawn container "vm-mysql"
If I log in after boot, I can start the container manually without any problem, so I can’t reproduce this error.
The bpf cgroup driver for LXC has been reworked completely, and the next point release for 4.0.* will include it. That said, I think your error could also be caused by net.core.bpf_jit_limit being set too low. Here’s an excerpt from our production setup:
net.core.bpf_jit_limit 3000000000
This is a limit on the size of eBPF JIT allocations, which is usually set to PAGE_SIZE * 40000. When your kernel is compiled with CONFIG_BPF_JIT_ALWAYS_ON=y, /proc/sys/net/core/bpf_jit_enable is set to 1 and can’t be changed. On such kernels, a failure to JIT-compile a bpf program, such as a seccomp filter, is treated as fatal, whereas another kernel would continue. On such kernels, the limit for eBPF jitted programs needs to be increased significantly.
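To make the numbers concrete, here is a minimal sketch that works out PAGE_SIZE * 40000 for the running system and prints the two sysctl values mentioned above. The helper names `default_bpf_jit_limit` and `show` are made up for this example, and the actual default on your kernel may differ from this formula, so treat the output as a rough reference only.

```shell
#!/bin/sh
# Hypothetical helper: compute the "usual" value described above,
# PAGE_SIZE * 40000. Takes an optional page size in bytes; otherwise
# asks the system via getconf.
default_bpf_jit_limit() {
    page_size=${1:-$(getconf PAGESIZE)}
    echo $((page_size * 40000))
}

# Hypothetical helper: print a file's content, or "unavailable" if it
# cannot be read (bpf_jit_limit is typically readable by root only).
show() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unavailable"
    fi
}

echo "PAGE_SIZE * 40000: $(default_bpf_jit_limit)"
echo "bpf_jit_enable:    $(show /proc/sys/net/core/bpf_jit_enable)"
echo "bpf_jit_limit:     $(show /proc/sys/net/core/bpf_jit_limit)"
```

With 4096-byte pages the formula gives 163840000. If you decide to raise the limit, `sysctl -w net.core.bpf_jit_limit=3000000000` changes it at runtime, and a line `net.core.bpf_jit_limit = 3000000000` in a file under /etc/sysctl.d/ should make it persistent. The value is simply the one from our setup, not a recommendation for every system.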
# sysctl -a | grep net.core.bpf_jit_limit
net.core.bpf_jit_limit = 264241152
If I understand you correctly, I have to increase this value. But then I don’t understand why the container sometimes does start with this value, given that the value is constant. And why do the other containers always start? Only the first container fails, and only at boot time.