Problem at boot

I have 3 LXC containers on my Debian 11 (lxc version 4.0.6).

When the system boots, the first container (the first in the list under the /etc/lxc/auto directory) sometimes fails to start; at other times it starts fine.

I have enabled logging in the config, so I could capture these lines after an unsuccessful start:

lxc vm-mysql ... ERROR    cgroup2_devices - cgroups/cgroup2_devices.c:bpf_program_load_kernel:348 - Operation not permitted - Failed to load bpf program: (null)
lxc vm-mysql ... ERROR    cgroup2_devices - cgroups/cgroup2_devices.c:bpf_program_cgroup_attach:382 - Unknown error -1 - Failed to load bpf program
lxc vm-mysql ... ERROR    cgfsng - cgroups/cgfsng.c:cgfsng_devices_activate:3024 - Cannot allocate memory - Failed to attach bpf program
lxc vm-mysql ... ERROR    start - start.c:lxc_spawn:1834 - Failed to setup cgroup2 device controller limits
lxc vm-mysql ... ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:859 - Received container state "ABORTING" instead of "RUNNING"
lxc vm-mysql ... ERROR    start - start.c:__lxc_start:1999 - Failed to spawn container "vm-mysql"

If I log in after boot, I can start the container manually; I can’t reproduce the error then.

What does it mean? How can I fix it?

@brauner any idea why the ebpf policy would sometimes fail at boot?


The bpf cgroup driver for LXC has been reworked completely, and the next point release for 4.0.* will include it. However, I suspect your error could also be caused by too low a limit for net.core.bpf_jit_limit. Here’s an excerpt from our production setup:

net.core.bpf_jit_limit	3000000000

This is a limit on the size of eBPF JIT allocations which is usually set to PAGE_SIZE * 40000. When your kernel is compiled with CONFIG_BPF_JIT_ALWAYS_ON=y, /proc/sys/net/core/bpf_jit_enable is set to 1 and can’t be changed. On such kernels, the eBPF JIT compiler treats a failure to JIT-compile a bpf program, such as a seccomp filter, as fatal, whereas another kernel would continue. On such kernels the limit for eBPF jitted programs needs to be increased significantly.
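
For anyone who wants to check this on their own machine, here is a minimal sketch in shell. It only compares the running limit with a target and prints the commands that would raise it; it changes nothing itself. The target is simply the value quoted above from the production setup, not a recommendation, and the file name 99-bpf-jit.conf is a hypothetical choice. Reading the limit normally requires root.

```shell
#!/bin/sh
# Sketch: compare the running net.core.bpf_jit_limit with a target value.
# Assumption: the target is the production value quoted in the post above.
target=3000000000

# below_target CURRENT TARGET -> exit status 0 when CURRENT < TARGET
below_target() {
    [ "$1" -lt "$2" ]
}

limit_file=/proc/sys/net/core/bpf_jit_limit
if [ -r "$limit_file" ]; then
    current=$(cat "$limit_file")
    if below_target "$current" "$target"; then
        echo "current limit $current is below $target; to raise it (as root):"
        # Takes effect immediately, lost on reboot:
        echo "  sysctl -w net.core.bpf_jit_limit=$target"
        # Persistent across reboots (hypothetical file name):
        echo "  echo 'net.core.bpf_jit_limit = $target' > /etc/sysctl.d/99-bpf-jit.conf"
    else
        echo "current limit $current is already at or above $target"
    fi
else
    echo "cannot read $limit_file (not root, or the kernel lacks this sysctl)"
fi
```

Whether raising the limit actually cures the boot-time failure is not established here; the sketch only makes the suggested check easy to repeat.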

Hi @brauner,

Thanks for the reply.

Here is my setting:

# sysctl -a | grep net.core.bpf_jit_limit
net.core.bpf_jit_limit = 264241152

If I understand you correctly, I have to increase this value. But then I don’t understand why the container sometimes does start with this value, since the value is constant. And why do the other containers always start? Only the first container fails, and only at boot time.

Thanks again.

A quick note for the limit value:

This is a limit on the size of eBPF JIT allocations which is usually set to PAGE_SIZE * 40000.

# getconf PAGESIZE
# echo $((4096*40000))

so the default value (264241152) is higher than the calculated value (163840000).
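
The comparison can be spelled out as a small shell sketch. The page size is hard-coded to 4096 to match the echo command above (on a live system it would come from getconf PAGESIZE), and the observed value is the one from the sysctl output earlier in the thread. The variable names are only illustrative.

```shell
#!/bin/sh
# Recompute the "PAGE_SIZE * 40000" figure and compare it with the
# value that sysctl reported on this machine.
page_size=4096                      # assumed, as in the echo command above
calculated=$((page_size * 40000))   # the formula quoted from the reply
observed=264241152                  # net.core.bpf_jit_limit from sysctl

difference=$((observed - calculated))
pages=$((observed / page_size))     # observed limit expressed in pages

echo "calculated: $calculated"
echo "observed:   $observed"
echo "difference: $difference"
echo "observed limit in pages: $pages"
```

So on this kernel the default does not come from the PAGE_SIZE * 40000 formula; the observed limit works out to 64512 pages rather than 40000.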