My customer installed several updates at once and ended up with a cluster that can't start any containers. The instance startup log looks like this:
lxc [instance_name] 20260516213511.413 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgroup_tree_create:550 - File exists - Failed to create monitor cgroup 19(lxc.monitor.infra_dhcp-07-1)
lxc [instance_name] 20260516213511.616 ERROR cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_enter:1590 - No space left on device - Failed to enter cgroup 33
lxc [instance_name] 20260516213511.617 ERROR start - ../src/lxc/start.c:__lxc_start:2235 - Failed to enter monitor cgroup
lxc [instance_name] 20260516213511.619 ERROR lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc [instance_name] 20260516213511.247 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_destroy:422 - Uninitialized limit cgroup
lxc [instance_name] 20260516213511.266 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_destroy:658 - No space left on device - Failed to move monitor 1792355 to "lxc.pivot"
The cluster is built on RPi 4 SBCs running Ubuntu 24.04 as the host OS. It uses Ceph storage (a microceph cluster running on the same nodes).
The most recently installed updates are:
- microceph updated to 19.2.3 (stable)
- incus 7.0
- latest Ubuntu updates.
The host filesystems, the Ceph RBDs, and CephFS all have plenty of free space.
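Since the disks are not full, the "No space left on device" text in the log is attached to the cgroup operations ("Failed to enter cgroup", "Failed to move monitor") rather than to a storage write. In case it helps with diagnosis, here is a minimal sketch for collecting cgroup state from an affected node. It rests on assumptions I have not confirmed on these nodes: that the host uses a cgroup v2 unified hierarchy mounted at `/sys/fs/cgroup`, and that leftover `lxc.*` cgroups (such as the `lxc.monitor.*` one from the "File exists" warning) can be found somewhere below it. The function name `dump_cgroup` and the list of files are my own illustrative choices, not an official checklist.

```python
from pathlib import Path

# Interface files that may exist in a cgroup v2 directory, depending on
# which controllers are enabled. The selection is a guess at what is
# relevant here, not an official checklist.
CGROUP_FILES = [
    "cgroup.controllers",
    "cgroup.subtree_control",
    "cgroup.max.descendants",
    "cgroup.max.depth",
    "cgroup.stat",
    "cpuset.cpus.effective",
    "cpuset.mems.effective",
    "pids.current",
    "pids.max",
]


def dump_cgroup(cgroup_dir):
    """Return {file name: contents} for one cgroup directory.

    Missing or unreadable files are reported as a placeholder string
    instead of raising, so the dump always completes.
    """
    state = {}
    for name in CGROUP_FILES:
        path = Path(cgroup_dir) / name
        try:
            state[name] = path.read_text().strip()
        except OSError as exc:
            state[name] = f"<unreadable: {exc.strerror}>"
    return state


if __name__ == "__main__":
    root = Path("/sys/fs/cgroup")  # assumed cgroup v2 mount point
    # The root cgroup plus every lxc.* directory found below it
    # (lxc.monitor.*, lxc.payload.*, lxc.pivot).
    targets = [root] + sorted(p for p in root.rglob("lxc.*") if p.is_dir())
    for target in targets:
        print(f"== {target}")
        for name, value in dump_cgroup(target).items():
            print(f"{name}: {value}")
```

Running this as root on one of the failing nodes prints one section per cgroup directory, which I can attach here if it is useful.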
I have no idea what is causing the container start failures. I would appreciate any suggestions that could help me resolve this issue.