Hi all,
I'm hoping for some thoughts/comments on my recent experience with LXC (not LXD or Incus).
I have worked with LXC/LXD/Incus on various Ubuntu versions and on Proxmox, and decided to apply some of that experience to a Rocky Linux 9 (RL) host. I created an LXC image from a VM (for a variety of reasons I need a RHEL5 image), but I can’t get it to run on RL. It keeps failing with this error:
lxc-start centos5-base 20240924185128.959 NOTICE start - ../src/lxc/start.c:start:2194 - Exec'ing "/sbin/init"
lxc-start centos5-base 20240924185128.959 ERROR start - ../src/lxc/start.c:start:2197 - No such file or directory - Failed to exec "/sbin/init"
I first thought this was due to the image and the way I got it, so I decided to move the image to one of my Ubuntu 24.04 hosts (where I have running/working LXC images). To my surprise, it worked fine there.
After lots of browsing I got no closer to explaining this phenomenon, so I decided to ditch RL and try the same image on a fresh Ubuntu 24.04 host, but (also somewhat to my surprise) it failed on this Ubuntu host with the exact same error.
I can share the logs if anybody is interested in reading lines and lines of output, but I think the fundamental difference between the “working” and “non-working” hosts is expressed in this output in the log:
On the “working” host:
lxc-start centos5-base 20240924010944.124 INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:unified_hierarchy_delegated:3467 - Permission denied - The cgroup.threads file is not writable, skipping unified hierarchy
On the “non-working” hosts:
lxc-start centos5-base 20240924183451.939 INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_create:1391 - The monitor process uses "lxc.monitor.centos5-base" as cgroup
lxc-start centos5-base 20240924183451.939 ERROR cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgfsng_delegate_controllers:3341 - Device or resource busy - Could not enable "+memory +pids" controllers in the unified cgroup 13
lxc-start centos5-base 20240924183451.947 INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_create:1499 - The container process uses "lxc.payload.centos5-base" as inner and "lxc.payload.centos5-base" as limit cgroup
My conclusion: for some reason the image works on the host where the unified hierarchy is not enabled/detected/used (per the permission error). That would make some sense, since older guest distributions reportedly don’t work (properly) on “pure” cgroup2 hosts (at least that’s what I gathered from various other/older posts). I have a hard time keeping all the terminology straight in my head, though, so that may not be correct.
My questions are:
- What can I look for on my “working” machine that would cause the “Permission denied” error above? My browsing to date hasn’t given me many clues to go by.
- What tricks are there to ascertain whether a host is running cgroup1, cgroup2, or some combination of the two? I assume my RL and “fresh” Ubuntu 24.04 hosts are cgroup2 and my “working” host is either cgroup1 or some combination of the two, but I don’t know how to tell the difference.
Any thoughts on how to assess the state of these hosts with respect to the “cgroup-ness” would be appreciated.
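In case it helps frame the question, here is a rough Python sketch of the kind of check I have in mind. It rests on a few assumptions of mine rather than on anything LXC documents: that a filesystem type of `cgroup` in /proc/mounts means a v1 hierarchy and `cgroup2` means the unified one, that the `0::` line in /proc/self/cgroup names my own unified cgroup, and that the cgroup.threads file from the log lives in that directory. The function names are made up for illustration.

```python
import os


def scan_cgroup_mounts(mounts_text):
    """Return (has_v1, v2_mount_point) from /proc/mounts-style text."""
    has_v1 = False
    v2_mount = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        if fs_type == "cgroup":        # a v1 (legacy) hierarchy
            has_v1 = True
        elif fs_type == "cgroup2" and v2_mount is None:
            v2_mount = mount_point     # first unified hierarchy found
    return has_v1, v2_mount


def classify_cgroup_mode(has_v1, v2_mount):
    """Name the setup; the labels are informal, not official terms."""
    if has_v1 and v2_mount:
        return "hybrid (cgroup1 + cgroup2)"
    if v2_mount:
        return "cgroup2 only (unified)"
    if has_v1:
        return "cgroup1 only (legacy)"
    return "no cgroup filesystem mounted"


def own_unified_cgroup(proc_cgroup_text):
    """Return the path from the '0::' line of /proc/self/cgroup, or None."""
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


if __name__ == "__main__" and os.path.exists("/proc/self/cgroup"):
    with open("/proc/mounts") as f:
        has_v1, v2_mount = scan_cgroup_mounts(f.read())
    print("mode:", classify_cgroup_mode(has_v1, v2_mount))

    with open("/proc/self/cgroup") as f:
        rel = own_unified_cgroup(f.read())
    if v2_mount and rel is not None:
        # Assumption: this is the file the "not writable" log line refers to.
        threads = os.path.join(v2_mount, rel.lstrip("/"), "cgroup.threads")
        print(threads,
              "exists:", os.path.exists(threads),
              "writable:", os.access(threads, os.W_OK))
```

I would run this as the same user that runs lxc-start on each host, since the writability result depends on who is asking.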
Thanks!