Unprivileged container does not work in Ubuntu 22.04

You could try this:

Hi Thomas, thanks for your reply!

That post does look similar, I also upgraded Ubuntu from 20.04 to 22.04.

I also reckon that going back to cgroup v1/hybrid might work, but I want to make it work in v2. I have seen this workaround in different places, but I have not seen an explanation of “what’s wrong with the cgroup v2 setup”.
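For reference, a quick way to tell which cgroup layout a host is actually running is to look for the `cgroup.controllers` file. This is only a rough sketch: the function name is made up, and it assumes systemd's usual mount points (`/sys/fs/cgroup` for unified, `/sys/fs/cgroup/unified` for hybrid).

```python
from pathlib import Path


def cgroup_layout(root="/sys/fs/cgroup"):
    """Guess the cgroup layout from the files under `root`.

    Hypothetical helper for illustration; assumes systemd's usual mount points.
    """
    root = Path(root)
    # On a pure cgroup v2 host the v2 root is mounted directly at `root`.
    if (root / "cgroup.controllers").is_file():
        return "unified"
    # In systemd's hybrid mode the v2 hierarchy lives under `root/unified`.
    if (root / "unified" / "cgroup.controllers").is_file():
        return "hybrid"
    return "legacy or unknown"


if __name__ == "__main__":
    print(cgroup_layout())
```

An upgraded Ubuntu 22.04 host would normally be expected to report `unified`, but that is worth confirming rather than assuming.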

Update

My local build can set “+memory +pids” to subtree_control, but it still failed to mount cgroup.

And I do see “supports mount api” with my local build.

Original message

More context: could it be related to mount api?

I’m trying to build lxc from source. It doesn’t fully work, but it seems to get past the “mounting cgroup” step.

According to the error log, there was an error from here, which uses FSCONFIG_CMD_CREATE. I also saw “kernel supports mount api” in TRACE log.

However, I saw FSCONFIG_CMD_CREATE is not supported during meson setup, and I don’t see “kernel supports mount api” in the log.

Any thoughts @stgraber @brauner ?

Sorry I updated my previous message several times.

Now I think it is not a false alarm. I was confused because I had messed up my PATH.

I tried running my local build with and without systemd-run. In both cases I saw the following in the log.

TRACE    cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgfsng_delegate_controllers:3336 - Enabled "+memory +pids" controllers in the unified cgroup 11
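To double-check that log line from outside LXC, the cgroup v2 interface files can be read directly. The sketch below is not LXC's code: the function name is made up, and it only reads `cgroup.controllers` (what the cgroup may hand down) and `cgroup.subtree_control` (what is already enabled for its children) in whatever directory it is given.

```python
from pathlib import Path


def delegation_status(cgroup_dir, wanted=("memory", "pids")):
    """Report, per wanted controller, whether it is available and enabled.

    Hypothetical helper; it only reads the cgroup v2 interface files.
    """
    cgroup_dir = Path(cgroup_dir)
    available = set((cgroup_dir / "cgroup.controllers").read_text().split())
    enabled = set((cgroup_dir / "cgroup.subtree_control").read_text().split())
    return {
        name: {"available": name in available, "enabled": name in enabled}
        for name in wanted
    }
```

Calling `delegation_status()` on the parent of the container's cgroup should show both controllers as enabled if the delegation step really succeeded; the exact directory to pass depends on where systemd placed the session.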

Maybe related: I’ve been using sudo machinectl shell my-dev-user@ to enter a dev shell, and then I run my local build in that shell, with or without systemd-run.

I was able to narrow down a bit:

  • distro-installed lxc-start shows an error about “add +memory +pids to subtree_control”, then another error about mounting cgroup
  • my locally built lxc-start shows only the mounting error
  • mount api is not relevant, nothing changed if I force enable/disable it.
  • I can see lxc.init.cmd running if I set an empty lxc.mount.auto
    • but of course it’d complain about missing directories.
    • lxc.init.cmd=/bin/bash worked without mounting cgroup.

So I think the problem is something like “no permission to mount cgroup (v2) directories”.

Relevant question (since I’m not very familiar with cgroup):

How does “mounting cgroup v2 to child process/namespace” work? Can I do it without root? Is there a sample command that I can try?

@tomp I have a theory and a hacky solution.

I think the issue is indeed about “mounting cgroup2 in rootless containers”. I was able to get a shell by:

In the shell I tried mount -t cgroup2 none somewhere, but I always got the error “cannot mount … read only”. I’m not sure about the root cause though.

As for the hack, I found this bug and this PR relevant. Both are for runc.

I followed the same idea and applied the bind-mount in __cgroupfs_mount, which worked.

I’m not sure whether this would be a proper fix though.

One more note: systemd automatically creates directories in /sys/fs/cgroup, and I also needed to grant the container root permissions on them, otherwise the bind-mount would fail.

Interestingly, it seems that “setting permissions of cgroup directories” fixes the whole thing. I no longer need the bind-mount hack.

For example, the cgroup directory looks like this:

/sys/fs/cgroup/user.slice/user-1006.slice/user@1006.service/app.slice/lxc-my-container-0.scope

In my case I need to

  • manually call chmod o+x app.slice
  • in cgfsng.c apply o+x to lxc-my-container-0.scope, because this part is dynamically generated.

Then lxc-start just works, I don’t need any other hacks.
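To find which directories need the chmod, something like the following can walk the cgroup path and list every component that lacks the “other” execute bit. It is a rough sketch with a made-up function name, and it assumes, as described above, that the container root only gets the “other” permission bits on these directories.

```python
import os
import stat
from pathlib import Path


def dirs_missing_other_exec(leaf, stop_at="/sys/fs/cgroup"):
    """Return directories from `stop_at` down to `leaf` that lack o+x.

    Hypothetical helper; `leaf` must be located at or below `stop_at`.
    """
    leaf = Path(leaf)
    stop_at = Path(stop_at)
    # Collect the leaf and all of its ancestors, keep only those at or
    # below stop_at, and order them from the top of the tree downwards.
    chain = [p for p in (leaf, *leaf.parents)
             if p == stop_at or stop_at in p.parents]
    chain.reverse()
    # S_IXOTH is the search (execute) permission for "other" users.
    return [str(p) for p in chain if not os.stat(p).st_mode & stat.S_IXOTH]
```

Passing the full scope path from the example above as `leaf` should, in a setup like mine, list app.slice and the lxc-my-container-0.scope directory before the chmod and nothing afterwards.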

So does it sound like a proper fix or a hack? Any security concerns?

I’m not sure, I’m afraid.

I found a potential culprit: I have umask set to 0027 via pam_umask. Everything seems to work if I remove it.
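The arithmetic behind this is simple: the mode a new directory ends up with is the requested mode with the umask bits cleared. The sketch below assumes the directories are requested with mode 0755; I have not verified which mode systemd or LXC actually pass, so treat the numbers as an illustration only.

```python
import stat


def mode_after_umask(requested, umask):
    """Mode a newly created directory gets: requested bits minus umask bits."""
    return requested & ~umask


for umask in (0o022, 0o027):
    mode = mode_after_umask(0o755, umask)
    has_other_exec = bool(mode & stat.S_IXOTH)
    print(f"umask {umask:04o}: 0755 -> {mode:04o}, o+x = {has_other_exec}")
# umask 0022: 0755 -> 0755, o+x = True
# umask 0027: 0755 -> 0750, o+x = False
```

With 0027 the “other” bits are stripped completely, which would match the missing o+x on the cgroup directories.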

Meanwhile I’m also looking for a better solution without disabling pam_umask.

I think this could be a potential explanation of the post that you mentioned.

Looking ahead, I think it’d be great if we had at least one of:

  1. LXC checks that the container root has access to all cgroup directories, just like LXC checks the setuid bit.
  2. LXC shows a hint upon cgroup mounting errors.
  3. Maybe mention this in some relevant wiki/manual.

Otherwise this issue could be very cryptic.

Searching for “umask”, I was able to find a couple of existing issues, e.g. #2277 and #3100.

Apparently this also happened for cgroup v1, and it was “fixed” in pam-cgfs.

Now that cgroup v2 is handled by systemd (if I understood correctly), I wonder if this is a bug in LXC, or in systemd, or maybe not a bug in the first place?

Are you launching your unprivileged container as the host root user or as an unprivileged user?

unprivileged user

If you have a reproducer then please could you log the issue over at Issues · lxc/lxc · GitHub

Thanks

Sure, will do. :ok_hand:

Done. #4186

@tomp

You said:

It was a CGROUP2 bug/problem

Was there an actual BugID for this?

I think Thomas did not say that. It was from the original post.

Actually, was it your own post? :grinning:

I think you opened one, didn’t you? Could you link to it here?