tomp:
sudo update-grub
FWIW, after upgrading my Ubuntu 20.04 hosts to 22.04, all of my servers have the same issue when I try to create a CentOS 7 container:
```
lxc launch images:centos/7 centtie-7
```
gives me the error:
```
Error: The image used by this instance requires a CGroupV1 host system
```
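As far as I understand, that error is LXD refusing the centos/7 image because the host only exposes cgroup v2. A quick generic way to confirm which cgroup layout a host actually booted with (standard paths on a systemd host, nothing specific to my setup):
```
# "cgroup2fs" here means a pure cgroup v2 (unified) host; "tmpfs" means a hybrid/v1 layout
stat -fc %T /sys/fs/cgroup/

# On a unified host this file lists the active v2 controllers; it does not exist on a v1-only host
cat /sys/fs/cgroup/cgroup.controllers
```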
I’ve tried the various suggestions of changing the umask detailed in the LXC issue below (opened 08:48AM - 25 Aug 22 UTC):
# Required information
* Distribution: Ubuntu 22.04.1
* The output of
  * `lxc-start --version`
    * 5.0.0~git2209-g5a7b9ce67 (installed from ubuntu repo)
    * 5.0.0 (built from git master)
  * `lxc-checkconfig`
    * [lxc-checkconfig.txt](https://github.com/lxc/lxc/files/9422896/lxc-checkconfig.txt)
  * `uname -a`
    * `Linux wl-lab 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux`
  * `cat /proc/self/cgroup`
    * `0::/user.slice/user-1006.slice/session-240.scope`
# Issue description
After upgrading from Ubuntu 20.04 to Ubuntu 22.04.1, I can no longer start my rootless container.
I think it is because I have umask 0027 set via pam_umask; removing it makes the container start again.
My theory: systemd creates cgroup v2 directories without the o+x permission. The container root can access (+x permission) the leaf directory, but not some of the parent directories. Similar to #3669.
I posted [details](https://discuss.linuxcontainers.org/t/unprivileged-container-does-not-work-in-ubuntu-22-04/14904?u=lu_wang) to the forum.
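As a small standalone illustration of that theory (throwaway paths under `/tmp`, not the real cgroup tree): a parent directory without o+x blocks traversal for another uid even when the leaf itself is accessible.
```
# Parent lacks o+x (as a 0027 umask would produce), while the leaf is world-accessible
mkdir -p /tmp/parent/child
chmod 750 /tmp/parent
chmod 755 /tmp/parent/child

# A different unprivileged uid cannot traverse through the parent to reach the leaf
sudo -u nobody ls /tmp/parent/child    # => Permission denied
```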
# Steps to reproduce
## 0 Prepare a rootless container
Suppose that:
- `host-user` is a non-root user on the host.
- `container-root` is the root user inside the container, which is a subuid of `host-user`.
- The container is named `my-container`.
## 1 Set umask to 0027
Add `session optional pam_umask.so umask=0027` to `/etc/pam.d/common-session-noninteractive` and `/etc/pam.d/common-session`
- Make sure this line will take effect. [Reference](https://man.archlinux.org/man/core/pam/pam_umask.8.en#DESCRIPTION)
- In my case, just modifying `common-session-noninteractive` was enough.
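For reference, this is why the umask matters: directories created under umask 0027 come out mode 0750, i.e. without the o+x bit. A minimal shell illustration (generic, not systemd-specific):
```
# A directory created under umask 0027 gets mode 750 (no o+x)
( umask 0027; mkdir /tmp/umask-demo; stat -c %a /tmp/umask-demo )    # prints 750
```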
## 2 Reboot
## 3 Verify umask
- Find the pid of `systemd --user` that is running under `host-user`, say it is 12345.
- Run `cat /proc/12345/status | grep -i umask` to verify that the umask is 0027.
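If you prefer a one-liner, something like the following should do the same check (it assumes `pgrep` from procps is available and that `host-user` has exactly one `systemd --user` instance):
```
# Show the umask of host-user's "systemd --user" process in one step
grep -i umask "/proc/$(pgrep -u host-user -f 'systemd --user')/status"
```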
## 4 Start the container
The command I used:
```
systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- \
lxc-start -l DEBUG --logfile=/tmp/lxc.log -L /tmp/lxc1.log -F my-container
```
It should fail and show some errors like this:
```
lxc-start: my-container: cgroups/cgfsng.c: __cgfsng_delegate_controllers: 2953 Device or resource busy - Could not enable "+memory +pids" controllers in the unified cgroup 8
lxc-start: my-container: mount_utils.c: fs_attach: 255 Permission denied - Failed to finalize filesystem context 19
lxc-start: my-container: cgroups/cgfsng.c: __cgroupfs_mount: 1539 Permission denied - Failed to mount cgroup2 filesystem onto 18((null))
lxc-start: my-container: cgroups/cgfsng.c: cgfsng_mount: 1708 Permission denied - Failed to force mount cgroup filesystem in cgroup namespace
lxc-start: my-container: conf.c: lxc_mount_auto_mounts: 851 Permission denied - Failed to mount "/sys/fs/cgroup"
lxc-start: my-container: conf.c: lxc_setup: 4396 Failed to setup remaining automatic mounts
lxc-start: my-container: start.c: do_start: 1275 Failed to setup container "my-container"
lxc-start: my-container: sync.c: sync_wait: 34 An error occurred in another process (expected sequence number 4)
lxc-start: my-container: start.c: __lxc_start: 2074 Failed to spawn container "my-container"
lxc-start: my-container: tools/lxc_start.c: main: 306 The container failed to start
lxc-start: my-container: tools/lxc_start.c: main: 311 Additional information can be obtained by setting the --logfile and --logpriority options
```
Note: I only saw `Device or resource busy - Could not enable "+memory +pids" controllers in the unified cgroup 8` with the distro version. I didn't see it with the git master version.
## 5 Check cgroup v2 directories
The cgroup directory for my container looks like this:
```
/sys/fs/cgroup/user.slice/user-1006.slice/user@1006.service/app.slice/lxc-my-container-0.scope
```
Verify the `o+x` permission bit; in my case both `app.slice` and `lxc-my-container-0.scope` were missing it, so the container root could not access them. (A `namei` example follows the notes below.)
Note that:
- Your cgroup directory should look similar, but some parts (e.g. uid, slice, container name) may differ slightly.
- The scope disappears once the container is stopped (or once lxc-start fails), so checking the slices should be enough.
- Or you can [start the container with a shell](https://discuss.linuxcontainers.org/t/unprivileged-container-does-not-work-in-ubuntu-22-04/14904/10?u=lu_wang) and check the cgroup directory on the host.
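A convenient way to check every component of that path at once is `namei` from util-linux; the path below is the example path from this report, so adjust the uid, slice and scope names to your own system:
```
# Lists owner, group and mode for every component of the path, so a missing o+x is easy to spot
namei -l /sys/fs/cgroup/user.slice/user-1006.slice/user@1006.service/app.slice/lxc-my-container-0.scope
```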
# Other information
- The [forum post](https://discuss.linuxcontainers.org/t/unprivileged-container-does-not-work-in-ubuntu-22-04/14904?u=lu_wang) with my debugging process
- Related issues/PR:
- #3669
- #2277
- #3100
- [Same issue fixed for cgroup v1](https://github.com/lxc/lxc/commit/4b088194d63f5f28ee671e30c7be58e8800c5b63)
- [My thoughts](https://discuss.linuxcontainers.org/t/unprivileged-container-does-not-work-in-ubuntu-22-04/14904/13?u=lu_wang)
I had also tried changing the line in my /etc/default/grub from:
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
```
to:
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash systemd.unified_cgroup_hierarchy=false"
```
and then doing:
```
sudo grub-mkconfig -o /boot/grub/grub.cfg
```
and then rebooting the host.
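For anyone trying the same thing, a quick way to confirm whether the parameter actually made it onto the booted kernel command line (note that files under `/etc/default/grub.d/` are sourced after `/etc/default/grub` and can override it, as the update-grub output further down shows):
```
# After the reboot, the flag should show up on the booted kernel command line
grep -o 'systemd.unified_cgroup_hierarchy=[^ ]*' /proc/cmdline || echo "flag not on kernel cmdline"
```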
None of those helped alleviate the problem. Am I missing a step somewhere? I have no issue creating this container on an Ubuntu 20.04 host, and I can create the latest Ubuntu and Debian containers fine. I haven’t tried really old versions besides CentOS 7 yet.
I’m running LXD version 5.8 on all the servers. `uname -a` on my Arm host reports `5.15.0-1026-aws 2022 aarch64 aarch64 aarch64 GNU/Linux`, and on one of my x64 hosts with the same issue it reports `5.15.0-50-generic 2022 x86_64 x86_64 x86_64 GNU/Linux`.
Forgot to mention, I also tried `sudo update-grub`, and the output is:
```
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/40-force-partuuid.cfg'
Sourcing file `/etc/default/grub.d/50-cloudimg-settings.cfg'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
GRUB_FORCE_PARTUUID is set, will attempt initrdless boot
Found linux image: /boot/vmlinuz-5.15.0-1026-aws
Found initrd image: /boot/initrd.img-5.15.0-1026-aws
Found linux image: /boot/vmlinuz-5.15.0-1022-aws
Found initrd image: /boot/initrd.img-5.15.0-1022-aws
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
Adding boot menu entry for UEFI Firmware Settings ...
done
```
The other hosts that have the issue are all physical machines; this is the only one that is a virtual machine. They all return similar output.