Cannot start unprivileged container on Debian 11

Following the example in /usr/share/doc/lxc/README.Debian.gz on Debian 11, under "Unprivileged containers", fails with Failed to mount "proc" and an AppArmor denial in dmesg, even though the configuration sets the AppArmor profile to unconfined.

$ cat test_config
  # My subuids are 100000:65536
  lxc.idmap = u 0 100000 65536
  lxc.idmap = g 0 100000 65536
  lxc.mount.auto = proc:mixed sys:ro cgroup:mixed
  lxc.apparmor.profile = unconfined

$ systemd-run --scope --quiet --user --property=Delegate=yes lxc-start --logfile /dev/stderr -f test_config -n machine
lxc-start machine 20210830065007.367 ERROR    utils - utils.c:safe_mount:1204 - Permission denied - Failed to mount "proc" onto "/proc"
lxc-start machine 20210830065007.367 ERROR    conf - conf.c:lxc_mount_auto_mounts:681 - Permission denied - Failed to mount "proc" on "/proc" with flags 14
lxc-start machine 20210830065007.367 ERROR    conf - conf.c:lxc_setup:3330 - Failed to setup first automatic mounts
lxc-start machine 20210830065007.367 ERROR    start - start.c:do_start:1218 - Failed to setup container "machine"
[snip]

# dmesg | tail
[snip unrelated]
[ 2127.458104] audit: type=1400 audit(1630306207.363:40): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="/usr/bin/lxc-start" name="/proc/" pid=3286 comm="lxc-start" fstype="proc" srcname="proc" flags="rw, nosuid, nodev, noexec"

This suggests the kernel overmounting protection is kicking in.

To confirm that’s the issue, as root on your system, mount another copy of proc somewhere else, for example:

  • mkdir /dev/.proc
  • mount -t proc proc /dev/.proc

Once that’s done, try starting your container again as your user.

Thank you for trying to pinpoint the issue. The mount command succeeded without any dmesg or console output, but the container still fails, with unchanged lxc and dmesg outputs, as above.

To rule out machine-specific issues, this was also reproduced on a Debian live DVD.

At some point Debian introduced an additional sysctl to restrict user namespaces for unprivileged users. Maybe they still do that, and that’s what’s getting in the way here?
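
One way to check this, as a rough sketch: to the best of my knowledge the Debian-specific knob is kernel.unprivileged_userns_clone, exposed as /proc/sys/kernel/unprivileged_userns_clone. The helper name userns_status and its optional path argument are made up for illustration; the argument only exists so the check can be tried against a sample file.

```shell
#!/bin/sh
# Report whether the (assumed) Debian sysctl for unprivileged user
# namespaces is present, and if so whether it is switched on.
userns_status() {
    knob=${1:-/proc/sys/kernel/unprivileged_userns_clone}
    if [ ! -r "$knob" ]; then
        # Kernels without the Debian patch do not have this file at all.
        echo "absent"
    elif [ "$(cat "$knob")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

userns_status
```

If this prints "disabled", that would point at the sysctl; "enabled" or "absent" would suggest looking elsewhere.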

I have asked the Debian people at https://bugs.debian.org/993391

  1. Are the above configuration file and systemd-run command sufficient to run an unprivileged container? The configuration file is copied verbatim; it is not an extract.

  2. Is an unprivileged container expected to work when using / as root? Or must the container filesystem be made by the non-root user?

Oh, sorry, it looked like it was an extract. So you’re indeed instructing LXC to use your existing rootfs as the unprivileged container’s rootfs.

This isn’t going to work and isn’t something that we support with LXC.

That kind of setup is very problematic as your unprivileged user doesn’t own any of the mount table entries from the existing system and on top of that, you’d get a bunch of conflicts on unix sockets, lock files, … There’s also the issue that your container would not be able to correctly see the ownership of any file on your system’s rootfs (everything would show up as nobody:nogroup) and paths that are normally restricted for only root to access would be completely unreachable from the container.
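
To make the ownership point concrete, here is a small sketch of the arithmetic behind an idmap line such as "u 0 100000 65536". The function name container_uid is hypothetical, and 65534 is assumed as the overflow uid (the usual default, readable from /proc/sys/kernel/overflowuid); it is what shows up as nobody.

```shell
#!/bin/sh
# Hypothetical helper: translate a host uid into the uid seen inside the
# container, for a single idmap entry "u <ns_start> <host_start> <count>".
container_uid() {
    host_uid=$1; ns_start=$2; host_start=$3; count=$4
    if [ "$host_uid" -ge "$host_start" ] && [ "$host_uid" -lt $((host_start + count)) ]; then
        # Inside the mapped range: shift by the offset between the two starts.
        echo $((host_uid - host_start + ns_start))
    else
        # Unmapped ids are reported as the overflow uid (assumed 65534 here).
        echo 65534
    fi
}

container_uid 0 0 100000 65536       # host root, which owns most of /
container_uid 100000 0 100000 65536  # first id of the delegated range
```

With the mapping from the test_config above, files owned by host root fall outside the range, so the container sees them as the overflow id, while host uid 100000 becomes root inside the container.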

So the short version is that for unprivileged containers to work, you really need a rootfs which is separate from your system’s and which is fully owned by the uids and gids that you’re assigning to the container.
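
A quick way to sanity-check a candidate rootfs against that requirement could look like the sketch below. The function name foreign_owned is made up, it relies on GNU find's numeric "-uid -N" / "-uid +N" comparisons, and it only looks at uids (a matching gid check with -gid would be needed as well).

```shell
#!/bin/sh
# Hypothetical helper: count entries under a rootfs whose owner uid falls
# outside the range [start, start + count).
foreign_owned() {
    rootfs=$1; start=$2; count=$3
    last=$((start + count - 1))
    # GNU find: "-uid -N" matches uids below N, "-uid +N" matches uids above N.
    find "$rootfs" \( -uid "-$start" -o -uid "+$last" \) | wc -l
}

# Example call (the path is a placeholder, not taken from this thread):
#   foreign_owned /path/to/container/rootfs 100000 65536
```

A result of 0 means every entry is owned by a uid inside the range assigned to the container; anything else counts the entries that would need fixing.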

Thank you very much. Keep up the good work.