I had to add the container=lxc environment variable to fix an issue with debootstrap which had the error:
mknod: /deb-test/target/test-dev-null: Operation not permitted
E: Cannot install into target '/deb-test/target' mounted with noexec or nodev
Adding that environment variable seems to fix that issue, but then when I run distrobuilder, I get this error:
Error: Failed to setup chroot: Failed to populate /dev: Failed to create "/dev/console": operation not permitted
This looks similar to what debootstrap was complaining about. Is there a way to work around this? Is it possible to use distrobuilder inside an LXD container?
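For anyone who wants to check their own environment, here is a minimal sketch of the kind of probe that seems to be behind the debootstrap error above: create a character device with the major/minor numbers of /dev/null and try to write to it. The function name can_mknod and the use of /tmp are my own choices for illustration, not something taken from debootstrap or distrobuilder.

```shell
# Returns 0 if a usable character device node can be created in the given
# directory, 1 otherwise (mknod refused, or the filesystem is mounted nodev).
can_mknod() {
  dir="${1:-.}"
  node="$dir/test-dev-null.$$"
  # "c 1 3" is the major/minor pair of /dev/null on Linux.
  if mknod -m 666 "$node" c 1 3 2>/dev/null &&
     { echo probe > "$node"; } 2>/dev/null; then
    rm -f "$node"
    return 0
  fi
  rm -f "$node"
  return 1
}

if can_mknod /tmp; then
  echo "mknod works here"
else
  echo "mknod is blocked or the filesystem is nodev"
fi
```

If this reports that mknod is blocked, tools that populate /dev themselves will most likely fail in the same way.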
Thanks for the information. From what you say, it seems that I’ll be unable to activate these features in this environment.
Follow-up question: would you say that it is in principle impossible to build a container image without these features? Or are there potential workarounds? I’m surprised that the namespace features don’t allow for this, since I assume that producing a container image does not in any way need to “affect” the host system. So why does it apparently need dangerous powers to build an innocent image?
Image building usually heavily relies on three things:
chroot (with separate /dev, /proc and /sys mounts)
creation of arbitrary device nodes (to populate /dev mostly)
mounting of disk images (iso, img, …)
Then when building VM images, you can add creation of disk images and partition tables to that list.
All of the above are potentially tricky in unprivileged containers:
Allowing mounting new copies of /proc or /sys allows for bypassing all apparmor path confinement.
Creation of arbitrary device nodes would allow even unprivileged containers to write to any device, including the host system’s disks and partitions.
Mounting of block devices or loop devices isn’t namespaced and requires full root access. It would also allow feeding arbitrary data to the kernel, potentially exploiting bugs in it and getting full root access to the host.
We have been working on some solutions to those, but it’s a pretty slow process, and the very narrow set of users makes it less of a priority:
The syscall interception combined with mknod emulation allows for “safe” device nodes to be created inside of unprivileged containers. This is sometimes sufficient to handle the mknod part.
Unprivileged containers generally don’t rely on apparmor for escape security so rules around /proc and /sys mounting can be relaxed (for now we do this if security.nesting is enabled).
Mounting of block devices is always a tricky one. We have mount interception which, combined with redirection to FUSE, can make for safe mounting of some filesystems at least. This however isn’t currently able to handle loop devices. For that, a separate initiative called loopfs was put together some years ago but hasn’t seen widespread traction in the kernel community.
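As a rough sketch of how the options above are switched on from the host, assuming you have access to the LXD host and a kernel and LXD version that support syscall interception: the container name "builder" and the helper name enable_interception are hypothetical, and the ext4=fuse2fs mapping assumes fuse2fs is installed inside the container.

```shell
# Sketch only: run on the LXD host, not inside the container.
enable_interception() {
  c="$1"
  # Relax the /proc and /sys mount rules for nested use.
  lxc config set "$c" security.nesting true &&
  # Let LXD emulate mknod for a small set of "safe" device nodes.
  lxc config set "$c" security.syscalls.intercept.mknod true &&
  # Intercept mount and redirect ext4 mounts to a FUSE implementation.
  lxc config set "$c" security.syscalls.intercept.mount true &&
  lxc config set "$c" security.syscalls.intercept.mount.fuse ext4=fuse2fs &&
  # Restart so the new syscall filter is applied.
  lxc restart "$c"
}

# Usage: enable_interception builder
```

None of this helps with loop devices, and whether it is enough for a given image build depends on which device nodes and filesystems the build actually needs.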
Thanks for the thorough and informative answer, I really appreciate it!
From your explanation, it sounds like syscall interception means it might be possible at some point to use distrobuilder to make a container image within an unprivileged container (with interception enabled).
I hit this today, also on a Chromebook. There isn’t currently an LXC image for the new AlmaLinux distro, so I spun up a CentOS 8 container and attempted to convert it to Alma using their conversion script, almalinux-deploy.sh. It failed on reinstalling the filesystem rpm, with the above “Invalid config” error. Thanks for the detailed explanation as to why this isn’t currently possible. I will just have to be patient.