A few questions regarding security.idmap.isolated and security.nesting

Hello, I’ve been using LXD for a few years now, mainly unprivileged containers.
I have been looking to use some unprivileged nested containers.
At the moment I have security.idmap.isolated: "true" set in the default profile for all of my containers. In /etc/subuid and /etc/subgid I have root:65536:9934465.

The problem is, I cannot seem to get the nested containers to use different UIDs; they all seem to share the parent container’s UIDs.

Also, when running lxd init for the first time in the nested container, I get the prompt “Would you like to have your containers share their parent’s allocation?”
If I choose yes, the container seems to launch, but security.idmap.isolated does not appear to take effect, even though it is also set to true in the default profile of the nested container. If I choose no, the containers refuse to even start, and I get this error:
“Error: Failed instance creation: Failed creating instance record: Failed initialising instance: Invalid config: No uid/gid allocation configured. In this mode, only privileged containers are supported”

I would rather not run privileged containers if I don’t have to. I am at a bit of a loss trying to understand this whole uid/gid mapping stuff.

Basically, I am just trying to completely isolate the containers from each other, even when nesting. I’ve looked at various guides, but none of them use isolated idmaps, and they are greatly confusing me.

I could really use some help with this. I tried using the tool “uidmapviz” but it does not want to compile at all.

Any help would be greatly appreciated.

Setting security.idmap.isolated on your host means that each container gets its own distinct range of 65536 (the default) UIDs and GIDs.

If you want to run nested containers inside of those containers, you’ll need to allocate far more than 65536 uid/gid on those containers. You can do that by setting security.idmap.size to a larger value on those containers that will host nested containers.
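To make the sizing concrete, here is a minimal sketch. It assumes each isolated nested container takes the default 65536 IDs and that the hosting container keeps the first 65536 for itself. The helper name idmap_size_for is made up for illustration, and incus is the container name used later in this thread. The commands are only printed, not executed; on an LXD host the client would be lxc rather than incus.

```shell
#!/bin/sh
# Hypothetical helper: IDs a container needs in order to keep 65536 for
# itself and give a distinct 65536-ID range to N isolated nested containers.
idmap_size_for() {
    echo $(( ($1 + 1) * 65536 ))
}

# Room for 9 nested containers gives 655360, the value used in this thread.
size=$(idmap_size_for 9)

# Printed rather than run; "incus" is the container name from this thread.
echo "incus config set incus security.idmap.size $size"
echo "incus restart incus"
```

The restart is there on the assumption that a changed ID map is only applied the next time the container starts.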

Thank you for the response. I forgot to mention it, but I have also tried what you suggested. I went with security.idmap.size: "655360". Should I be choosing yes or no when prompted about the parent container mapping?

You should say no in that case.

I have indeed tested using “No” for the parent allocation. So basically I have created a container called incus.

The relevant things it has set are
security.idmap.size: "655360"
security.nesting: "true"
security.idmap.isolated: "true" (set in the default profile)

The container starts just fine, and I then run incus admin init inside it.
I choose “No” at the parent allocation prompt. Everything else is left at the default.

Then I go to run this command inside of the incus container.
incus launch images:ubuntu/22.04 u1 -c security.idmap.isolated=true


Creating u1
Starting u1
Error: Failed to run: /usr/sbin/incusd forkstart u1 /var/lib/incus/containers /var/log/incus/u1/lxc.conf: exit status 1
Try incus info --show-log local:u1 for more info


incus info --show-log local:u1
Name: u1
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2023/10/20 00:09 UTC
Last Used: 2023/10/20 00:09 UTC

Log:

lxc u1 20231020000928.941 ERROR conf - …/lxc-5.0.3/src/lxc/conf.c:lxc_map_ids:3701 - newuidmap failed to write mapping "": newuidmap 6359 0 165536 65536
lxc u1 20231020000928.941 ERROR start - …/lxc-5.0.3/src/lxc/start.c:lxc_spawn:1788 - Failed to set up id mapping.
lxc u1 20231020000928.941 ERROR lxccontainer - …/lxc-5.0.3/src/lxc/lxccontainer.c:wait_on_daemonized_start:879 - Received container state "ABORTING" instead of "RUNNING"
lxc u1 20231020000928.942 ERROR start - …/lxc-5.0.3/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "u1"
lxc u1 20231020000928.942 WARN start - …/lxc-5.0.3/src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 47 for process 6359
lxc 20231020000928.964 ERROR af_unix - …/lxc-5.0.3/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20231020000928.964 ERROR commands - …/lxc-5.0.3/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

Can you show cat /proc/self/uid_map and cat /proc/self/gid_map inside of the container?

Also, just checking: does that container have an /etc/subuid or /etc/subgid file? If those exist, Incus will respect them, which can cause issues.

incus ~ # cat /proc/self/uid_map
0 3866624 655360
incus ~ # cat /proc/self/gid_map
0 3866624 655360

As for /etc/subuid and /etc/subgid, at the moment they do not exist.
They did exist before, but had no entry for root, and I deleted them as part of my testing.
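For reading a uid_map or gid_map line like the one in this thread, the following is a small sketch. Each line holds three fields: the first ID inside the namespace, the first ID it maps to in the parent namespace, and the number of IDs. The function name describe_idmap is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "inside-start parent-start count" lines from
# stdin and prints both ID ranges with inclusive end points.
describe_idmap() {
    while read -r inside outside count; do
        echo "inside $inside-$((inside + count - 1)) -> parent $outside-$((outside + count - 1)) ($count ids)"
    done
}

# The line reported in this thread:
echo "0 3866624 655360" | describe_idmap
# prints: inside 0-655359 -> parent 3866624-4521983 (655360 ids)

# Inside a container, the live map can be fed in the same way:
#   describe_idmap < /proc/self/uid_map
```

So the container named incus sees IDs 0 through 655359, which is the pool that any nested containers have to be carved out of.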

Okay, make sure the uidmap package isn’t installed, then do systemctl restart incus and see if that helps.

I’m currently testing this in a Gentoo container, and there is no package called uidmap installed here. I would be glad to test this in an Ubuntu container, but I was not able to get snapd started there.

Okay, then look for the newuidmap and newgidmap binaries. If those exist inside your container, you’ll either need to get rid of them or put valid /etc/subuid and /etc/subgid files in place.
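A quick way to check both conditions from a shell inside the container could look like the sketch below. The function name check_idmap_helpers is made up, and it only searches the current shell's PATH, which is assumed here to be close enough to what the daemon sees.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the shadow ID-mapping binaries are
# on PATH and whether the subordinate ID files exist.
check_idmap_helpers() {
    for bin in newuidmap newgidmap; do
        if path=$(command -v "$bin"); then
            echo "$bin: found at $path"
        else
            echo "$bin: not found"
        fi
    done
    for file in /etc/subuid /etc/subgid; do
        if [ -e "$file" ]; then
            echo "$file: present"
        else
            echo "$file: missing"
        fi
    done
}

check_idmap_helpers
```

Binaries found together with missing files, or files without a usable root entry, is the combination that matches the newuidmap failure in the log earlier in the thread.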

Alright, I nuked shadow. The container actually starts now.
I will look into potentially removing shadow completely.
But in case I can’t, what would be a good starting point for what to set inside /etc/subuid and /etc/subgid?

Yeah, you can’t really nuke shadow as that provides all the passwd/useradd/usermod/… type commands. In Debian/Ubuntu at least, the uidmap stuff is split into a separate package which makes that easier.

As for what you’d need in /etc/subuid and /etc/subgid, it should be pretty similar to what you did on the host. You have 655360 uid/gid allocated in this case. The first 65536 should be left to the system, which leaves 655360 - 65536 = 589824 to hand out, so your range would likely look like:

root:65536:589824
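The same arithmetic as a small sketch, for other allocation sizes. The function name subid_line is hypothetical, and reserving exactly the first 65536 IDs follows the suggestion above.

```shell
#!/bin/sh
# Hypothetical helper: builds the root entry for /etc/subuid and
# /etc/subgid from the total number of IDs allocated to the container.
subid_line() {
    size=$1
    reserved=65536   # first 65536 IDs stay with the container's own system
    if [ "$size" -le "$reserved" ]; then
        echo "an allocation of $size leaves nothing to delegate" >&2
        return 1
    fi
    echo "root:$reserved:$((size - reserved))"
}

subid_line 655360
# prints: root:65536:589824

# The size could also be read from the container's own map:
#   subid_line "$(awk 'NR == 1 { print $3 }' /proc/self/uid_map)"
```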

I was wondering whether I should increase the range on the outside host from root:65536:9934465 to something much larger, perhaps root:1000000:1000000000.
I plan to migrate to Incus soon, so I may as well get started with a good range.
So far, all the UIDs appear to differ between multiple nested containers. It looks like it’s working!

Good to hear that it worked!

As for the host config, our default is:

root:1000000:1000000000

That’s what Incus would restrict itself to on systems that don’t have a uidmap/gidmap configuration in place.
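As a rough comparison of the two host ranges mentioned in this thread, the sketch below counts how many isolated ID blocks fit into each. It assumes blocks are packed back to back with nothing else using the range, which is a simplification of how the daemon actually allocates. The function name isolated_slots is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of whole per-container ID blocks that fit
# into a host range. The second argument defaults to 65536 IDs per block.
isolated_slots() {
    range=$1
    per_container=${2:-65536}
    echo $(( range / per_container ))
}

isolated_slots 9934465             # original host range: prints 151
isolated_slots 1000000000          # default range: prints 15258
isolated_slots 9934465 655360      # nesting hosts of 655360 IDs: prints 15
isolated_slots 1000000000 655360   # same, with the default range: prints 1525
```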

Ah yes, this is what I had thought as well, which is why I ended up just nuking the files in /etc. Thank you so much for the help; I have been scratching my head about this for the longest time. You have made me less confused about what is going on here, and I feel confident that I can move forward now. Thanks so much for the work you’ve done on LXD and now Incus!
I could not even use any Linux distribution and stay sane if not for you. Thank you!

Thank you for this, much appreciated. I was running into “Invalid config: No uid/gid allocation configured. In this mode, only privileged containers are supported” errors when trying to run nested Incus containers, and wiping these files and restarting Incus seems to have helped.