Convenient fs sharing with a container, but without ID hole punching

istankovic · April 3, 2024, 10:49pm

I’m wondering, is it possible to have both of the following two things at the same time?

Expose a dir owned by user foo on the host to a container, such that an arbitrary container user bar can read and write it and the UID/GID map back to foo on the host, so both foo on the host and bar within the container can seamlessly work on the same dir.
Be secure in the sense that, if a container escape happens (whether from the container’s root or from bar), the escaped process will not be foo, but rather something like nobody.

stgraber · April 4, 2024, 1:27am

There are two ways to do this, one that’s possible today and one that may be possible to implement but would require recent kernels and extra code in Incus.

For the way that exists today, you’d add a disk device to share the folder you want from the host to the container, then make sure the shift property is set to true. If your user in the container doesn’t happen to have the same uid as the user on the host, then you’ll need to use filesystem ACLs on the host to grant access to that user’s uid on the shared folder.
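As a rough sketch of those two steps, the snippet below only builds the command lines and prints them; it doesn't run anything. The instance name c1, the device name shared, and both paths are made-up placeholders. The ACL entry assumes that, on a shifted mount, a container uid shows up on the host filesystem as that same number, which is my reading of the above rather than something to rely on blindly.

```python
import shlex


def share_commands(instance, host_dir, container_dir, container_uid, host_owner_uid):
    """Build (but do not run) the commands for a shifted disk device.

    All names passed in are placeholders chosen by the caller.
    """
    cmds = [[
        "incus", "config", "device", "add", instance, "shared", "disk",
        f"source={host_dir}", f"path={container_dir}", "shift=true",
    ]]
    # Assumption: with shift=true the container uid lands on disk unchanged,
    # so an ACL is only needed when it differs from the host owner's uid.
    if container_uid != host_owner_uid:
        cmds.append(["setfacl", "-m", f"u:{container_uid}:rwx", host_dir])
    return cmds


if __name__ == "__main__":
    # foo is uid 1000 on the host, bar is uid 1001 in the container (example values).
    for cmd in share_commands("c1", "/home/foo/shared", "/mnt/shared", 1001, 1000):
        print(shlex.join(cmd))
```

If bar and foo share the same uid, only the device command is produced.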

For the way that doesn’t exist yet, it should be possible to use the new VFS idmap feature with a 3rd party map, basically using a uid/gid map which translates the uid/gid needed for that shared mount so that the host user ends up matching the container user inside the container.
That’s very similar to the raw.idmap trick except that it would only apply to storage and not to process ownership.

istankovic · April 4, 2024, 10:12am

Just to see if I understand. If I use shift=true, and id(foo) == id(bar), wouldn’t that mean that in case of a container escape, the process would essentially be foo? I admit I don’t fully understand the mechanism behind shift (the docs do not really go into details of what happens).

From what you said above regarding the ACLs, it seems to me that using ACLs alone should work, without the need for shift=true, right? Something like

make sure id(foo) != id(bar)
configure ACL on the host folder so that bar can access it within the container
add a disk device

Would that work?

As for the VFS idmap approach: that sounds like pretty much what I’m after, and definitely looks more convenient than messing with ACLs. Do you maybe have a guess on when it might be available?

stgraber · April 4, 2024, 2:47pm

No, only the filesystem is shifted; the process in the container will still be running as a different user.

Basically say you have user foo on the host with uid 1000, then user bar in the container with uid 1000 in the container. That user is something like uid 1001000 on the host.

The shift property makes it so that writes by global uid 1001000 are converted to uid 1000 on the filesystem.
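To make the arithmetic concrete, here is a toy model of that mapping. The base of 1000000 is an assumption inferred from the "something like 1001000" figure; the real range depends on how the host is set up, and the function names are made up for the illustration.

```python
# Assumed start of the container's uid range on the host (not taken from any config).
HOST_BASE = 1_000_000


def host_uid(container_uid, base=HOST_BASE):
    """Global (host) uid that a container process actually runs as."""
    return base + container_uid


def on_disk_uid(global_uid, shift, base=HOST_BASE):
    """uid recorded on the shared filesystem for a write by global_uid."""
    return global_uid - base if shift else global_uid


if __name__ == "__main__":
    bar = host_uid(1000)
    print(bar)                            # 1001000: what an escaped process would be
    print(on_disk_uid(bar, shift=True))   # 1000: files look like they belong to foo
    print(on_disk_uid(bar, shift=False))  # 1001000: files keep the global uid
```

The process identity never changes in this model; only the uid written to the filesystem does.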

stgraber · April 4, 2024, 2:48pm

Yes, that will work. The main downside is that UIDs that don’t map into the container will be shown as nobody/nogroup, which in the case of ACLs can be rather confusing, as you may have several nobody/nogroup entries in the same ACL (for different uid/gid on the host).
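A small illustration of why that gets confusing, again as a toy model: the base and range size are assumed values, and 65534 is the kernel's default overflow uid, which normally resolves to nobody.

```python
OVERFLOW_UID = 65534  # kernel default overflow uid, usually named "nobody"


def container_view(on_disk_uid, base=1_000_000, size=65_536):
    """uid the container sees for an on-disk uid on an un-shifted mount.

    base and size describe an assumed container uid range on the host.
    """
    if base <= on_disk_uid < base + size:
        return on_disk_uid - base
    return OVERFLOW_UID


if __name__ == "__main__":
    # Host ACL entries for foo (1000), another host user (1001), and bar (1001000).
    acl_uids = [1000, 1001, 1001000]
    print([container_view(u) for u in acl_uids])  # [65534, 65534, 1000]
```

Two distinct host users collapse into the same nobody entry from inside the container, while bar's entry shows up as uid 1000.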

istankovic · April 4, 2024, 5:03pm

Ah, that makes it much clearer, thanks.

istankovic · April 4, 2024, 5:07pm

Right, though in my case I’m dealing with only one uid/gid so it should be fine. Thanks for clarifying.