Is it the case that you cannot use 'shift: "true"' for disk devices where the source is a mergerfs mount? Is there a workaround?

Hi there, I hope you’re all doing well this weekend :slight_smile:


Problem

For the first question (“Is it the case…”), I searched and found similar threads for NFS and VFAT (before a certain patch, at least), but I couldn’t find a discussion relating to mergerfs.

Is it simply the case that mergerfs doesn’t support the VFS idmap features required for using shift: "true"?

If this is the case, is there a sensible way to work around it?


Data and Error

My mergerfs mount, working on the host

/mnt/mergerfs_01:/mnt/mergerfs_02  /mnt/mergerfs_combined  mergerfs  rw,noatime,uid=arch,gid=arch,cache.files=off,dropcacheonclose=false,category.create=mfs,func.getattr=newest  0  0

The attempted config addition (under devices). It works without the shift but I get nobody:nobody entries.

mergerfscom:
  path: /mnt/mergerfs_combined
  shift: "true"
  source: /mnt/mergerfs_combined
  type: disk

The error

Config parsing error: Failed to start device "mergerfscom": Required idmapping abilities not available

Kernel (running Arch Linux)

uname -a
Linux homeMachine 6.17.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 14 Nov 2025 06:54:20 +0000 x86_64 GNU/Linux

Incus Version: 6.18.0-2.1

Support for VFS idmapped mounts must be implemented in each filesystem individually. It sounds like mergerfs doesn’t support it yet.

As mergerfs is FUSE based, it shouldn’t be terribly hard for its developer to add support.

Here is an example of the needed changes for a similar filesystem: main: support idmapped mounts · mihalicyn/fuse-overlayfs@89a1af3 · GitHub

Thanks, Stéphane, I’ll have a dive into the code, it does look quite doable as you said.

Have a great weekend!

mergerfs does allow for the setting of FUSE’s IDMAP flag but the problem is AFAIK the kernel FUSE feature is not fully implemented. The current support was added for virtio-fs, not general use. When enabled, the kernel no longer includes the requester’s uid and gid, which are required to properly shift credentials as needed. The fuse-overlayfs example you linked is not valid. The interface it uses from FUSE doesn’t exist. It was never added to the kernel or libfuse.

Unless the kernel is providing the uid:gid as part of the message, mergerfs, and any filesystem, would have to implement their own ways to know what the requester’s identity is.

It’s on the todo list to ask the community when this feature is expected to be fleshed out or at least docs made to explain the intended usage.

@amikhalitsyn

Hacking around a bit, I’ve got a seemingly functionally correct setup that may be a bit mad.

I passed the constituent paths (/mnt/mergerfs_01 and /mnt/mergerfs_02) to the container with shift: "true", then used the same mount options as on the host in the container’s fstab for the /mnt/mergerfs_combined entry, and it seemingly works :man_shrugging:

On the host’s and the container’s /mnt/mergerfs_combined mounts, I tried some hard-linking, permissions, and user/group ownership tests from both host and container. It looks to be working without error, and results of ls -lahR match up.

I assume this is either classed as undefined behaviour, outright incorrect configuration, bad for performance, or a mixture of these…

That should actually be perfectly fine.

FUSE filesystems can be mounted directly inside of unprivileged containers without any performance penalty or issue. It’s kernel work we’ve done quite a long time ago.

The VFS idmap of the underlays (shift=true) is then a native Linux kernel feature, available so long as the filesystem supports it (which is clearly the case here). I don’t think this setup involves any more conversion steps than running the FUSE filesystem on the host would, and even if it did, we’re literally talking about simple additions/subtractions to convert integers, so not exactly CPU intensive stuff :slight_smile:
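
To make the "simple addition/subtraction" point concrete, here is a minimal Python sketch of how a linear ID mapping translates a container uid to a host uid and back. The `IdMap` helper is hypothetical, and the base of 1000000 and range of 65536 are assumed values chosen for illustration, not taken from the configuration in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdMap:
    """One linear ID mapping range (hypothetical helper, for illustration only)."""
    container_base: int  # first ID as seen inside the container
    host_base: int       # first ID as seen on the host
    size: int            # number of consecutive IDs covered

    def to_host(self, container_id: int) -> int:
        # Shift by the offset within the range; reject IDs the map does not cover.
        offset = container_id - self.container_base
        if not 0 <= offset < self.size:
            raise ValueError(f"{container_id} is outside the mapped range")
        return self.host_base + offset

    def to_container(self, host_id: int) -> int:
        # The reverse direction is the same arithmetic with the bases swapped.
        offset = host_id - self.host_base
        if not 0 <= offset < self.size:
            raise ValueError(f"{host_id} is outside the mapped range")
        return self.container_base + offset


# Assumed example mapping: container IDs 0..65535 -> host IDs 1000000..1065535.
idmap = IdMap(container_base=0, host_base=1_000_000, size=65_536)
print(idmap.to_host(1000))            # 1001000
print(idmap.to_container(1_001_000))  # 1000
```

Each translation is one subtraction, one range check, and one addition, which is why the conversion cost is negligible next to the actual I/O.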

Oh, agreed on those fronts, I more meant (taking containers out of the equation) my concoction of two distinct mergerfs mounts, working on identical underlying paths and options, being treated as one canonical path from a user perspective. My setup seems a bit hacky or like an anti-pattern, but maybe I’m mistaken.

I’ll jump out of the thread now, but just want to quickly say thank you so much to you both for your inputs here, and also the fantastic respective projects you’re part of.

All the best,
IdyllicHappiness

Use Access Control Lists (ACLs) and remove shift=true:

┬─[abdodz@Archlinux:~]─[04:58:17 PM]
╰─>$ getfacl shared
# file: shared
# owner: abdodz
# group: incus-shared
user::rwx
user:abdodz:rwx
user:1001000:rwx
group::r-x
group:incus-shared:rw-
mask::rwx
other::r-x
default:user::rwx
default:user:abdodz:rwx
default:user:1001000:rwx
default:group::r-x
default:group:incus-shared:rw-
default:mask::rwx
default:other::r-x

┬─[abdodz@Archlinux:~]─[04:58:27 PM]
╰─>$ cat /etc/group |grep incus-shared
incus-shared:x:1001000:abdodz

Hello everyone,

From what I see, idmap support is enabled in mergerfs ( https://github.com/trapexit/mergerfs/commit/ca59ae53a5845db875040000be7db6f8dfd8ffb6 ) and available starting from 2.41.0.

@trapexit Hi,

mergerfs does allow for the setting of FUSE’s IDMAP flag but the problem is AFAIK the kernel FUSE feature is not fully implemented. The current support was added for virtio-fs, not general use.

Actually, it is fully implemented. The problem is that VFS idmap combined with network-like filesystems (FUSE is in that category too) can be very misleading. The first remote filesystem to get support for idmapped mounts was cephfs, and we had a ve-e-ery long discussion with the cephfs maintainers to agree on the semantics.

Unless the kernel is providing the uid:gid as part of the message, mergerfs, and any filesystem, would have to implement their own ways to know what the requester’s identity is.

This was my first approach [link], but Miklos (the FUSE maintainer) had a different vision for this, so I had to rework it and get rid of the FUSE_OWNER_UID_GID_EXT extension. You can see the full changelog here: [link]

It’s on the todo list to ask the community when this feature is expected to be fleshed out or at least docs made to explain the intended usage.

The expected behavior for a FUSE filesystem that wants to support idmapped mounts is to use the default_permissions mount option (the kernel checks this and only allows idmapped mounts if it is set) and to never rely on the uid/gid sent from the kernel, except in the cases where the filesystem creates a new inode (mknod, mkdir, symlink, O_CREAT…). We explicitly set uid/gid to -1 to keep FUSE filesystem developers from misunderstanding this and introducing a nasty bug here ( https://github.com/torvalds/linux/blob/ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d/fs/fuse/dev.c#L240 )

Let me just look into what mergerfs does and why it needs uid/gid anywhere, except in the inode creation case.
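
As a rough illustration of that rule, here is a minimal Python model (not real FUSE or mergerfs code). The operation names and the `caller_ids` helper are hypothetical. The only facts taken from the post are that the uid/gid should be trusted solely for inode-creating requests and that the kernel otherwise sends -1, which shows up as 0xFFFFFFFF in an unsigned 32-bit field.

```python
from typing import Optional, Tuple

# (uint32_t)-1: how a uid/gid of -1 appears in a 32-bit unsigned header field.
INVALID_ID = 0xFFFFFFFF

# Hypothetical operation names for requests that create a new inode.
CREATING_OPS = frozenset({"mknod", "mkdir", "symlink", "create"})


def caller_ids(op: str, uid: int, gid: int) -> Optional[Tuple[int, int]]:
    """Return (uid, gid) only when the request is allowed to rely on them.

    Non-creating operations get None: access checks are left to the kernel
    (default_permissions), so the header IDs are deliberately ignored.
    """
    if op not in CREATING_OPS:
        return None
    if uid == INVALID_ID or gid == INVALID_ID:
        raise ValueError(f"{op}: creation request without a valid uid/gid")
    return uid, gid


print(caller_ids("mkdir", 1000, 1000))                # (1000, 1000)
print(caller_ids("getattr", INVALID_ID, INVALID_ID))  # None
```

A filesystem that switches credentials on every request would need to restructure its handlers around a split like this one.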

The interface it uses from FUSE doesn’t exist.

Yeah, this was an example from one of my Linux kernel patchset iterations, but in the end we didn’t go this way. A correct example to look at is: virtiofsd support idmapped mounts (!245) · Merge requests · virtio-fs / virtiofsd · GitLab

I don’t see how I can (nicely) support it then. I need to know the caller app’s uid:gid:supgroups to properly manage access and functionality. Permissions are only part of the story. As a union filesystem, it needs to be able to change credentials to the caller’s so the underlying filesystem can handle things as it should. It would be totally inappropriate from a security as well as a functional perspective to do activities as root and then change things afterwards. And since some filesystems aren’t even POSIX compatible, it would break things further to enforce perms as such (though I do use default_permissions by default). Even then it would require knowledge of the caller’s credentials. Having to check /proc on every single call to see the caller’s credentials isn’t going to fly.

Yeah, I went through the mergerfs code. Cool stuff. And yes, I see that every FUSE request handler has a const ugid::Set ugid(ctx_); thing, which implicitly changes the current uid/gid to those from the FUSE request context. When idmapped mounts are enabled, you mostly get -1 in there, which obviously is a problem.

As I said earlier, idmapped mounts for remote-like filesystems are not easy at all. Philosophically speaking, idmapped mounts are a purely in-kernel feature and the filesystem shouldn’t even care about them. But we’ve seen with cephfs, for example, that some filesystems are not fully POSIX and can do extra permission checks based on uid/gid (cephfs supports uid/gid and path (!) based access filtering, which completely disagrees with the Linux VFS permissions philosophy). So with idmapped mounts for FUSE, our first priority was to make the implementation clean and prevent it from being improperly used by filesystem developers.

Sure, I get it. It’s just not strictly about remote-like filesystems. There are lots of non-POSIX filesystems around that do funny stuff. Over the years I probably should have documented them, but I recall, for instance, one allowing chown/chmod to succeed but then not changing anything. Or maybe it was the reverse, where it fails but access is allowed anyway. Regardless, I generally try to accommodate those since a decent number of users leverage non-POSIX filesystems with mergerfs.

Plausibly I can rework the code to support a mode where it only does that on inode creation operations if idmap is enabled. I’ll need to audit the code and remember why credentials were needed in each location.

Yep, I understand. I need to get familiar with mergerfs, but my first impression is that we can make idmapped mounts an explicit opt-in feature in mergerfs (as I see is already done). Then, if it is enabled, we can slightly change the semantics of mergerfs: for inode-creating operations it continues to do the same const ugid::Set ugid(ctx_); dance, while for any other (basically, read-like) operations it acts as root or a “default user”. Obviously, this will make some workloads fail because, as you mentioned, some filesystems are not even POSIX compatible and might forbid something for root while allowing it for a non-root user, or do some other nasty things. But if we write proper docs explaining this and the feature is opt-in, then I think we are on the safe side. WDYT?

yeah, we are thinking the same way. It is a good sign :wink:

But yeah, we should be careful. We definitely don’t want to break any existing mergerfs-based setups just because of idmapped mounts. So it is good that this is an opt-in feature.

To be fair I can’t think of another way to handle it besides working around it by trying to replace the missing uid,gid from the fuse_in_header with queries to /proc :smiley:
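
For reference, the /proc lookup mentioned here would look roughly like the following Python sketch. It parses the `Uid:`, `Gid:` and `Groups:` lines of /proc/<pid>/status; the function name `proc_credentials` is hypothetical, and this is not how mergerfs works today. It only shows what such a query involves on Linux.

```python
import os
from typing import Dict, List


def proc_credentials(pid: int) -> Dict[str, List[int]]:
    """Parse the Uid:, Gid: and Groups: lines of /proc/<pid>/status (Linux only)."""
    wanted = {"Uid", "Gid", "Groups"}
    creds: Dict[str, List[int]] = {}
    # errors="replace" guards against non-UTF-8 bytes in the process name line.
    with open(f"/proc/{pid}/status", encoding="utf-8", errors="replace") as status:
        for line in status:
            key, _, value = line.partition(":")
            if key in wanted:
                creds[key] = [int(field) for field in value.split()]
    return creds


creds = proc_credentials(os.getpid())
# Uid and Gid each hold four values: real, effective, saved set, filesystem.
print(creds["Uid"][1], creds["Gid"][1], creds.get("Groups", []))
```

Every lookup costs a file open, a read, and a parse, which is the per-request overhead objected to earlier in the thread.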

@IdyllicHappiness I’ll see what I can do. No promises on timelines though as I’m 1) on US holiday this week and 2) battling a mergerfs + nix-build sandbox issue at the moment.

Brilliant, thanks, all!

@trapexit No rush or expectations from my end, I hope you enjoy your holiday :slightly_smiling_face: As a slight aside, the setup I mentioned above (2 mergerfs mounts, same mount options and underlying branches), is that asking for trouble?

Thanks, unfortunately my 3am flight was canceled at 12:30am. Hopefully no more cancellations. :grimacing: Edit: after sending this it got delayed… ::sigh::

It should generally be fine. The issue with having multiple pools with the same branches is really about out-of-band changes. See: Usage and Functionality - mergerfs

Though creating a pool of pools is not ideal as you’re adding a lot of overhead. What’s the use case?

Ach, sorry about the flight delay! I’ve memories of getting stuck in Stockholm airport for 14 hours, so I know the pain…

The use case is creating a single media directory tree (the mergerfs branches being subvolumes on different BTRFS disk drives). My initial plan was to create it on the host and pass it down to the 2 Incus containers that could make use of it*. Then (from a user perspective) any IO would be done through the mergerfs mounts, never the underlying branches directly.

My possible workaround (in light of the wee road bumps discussed in this thread) was to pass down the constituent branches individually (which can be ID mapped), and then create an identical mergerfs mount inside two containers.

I suppose the talk of containers (at least with regards to feasibility of my workaround) might be a bit of a red herring, though.

I’m guessing any problems I might have with this workaround would be the same as if I had 3 separate but identically configured mounts on the host (e.g. /mnt/mergerfs_comb_01, /mnt/mergerfs_comb_02, and /mnt/mergerfs_comb_03, with the same options and all made up of /mnt/mergerfs_branch_01:/mnt/mergerfs_branch_02), as - if I understand you and that bit of documentation correctly - while they’re all mergerfs, they are effectively out-of-band to each other, so turning off caching options may result in better correctness.

All the best with the flight, got my fingers crossed :slight_smile:


* Although now I’m thinking that Incus may effectively be doing what I’m doing manually - a remount with the same sources and options, thus even if ID mapping currently worked I may want to turn off caching options in any case.
