Incus exec container bash => Permission denied

Hello,

I’m experimenting with TrueNAS SCALE 24.04 (Debian Bookworm) to see whether I can install and run Incus on it using its developer mode.
Installation works as expected by following the Zabbly instructions. I am able to launch a new container, and it starts as usual:

root       77644       1  0 02:01 ?        00:00:00 [lxc monitor] /var/lib/incus/containers gentoo
1000000    77657   77644  0 02:01 pts/0    00:00:00 init [3]
1000000    78357   77657  0 02:01 ?        00:00:00 dhcpcd: eth0 [ip4] [ip6]
1000000    78634   77657  0 02:01 ?        00:00:00 /sbin/agetty 38400 console linux

incus ls:

+--------+---------+-----------------------+------+-----------+-----------+
|  NAME  |  STATE  |         IPV4          | IPV6 |   TYPE    | SNAPSHOTS |
+--------+---------+-----------------------+------+-----------+-----------+
| gentoo | RUNNING | 10.108.180.218 (eth0) |      | CONTAINER | 0         |
+--------+---------+-----------------------+------+-----------+-----------+

incus exec gentoo bash:

bash: /root/.bashrc: Permission denied
gentto ~ # 

The issue is that as soon as I try to get a shell in the new container, I get the Permission denied error above. Gentoo is just one example; other distros report exactly the same error. Sometimes they don’t even get an IP address, or other services fail to start. Setting security.privileged to true makes the issue go away, but that is a bad workaround :wink:

Given that everything works in privileged mode, it seems that something is missing or needs to be configured to allow unprivileged mode. I have the feeling it has something to do with ID mapping, but I’m not quite sure how to debug this further. I hope someone can give me a hint to solve it.

Appreciate any feedback

That’s just a message, an unimportant one. You still get the shell in the container.

Your first choice would be

incus exec gentoo /bin/sh

Thanks,

yes, I get a shell and can perform basic stuff, but as soon as I try to start any additional services they also report “permission denied”. It actually starts during container boot already. Take a look at the first lines of console.log from a working instance:

INIT: version 3.09 booting

   OpenRC 0.54 is starting up Gentoo Linux (x86_64) [LXC]

 * Mounting /run ... [ ok ]
 * Caching service dependencies ... [ ok ]
mount: /sys/fs/cgroup: none already mounted on /dev.
       dmesg(1) may have more information after failed mount system call.
/etc/init.d/cgroups: line 92: echo: write error: Device or resource busy
 [ !! ]
 * ERROR: sysctl failed to start
 * Creating user login records ... [ ok ]
 * Wiping /tmp directory ... [ ok ]
 * Bringing up network interface lo ...RTNETLINK answers: File exists
 [ ok ]
 * Updating /etc/mtab ... * Creating mtab symbolic link
 [ ok ]
 * Create Volatile Files and Directories ... [ ok ]

That one starts perfectly fine, as it should. Now look at the output of the failing instance:

INIT: version 3.09 booting

   OpenRC 0.54 is starting up Gentoo Linux (x86_64) [LXC]

 * Mounting /run ... [ ok ]
 * Caching service dependencies ... [ ok ]
mount: /sys/fs/cgroup: none already mounted on /dev.
       dmesg(1) may have more information after failed mount system call.
/etc/init.d/cgroups: line 92: echo: write error: Device or resource busy
 [ !! ]
 * ERROR: sysctl failed to start
mkdir: cannot create directory ‘/var/lib/misc’: Permission denied
 * failed to create needed directory /var/lib/misc
 [ ok ]
 * Updating /etc/mtab ... * /etc is not writable; unable to create /etc/mtab
 [ !! ]
 * Create Volatile Files and Directories ... [ ok ]
INIT: Entering runlevel: 3
 [ !! ]
 * ERROR: sysctl failed to start
mkdir: cannot create directory ‘/var/lib/misc’: Permission denied
 * failed to create needed directory /var/lib/misc

I reduced the output to show the differences. The second log contains many more “Permission denied” lines. So the issue already begins during container start, and each OS fails at different places.
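For anyone who wants to do this comparison mechanically, here is a minimal Python sketch that lists the lines appearing only in the failing log. The two excerpts are copied from the logs above; the function name `only_in_failing` is just an illustrative choice, and in practice you would read the two console.log files instead of using inline strings.

```python
# Excerpts taken from the two console logs quoted in this thread.
working = """\
 * ERROR: sysctl failed to start
 * Creating user login records ... [ ok ]
 * Wiping /tmp directory ... [ ok ]
 * Updating /etc/mtab ... * Creating mtab symbolic link
""".splitlines()

failing = """\
 * ERROR: sysctl failed to start
mkdir: cannot create directory ‘/var/lib/misc’: Permission denied
 * failed to create needed directory /var/lib/misc
 * Updating /etc/mtab ... * /etc is not writable; unable to create /etc/mtab
""".splitlines()


def only_in_failing(good, bad):
    """Return the lines of `bad` that never occur in `good`, in original order."""
    seen = set(good)
    return [line for line in bad if line not in seen]


new_lines = only_in_failing(working, failing)
for line in new_lines:
    print(line)
```

This is a plain set comparison rather than a real diff, so it ignores ordering and repeated lines, which is good enough for spotting the extra error messages.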

As mentioned before, something needs to be reconfigured on the OS. I checked both /etc/subuid and /etc/subgid; they contain

root:1000000:1000000000

However, could it be that the host system doesn’t read them or ignores them? But then why can the container be started with its processes running as UID 1000000:

root       24045       1  0 22:36 ?        00:00:00 [lxc monitor] /var/lib/incus/containers gentoo
1000000    24084   24045  0 22:36 pts/0    00:00:00 init [3]
1000000    24778   24084  0 22:36 ?        00:00:00 dhcpcd: eth0 [ip4] [ip6]
1000000    25170   24084  0 22:36 ?        00:00:00 /sbin/agetty 38400 console linux

Any other settings to check? Maybe cgroups or namespaces?
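As a sanity check on the ID-mapping theory, the following Python sketch parses a subuid/subgid entry and tests whether a given host UID lies inside the delegated range. It assumes the usual `name:start:count` format, and the helper names `parse_subid` and `covers` are made up for illustration.

```python
def parse_subid(text):
    """Parse /etc/subuid or /etc/subgid content into (name, start, count) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, start, count = line.split(":")
        entries.append((name, int(start), int(count)))
    return entries


def covers(entries, name, host_id):
    """True if host_id lies inside one of the ranges delegated to `name`."""
    return any(n == name and s <= host_id < s + c for n, s, c in entries)


# The entry quoted above and the UID shown in the process list.
entries = parse_subid("root:1000000:1000000000\n")
print(covers(entries, "root", 1000000))
```

With the values from this thread, UID 1000000 is inside root’s range, which matches the process list and suggests the delegation itself is in place.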

Here you would run lxc-checkconfig (install LXC to get it), a script that shows whether something big is missing that prevents Incus (or LXC) from functioning properly.
Then post the output.

As requested here is the output of lxc-checkconfig:

root@truenas[~]# lxc-checkconfig 
LXC version 5.0.2
Kernel configuration not found at /proc/config.gz; searching...
Kernel configuration found at /boot/config-6.6.20-production+truenas

--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled

--- Control groups ---
Cgroups: enabled
Cgroup namespace: enabled
Cgroup v1 mount points: 
Cgroup v2 mount points: 
 - /sys/fs/cgroup
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled, loaded
Macvlan: enabled, not loaded
Vlan: enabled, loaded
Bridges: enabled, loaded
Advanced netfilter: enabled, loaded
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, not loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: enabled
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: enabled
CONFIG_NETLINK_DIAG: enabled
File capabilities: enabled

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /bin/lxc-checkconfig

No errors from what I can see. Meanwhile I installed the TrueNAS RC1 version, and there everything works perfectly as designed. In other words, something has changed between RC1 and the release version.

Both are basic installs using “DIR” as the storage driver, which should rule out any issues in that direction.
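Since the lxc-checkconfig report is long, a small Python sketch like the one below can flag any entry whose status is not “enabled”. It only assumes the `name: status` layout visible in the output above; the function name `not_enabled` is illustrative, and the sample is a short excerpt of that output.

```python
def not_enabled(report):
    """Return (name, status) pairs whose status does not start with 'enabled'."""
    problems = []
    for line in report.splitlines():
        # Skip section headers, mount point bullets and the trailing hints.
        if ":" not in line or line.startswith(("Note", "usage")):
            continue
        name, status = (part.strip() for part in line.split(":", 1))
        if not status:
            continue  # e.g. "Cgroup v1 mount points:" with nothing after it
        if not status.startswith("enabled"):
            problems.append((name, status))
    return problems


sample = """\
--- Namespaces ---
User namespace: enabled
Cgroup v1 mount points: 
Cgroup v2 mount points: 
 - /sys/fs/cgroup
Macvlan: enabled, not loaded
Note : Before booting a new kernel, you can check its configuration
"""

print(not_enabled(sample))
```

For the excerpt this yields an empty list, consistent with reading the report by eye.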

After running a few more tests and comparing the working and non-working systems, it finally turns out that the issue is related to the ZFS kernel module. The working Bookworm system is a standard netinstall using default settings (single boot partition), whereas TrueNAS has multiple ZFS partitions. So they are not the same, but that got me one step further and led me to look deeper into ZFS compatibility. I came across the following forum posts:

ZFS 2.2.0 Released: ID mapping of unprivileged containers during mount
Migrating LXD 5.20 → Incus 0.5 on Ubuntu 22.04 LTS (ZFS 2.1.5) and shiftfs support?

These pointed me in the right direction: updating the ZFS module on the non-working system. TrueNAS was released with version 2.2.3-1, which should contain full ID-mapping support but obviously doesn’t work with Incus (probably because of TrueNAS modifications). So I followed the instructions from @stgraber at ZFS builds, which installed 2.2.4, rebooted, and the permission denied issue was gone.
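To see which ZFS module version is actually loaded, a rough Python sketch along these lines may help. It assumes the module exposes its version at /sys/module/zfs/version and that the string looks like `2.2.3-1`; the 2.2.0 threshold comes from the release note linked above, and the helper names are illustrative.

```python
from pathlib import Path


def parse_version(text):
    """Turn a string such as '2.2.3-1' into a comparable tuple (2, 2, 3)."""
    core = text.strip().split("-", 1)[0]  # drop the package revision suffix
    return tuple(int(part) for part in core.split("."))


def has_idmap_release(version_text, minimum=(2, 2, 0)):
    """True if the version is at least the release that introduced ID-mapped mounts."""
    return parse_version(version_text) >= minimum


# Assumed location of the loaded module's version; nothing is printed if ZFS is absent.
version_file = Path("/sys/module/zfs/version")
if version_file.exists():
    loaded = version_file.read_text().strip()
    print(loaded, has_idmap_release(loaded))
```

Note that a passing version check is not sufficient on its own: the TrueNAS 2.2.3-1 module passes it and still showed the problem.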

Success, now I have a working TrueNAS with Incus LTS which is pretty cool!

That leaves one obvious question: what is the difference between @stgraber’s ZFS sources and the TrueNAS tree? ID mapping is still a new feature in ZFS, so will it take some more time to stabilise? Maybe @stgraber can give some useful input on which area to concentrate on to find the needle in the haystack.

Not sure what they may have changed. My kernel builds are made from clean upstream code.

Thanks @stgraber, that is all I need to know.

TrueNAS, or rather iXsystems, made a lot of improvements to ZFS to allow better memory allocation/usage, and probably more for their own needs. One of these changes broke something in ID mapping.

I’ll probably stay on upstream sources for now, as everything still seems to work.

Just to round this up.

There are indeed a lot of changes in the TrueNAS ZFS sources. I was able to track it down to a security function they added which isn’t fully namespace-aware, or rather doesn’t work as expected. Changing one line of code to use the correct variable passed into the function call made it work again. I will watch their upcoming release to see whether this gets solved.

Thanks again for the input.
