Failing to start unprivileged container (QNAP)

Hi,

I’m having difficulty running an unprivileged container with LXD. When I launch an Alpine container (teste) with the config below, it fails asking for permissions on /var/lib/lxd, permissions that are already there. Here is the step by step:

$ lxc config show teste

    architecture: x86_64
    config:
      boot.autostart: "true"
      image.architecture: amd64
      image.description: Alpine edge amd64 (20210906_13:00)
      image.os: Alpine
      image.release: edge
      image.serial: "20210906_13:00"
      image.type: squashfs
      image.variant: default
      security.idmap.base: "1000000"
      security.idmap.size: "65536"
      security.privileged: "false"
      volatile.base_image: a5cd77b17561dc20d7cefd3b482301dc43923b10fe6887a1a7593b77b7ac5e46
      volatile.idmap.base: "0"
      volatile.last_state.power: STOPPED
      volatile.uuid: 89f366b9-bc8d-4ac2-9a55-dfeb132f7184
    devices: {}
    ephemeral: false
    profiles:
    - default
    stateful: false
    description: ""

$ lxc start teste

Error: Failed to run: /path/to/lxd forkstart teste /var/lib/lxd/containers /var/log/lxd/teste/lxc.conf:
Try `lxc info --show-log teste` for more info

$ lxc info --show-log teste

Name: teste
Location: none
Remote: unix://
Architecture: x86_64
Created: 2021/09/23 15:55 UTC
Status: Stopped
Type: container
Profiles: default

Log:

lxc teste 20210924111701.964 WARN     cgfsng - cgroups/cgfsng.c:cg_hybrid_get_controllers:657 - Found hierarchy not under /sys/fs/cgroup: "/dev/cgroups_antivirus rw,relatime shared:146 - cgroup memory rw,memory
"
lxc teste 20210924111701.974 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.teste"
lxc teste 20210924111701.993 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.teste"
lxc teste 20210924111702.145 ERROR    start - start.c:print_top_failing_dir:98 - Permission denied - Could not access /var/lib/lxd. Please grant it x access, or add an ACL for the container root
lxc teste 20210924111702.162 ERROR    sync - sync.c:__sync_wait:36 - An error occurred in another process (expected sequence number 3)
lxc teste 20210924111702.164 WARN     network - network.c:lxc_delete_network_priv:3185 - Failed to rename interface with index 3 from "eth0" to its initial name "veth4d0ebce6"
lxc teste 20210924111702.166 ERROR    start - start.c:__lxc_start:1999 - Failed to spawn container "teste"
lxc teste 20210924111702.166 WARN     start - start.c:lxc_abort:1013 - No such process - Failed to send SIGKILL via pidfd 26 for process 2715474
lxc teste 20210924111702.198 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:860 - Received container state "ABORTING" instead of "RUNNING"
lxc 20210924111702.391 WARN     commands - commands.c:lxc_cmd_rsp_recv:126 - Connection reset by peer - Failed to receive response for command "get_state"

/var/lib/lxd already has x permission for everyone, so I’ve tried to follow the permissions down to the container rootfs using namei:

$ namei -mvo /var/lib/lxd/storage-pools/default/containers/teste/rootfs/

Permission User Group Dir
drwxrwxr-x root root var
drwxr-xr-x root root lib
drwx--x--x root root lxd
drwx--x--x root root storage-pools
drwx--x--x root root default
drwx--x--x root root containers
d--x------ root root teste
drwxr-xr-x 1000000 1000000 rootfs

The only directory without x or an ACL for others is /var/lib/lxd/storage-pools/default/containers/teste. I’ve tried changing it to see if that fixes the error, but every time I try to start the container the permissions change back to the original d--x------.
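For readers who want to repeat this check without reading namei output by eye, here is a minimal sketch that walks an absolute path one component at a time and flags every directory where "others" lack the execute (search) bit. The function name `check_traverse` is made up for illustration, and it assumes a `stat` that supports `-L` and `-c '%A'` (GNU coreutils or BusyBox). It only looks at the classic mode bits, so it will not reveal restrictions that come from ACLs.

```shell
#!/bin/sh
# Hypothetical helper: report, for each component of an absolute path,
# whether "others" have the execute (search) bit set.
check_traverse() {
  target=$1
  current=""
  old_ifs=$IFS
  set -f                # disable globbing while the path is split
  IFS=/
  set -- $target        # split the path on "/"
  IFS=$old_ifs
  set +f
  for part in "$@"; do
    [ -n "$part" ] || continue          # skip empty fields (leading or double "/")
    current="$current/$part"
    # -L follows symlinks, so the mode shown is that of the real directory
    mode=$(stat -L -c '%A' "$current") || return 1
    case $mode in
      *[xt]) echo "ok      $mode $current" ;;   # last char x or t: others may traverse
      *)     echo "blocked $mode $current" ;;
    esac
  done
}
```

Something like `check_traverse /var/lib/lxd/storage-pools/default/containers/teste/rootfs` would then print one line per directory. A "blocked" line on the container directory itself is expected, since that directory is normally d--x------ and owned by the container's root UID.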

So here is where I’m stuck and asking for help, any input is appreciated!

Also, just to be clear, my setup is a QNAP NAS, so it is a heavily customized Linux. I’m aware this could be the reason why only privileged containers are running.

Was this container originally a privileged one?

Can you show the output of lxc storage show default?

For example, this is what a container on a dir pool should look like:

namei -mvo /var/lib/lxd/storage-pools/default/containers/test_c1/rootfs/
f: /var/lib/lxd/storage-pools/default/containers/test_c1/rootfs/
drwxr-xr-x root    root /
drwxr-xr-x root    root var
drwxr-xr-x root    root lib
drwx--x--x root    root lxd
drwx--x--x root    root storage-pools
drwx--x--x root    root default
drwx--x--x root    root containers
d--x------ 1000000 root test_c1
drwxr-xr-x root    root rootfs

So the permission on d--x------ root root teste looks OK but the owner does not.

No, it was created as unprivileged from the start.

config:
  source: /var/lib/lxd/storage-pools/default
description: Default LXD storage pool
name: default
driver: dir
used_by:
- /1.0/instances/server
- /1.0/instances/teste
- /1.0/profiles/default
- /1.0/profiles/tcs
status: Created
locations:
- none

Can you change the ownership of teste to 1000000:1000000?

Yes, but when I call lxc start teste the permissions and owner reset. I’ve tried setfacl to force new permissions, with the same result.

Sorry, but I found the solution.

Everything was caused by the directory ACLs set by QNAP. I removed all ACL permissions and everything worked.

For anyone out there using QNAP, be aware that using Container Station on a shared folder with “advanced permissions” (a.k.a. ACLs) will break LXD/LXC and even Docker in some cases.

@tomp Thanks for the help and sorry for the wasted time!

Excellent. For other readers, could you show the command you ran to fix this?

In QTS (the QNAP OS), LXD is part of a package called Container Station. After installation you must choose a folder that will hold all containers and images. Let’s assume you choose the directory apps: the software will then create everything under /share/VOLUME_NAME/apps/container-station-data and make a LOT of symlinks.

The problem is that /share/VOLUME_NAME/apps, created by the QTS web UI, has ACL permissions set. To solve it, just run:

$ setfacl -Rb /share/VOLUME_NAME/apps

And then everything worked. I had tried adding IDs to the ACL before, but that didn’t work.
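Since `setfacl -Rb` discards every ACL entry under the folder, it may be worth seeing what is there first. The sketch below filters a `getfacl`-style dump read from stdin and prints only the named user and group entries, one per line, prefixed by the path they belong to. The function name `list_named_acl_entries` is hypothetical, and it assumes the standard `getfacl` text format (`# file:` header lines followed by `user:NAME:perms` entries). QNAP's own tooling may print something different.

```shell
#!/bin/sh
# Hypothetical filter: read getfacl-style text on stdin and print
# "path<TAB>entry" for every named user/group entry, i.e. the entries
# that exist in addition to the basic owner/group/other set.
list_named_acl_entries() {
  awk '
    /^# file: / { file = substr($0, 9); next }                 # remember current path
    /^(default:)?(user|group):[^:]+:/ { print file "\t" $0 }   # named entries only
  '
}
```

Assuming a standard `getfacl`, it could be fed with `getfacl -R /share/VOLUME_NAME/apps | list_named_acl_entries`. Saving the unfiltered `getfacl -R` output to a file beforehand also leaves a record of what was removed.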

Hmm, I think this is dangerous as you may break other permissions. I’ve tried to be a bit more selective myself and, error by error, worked out how to get it working:

#!/bin/sh

# Change these values to match your configuration!
CONTAINER_VOLUME="/share/CACHEDEV3_DATA"
CONTAINER_FOLDER="Container"

if [ -z "$1" ] || [ -z "$2" ]; then
  echo "Use as $0 [set|unset] <UID>"
  exit 1
fi

userid="$2"

# Note: "=" is the portable string comparison for [ ] under /bin/sh ("==" is a bashism).
if [ "$1" = "set" ]; then
  # setfacl -R -m user:$userid:rx /share/CACHEDEV3_DATA/.qpkg/container-station
  # Grant read/execute to the given UID on the Container Station package tree.
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME"/.qpkg/container-station
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME"/.qpkg/container-station/lib
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME"/.qpkg/container-station/var
  setfacl -R -m "user:$userid:rx" "$CONTAINER_VOLUME"/.qpkg/container-station/usr

  # Grant read/execute on the data folder and on the LXD directories.
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME/$CONTAINER_FOLDER"
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME/$CONTAINER_FOLDER"/container-station-data/lib
  setfacl -m "user:$userid:rx" "$CONTAINER_VOLUME/$CONTAINER_FOLDER"/container-station-data/lib/lxd
  setfacl -m "user:$userid:rx" /var/lib/lxd
  setfacl -m "user:$userid:rx" /var/lib/lxd/containers
  setfacl -m "user:$userid:rx" /var/lib/lxd/devices
  setfacl -m "user:$userid:rx" /var/lib/lxd/shmounts
  setfacl -m "user:$userid:rx" /var/lib/lxd/snapshots
  setfacl -m "user:$userid:rx" /var/lib/lxd/storage-pools
  setfacl -m "user:$userid:rx" /var/lib/lxd/storage-pools/default/containers
elif [ "$1" = "unset" ]; then
  # Remove the UID's ACL entries again.
  setfacl -R -x "user:$userid" "$CONTAINER_VOLUME"/.qpkg/container-station
  setfacl -R -x "user:$userid" "$CONTAINER_VOLUME/$CONTAINER_FOLDER"
  setfacl -R -x "user:$userid" /var/lib/lxd/
  setfacl -x "user:$userid" /var/lib/lxd
else
  echo "Invalid operation"
  exit 1
fi

With this script only the needed folders get the access bit for the non-root user. By default you can use it like this:

sudo ./change-permissions-for-unprivileged-container.sh set 1000000

Use the unset command to reset them.

@3v1n0, Thank you for providing the script!
I noticed that after applying this script, users in the administrators group are unable to execute commands such as docker and lxc. This is caused by QNAP’s own implementation of ACLs, described in “What’s wrong with ACL?”.

To address this issue, I wrote a Python script to modify ACLs, and provided a Docker image to run the Python script. Please visit https://github.com/kobarity/qnaplxdunpriv/ if you are interested in running unprivileged LXD containers on QNAP Container Station.

Cool, I did not notice this, but maybe that was because I was still trying something else…

Maybe it would be enough to just include the administrators group, but I see your method is much better than mine, thanks!

Yes, including the administrators group is sufficient for most cases. I just wanted to change as little as possible.
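As a rough illustration of the "include the administrators group" idea, the sketch below builds a single ACL spec containing both the container UID and the group and applies it to a list of directories. It is not taken from either script in this thread: the function name `grant_rx`, the `DRY_RUN` switch, and the `ADMIN_GROUP` variable are made up for the example, and it assumes that QNAP's `setfacl` accepts comma-separated entries the way the standard tool does. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: grant rx to a container UID and to an admin group with one
# setfacl call per directory. With DRY_RUN=1 (the default here) the
# commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
ADMIN_GROUP=${ADMIN_GROUP:-administrators}   # assumed group name

grant_rx() {
  uid=$1
  shift
  # Standard setfacl accepts several comma-separated entries after -m.
  spec="user:$uid:rx,group:$ADMIN_GROUP:rx"
  for dir in "$@"; do
    if [ "$DRY_RUN" = 1 ]; then
      echo "setfacl -m $spec $dir"
    else
      setfacl -m "$spec" "$dir"
    fi
  done
}
```

For instance, `grant_rx 1000000 /var/lib/lxd /var/lib/lxd/containers` prints two `setfacl` lines for review. Setting `DRY_RUN=0` before calling it would apply them.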