Nested LXC containers: cannot lxc-attach to a container inside a container

Hello linuxcontainers,

I am trying to set up a development host on Debian bullseye/arm64 (stable) with lxc 4.0.6-2. Host and guests all run the same OS. I already run lxc on another machine, but not nested.

  +---------------------------------------------+
  | Physical host (Raspberry pi4)               |
  +---------------------------------------------+
  |                                             |
  | +-----------------------------------------+ |
  | | Ansible                                 | |
  | +-----------------------------------------+ |
  | | uid 10000000 10000000 root              | |
  | | gid 10000000 10000000 root              | |
  | |                                         | |
  | |  +--------+  +---------+  +----------+  | |
  | |  |dev-db  |  |dev-www  |  |dev-mail  |  | |
  | |  +--------+  +---------+  +----------+  | |
  | |  |u 100000|  |u 200000 |  |u 300000  |  | |
  | |  |  65536 |  |  65536  |  |  65536   |  | |
  | |  |        |  |         |  |          |  | |
  | |  |g 100000|  |g 200000 |  |g 300000  |  | |
  | |  |  65536 |  |  65536  |  |  65536   |  | |
  | |  +--------+  +---------+  +----------+  | |
  | +-----------------------------------------+ |
  +---------------------------------------------+

Host (config of the outer “ansible” container):

# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template:
# For additional config options, please look at lxc.container.conf(5)

# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)


# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.include = /usr/share/lxc/config/userns.conf
lxc.arch = linux64

# Container specific configuration
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 1
lxc.idmap = u 0 10000000 10000000
lxc.idmap = g 0 10000000 10000000
lxc.rootfs.path = dir:/var/lib/lxc/ansible/rootfs
lxc.uts.name = ansible

# enable apparmor inside this container
lxc.mount.entry = /sys/kernel/security sys/kernel/security none bind,optional 0 0
lxc.mount.entry = /mnt/btrfs/lxcsub    var/lib/lxc         none bind,optional 0 0

# Network configuration
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 02:FF:BB:00:00:01

# autostart
lxc.start.auto = 1

The nested containers look like this:

root@ansible:/# cat /var/lib/lxc/dev-db/config 
# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template: --arch arm64 --dist debian --release bullseye
# For additional config options, please look at lxc.container.conf(5)

# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)

# enable apparmor inside this container

# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.include = /usr/share/lxc/config/userns.conf
lxc.arch = linux64

# Container specific configuration
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 1
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
lxc.mount.entry = /sys/kernel/security sys/kernel/security none bind,optional 0 0
lxc.rootfs.path = dir:/var/lib/lxc/dev-db/rootfs
lxc.uts.name = dev-db

# Network configuration
lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 02:FF:AA:00:10:01
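Because each lxc.idmap entry shifts IDs by a fixed offset, the two levels of mapping simply add up. As a minimal sketch (the function names map_id and stacked_host_id are made up for illustration, and only the uid entries from the two configs above are used), the uid that a dev-db process should show on the physical host can be computed like this:

```shell
#!/usr/bin/env bash
# Translate an ID through one idmap entry "nsid hostid range".
# Prints the ID as seen one level up; returns 1 if the ID is not mapped.
map_id() {
    local id=$1 nsid=$2 hostid=$3 range=$4
    if (( id < nsid || id >= nsid + range )); then
        return 1
    fi
    echo $(( hostid + id - nsid ))
}

# Inner container dev-db:  lxc.idmap = u 0 100000 65536
# Outer container ansible: lxc.idmap = u 0 10000000 10000000
stacked_host_id() {
    local id=$1 in_ansible
    in_ansible=$(map_id "$id" 0 100000 65536) || return 1
    map_id "$in_ansible" 0 10000000 10000000
}

stacked_host_id 0       # root in dev-db, prints 10100000
stacked_host_id 65535   # last mapped uid in dev-db, prints 10165535
```

An ID outside the inner range (for example 65536) makes stacked_host_id return a non-zero status, which is the case where a file or process would show up as unmapped.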

In htop I can see the “stacked” uids of the processes as expected. In the “ansible” container I can see them running with lxc-ls --fancy, but lxc-attach -n dev-db fails with

root@ansible:/# lxc-attach -n dev-db
lxc-attach: dev-db: conf.c: userns_exec_minimal: 4242 Permission denied - Running parent function failed

I also tried the lxc-unpriv-start and lxc-unpriv-attach commands but they also fail with that error message.

Important software components:

systemd on all host/containers: 247.3-7+deb11u1
kernel: Debian 5.18.16-1~bpo11+1 (2022-08-12) aarch64 GNU/Linux

Oh …

sudo sysctl kernel.unprivileged_userns_clone
kernel.unprivileged_userns_clone = 1

So: what could cause this second attach to fail?

EDIT: I also uncommented

lxc.include = /usr/share/lxc/config/nesting.conf

in all configs, but it didn't work…

EDIT2:

georg@rpi4-rt:~$ sudo lxc-attach -n ansible
root@ansible:/# lxc-attach -n dev-db
lxc-attach: dev-db: conf.c: userns_exec_minimal: 4242 Permission denied - Running parent function failed
                                                                                                        root@dev-server:/# 

This is how I tried it.

This is the log:

lxc-attach dev-db 20221121180413.349 INFO     confile - confile.c:set_config_idmaps:1942 - Read uid map: type u nsid 0 hostid 100000 range 65536
lxc-attach dev-db 20221121180413.349 INFO     confile - confile.c:set_config_idmaps:1942 - Read uid map: type g nsid 0 hostid 100000 range 65536
lxc-attach dev-db 20221121180413.350 DEBUG    commands - commands.c:lxc_cmd_rsp_recv:172 - Response data length for command "get_init_pid" is 0
lxc-attach dev-db 20221121180413.350 DEBUG    commands - commands.c:lxc_cmd_rsp_recv:172 - Response data length for command "get_init_pid" is 0
lxc-attach dev-db 20221121180413.351 INFO     lsm - lsm/lsm.c:lsm_init:40 - Initialized LSM security driver AppArmor
lxc-attach dev-db 20221121180413.354 INFO     seccomp - seccomp.c:use_seccomp:1179 - Already seccomp-confined, not loading new policy
lxc-attach dev-db 20221121180413.355 INFO     attach - attach.c:fetch_seccomp:598 - Retrieved seccomp policy
lxc-attach dev-db 20221121180413.357 DEBUG    commands - commands.c:lxc_cmd_rsp_recv:172 - Response data length for command "get_clone_flags" is 0
lxc-attach dev-db 20221121180413.358 DEBUG    commands - commands.c:lxc_cmd_rsp_recv:172 - Response data length for command "get_devpts_fd" is 0
lxc-attach dev-db 20221121180413.359 DEBUG    terminal - terminal.c:lxc_terminal_peer_default:672 - Using terminal "/dev/tty" as proxy
lxc-attach dev-db 20221121180413.359 DEBUG    terminal - terminal.c:lxc_terminal_winsz:60 - Set window size to 184 columns and 50 rows
lxc-attach dev-db 20221121180413.360 DEBUG    commands - commands.c:lxc_cmd_rsp_recv:172 - Response data length for command "get_cgroup2_fd" is 0
lxc-attach dev-db 20221121180413.361 DEBUG    conf - conf.c:idmaptool_on_path_and_privileged:2728 - The binary "/usr/bin/newuidmap" does have the setuid bit set
lxc-attach dev-db 20221121180413.361 DEBUG    conf - conf.c:idmaptool_on_path_and_privileged:2728 - The binary "/usr/bin/newgidmap" does have the setuid bit set
lxc-attach dev-db 20221121180413.361 DEBUG    conf - conf.c:lxc_map_ids:2796 - Functional newuidmap and newgidmap binary found
lxc-attach dev-db 20221121180413.379 NOTICE   utils - utils.c:lxc_setgroups:1420 - Dropped additional groups
lxc-attach dev-db 20221121180413.379 DEBUG    cgfsng - cgroups/cgfsng.c:cgroup_attach_create_leaf:2288 - Sent target cgroup fds 4 and 7
lxc-attach dev-db 20221121180413.379 DEBUG    cgfsng - cgroups/cgfsng.c:cgroup_attach_move_into_leaf:2316 - Permission denied - Failed to move process into target cgroup via fd 6 and 7
lxc-attach dev-db 20221121180413.379 ERROR    conf - conf.c:userns_exec_minimal:4242 - Permission denied - Running parent function failed

I traced it back to this function

Which suggests (following the logs) that one fd could be sent … but I have no idea what the root cause could be …
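The last DEBUG line before the error is cgroup_attach_move_into_leaf failing with "Permission denied". On a cgroup2 hierarchy the kernel only allows a delegated move when the writer can write to cgroup.procs in the destination cgroup and in the common ancestor of the source and destination cgroups. The following is a rough sketch for checking that by hand, not what LXC does internally: common_ancestor and check_move are hypothetical helper names, /sys/fs/cgroup is assumed to be the cgroup2 mount point, and the example paths are taken from the systemd-cgls listing in this thread.

```shell
#!/usr/bin/env bash
# Print the deepest cgroup path that contains both arguments.
# Arguments are cgroup2 paths relative to the cgroup root, e.g. "/.lxc".
common_ancestor() {
    local -a a b
    local i out=""
    IFS=/ read -r -a a <<< "${1#/}"
    IFS=/ read -r -a b <<< "${2#/}"
    for (( i = 0; i < ${#a[@]} && i < ${#b[@]}; i++ )); do
        [[ ${a[i]} == "${b[i]}" ]] || break
        out+="/${a[i]}"
    done
    echo "${out:-/}"
}

# Report whether the two cgroup.procs files relevant for a move are writable.
# $1: destination cgroup path. Assumes cgroup2 is mounted at /sys/fs/cgroup.
check_move() {
    local src dst=$1 anc f
    src=$(sed -n 's/^0:://p' /proc/self/cgroup)   # current cgroup2 path
    anc=$(common_ancestor "$src" "$dst")
    for f in "/sys/fs/cgroup${anc%/}/cgroup.procs" "/sys/fs/cgroup${dst}/cgroup.procs"; do
        if [ -w "$f" ]; then echo "writable:     $f"; else echo "not writable: $f"; fi
    done
}

# The attach shell sits in /.lxc, the target is the payload cgroup:
common_ancestor /.lxc /lxc.payload.dev-db-1   # prints /
```

Running check_move /lxc.payload.dev-db-1 from the shell inside "ansible" would then show which of the two files, if any, is not writable there. A "not writable" result would only be a hint, since the kernel applies further namespace-related checks that this sketch does not cover.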

Any ideas, @amikhalitsyn?

Looks related to lxc/lxc pull request #4090, "cgroups: modify cgroup2 attach logic" by brauner.

Your version of LXC is too old and does not include this patch. I suggest you update LXC and retry. If that does not help, we will debug your case in detail.
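To confirm which LXC version is actually in use inside each nesting level after an upgrade, a small comparison helper can be handy. This is only a sketch: version_ge and check_lxc_version are made-up names, the comparison relies on GNU sort -V, and the version to compare against is left as a parameter because the thread does not say which release first contains the patch.

```shell
#!/usr/bin/env bash
# Succeeds (exit status 0) if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Compare the installed LXC tools against a wanted version.
# $1: wanted version; which value is correct is not stated in this thread.
check_lxc_version() {
    local wanted=$1 installed
    installed=$(lxc-attach --version) || return 2
    if version_ge "$installed" "$wanted"; then
        echo "lxc $installed is at least $wanted"
    else
        echo "lxc $installed is older than $wanted"
    fi
}

# Illustration with fixed strings only:
version_ge 4.0.6 4.0.12 || echo "4.0.6 is older than 4.0.12"
```

Since the outer and inner containers each have their own LXC installation, check_lxc_version would need to be run on the physical host and inside "ansible" separately.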

I thought this might be useful: systemd-cgls output, both runs from within my “ansible” container.

Started with lxc-start -n dev-db

Control group /:
-.slice
├─lxc.payload.dev-db-1 
│ ├─init.scope 
│ │ └─623 /sbin/init
│ └─system.slice 
│   ├─systemd-networkd.service 
│   │ └─721 /lib/systemd/systemd-networkd
│   ├─systemd-udevd.service 
│   │ └─717 /lib/systemd/systemd-udevd
│   ├─systemd-journald.service 
│   │ └─708 /lib/systemd/systemd-journald
│   ├─console-getty.service 
│   │ └─735 /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400…
│   ├─systemd-resolved.service 
│   │ └─723 /lib/systemd/systemd-resolved
│   ├─system-container\x2dgetty.slice 
│   │ ├─container-getty@0.service 
│   │ │ └─736 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400…
│   │ ├─container-getty@2.service 
│   │ │ └─738 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400…
│   │ ├─container-getty@1.service 
│   │ │ └─737 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400…
│   │ └─container-getty@3.service 
│   │   └─739 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400…
│   ├─dbus.service 
│   │ └─727 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfi…
│   └─systemd-logind.service 
│     └─729 /lib/systemd/systemd-logind
├─user.slice 
│ └─user-0.slice 
│   └─user@0.service 
│     └─init.scope 
│       ├─140 /lib/systemd/systemd --user
│       └─178 (sd-pam)
├─.lxc 
│ ├─446 /bin/bash
│ └─750 systemd-cgls
├─init.scope 
│ └─1 /sbin/init
├─system.slice 
│ ├─lxc-net.service 
│ │ └─260 dnsmasq --conf-file=/dev/null -u dnsmasq --strict-order --bind-interf…
│ ├─systemd-networkd.service 
│ │ └─87 /lib/systemd/systemd-networkd
│ ├─systemd-udevd.service 
│ │ └─77 /lib/systemd/systemd-udevd
│ ├─cron.service 
│ │ └─99 /usr/sbin/cron -f
│ ├─systemd-journald.service 
│ │ └─67 /lib/systemd/systemd-journald
│ ├─ssh.service 
│ │ └─128 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
│ ├─exim4.service 
│ │ └─435 /usr/sbin/exim4 -bd -q30m
│ ├─console-getty.service 
│ │ └─117 /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9…
│ ├─rdnssd.service 
│ │ ├─108 /sbin/rdnssd -u rdnssd -H /etc/rdnssd/merge-hook
│ │ └─109 /sbin/rdnssd -u rdnssd -H /etc/rdnssd/merge-hook
│ ├─systemd-resolved.service 
│ │ └─91 /lib/systemd/systemd-resolved
│ ├─system-container\x2dgetty.slice 
│ │ ├─container-getty@0.service 
│ │ │ └─118 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400,9…
│ │ ├─container-getty@2.service 
│ │ │ └─120 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400,9…
│ │ ├─container-getty@1.service 
│ │ │ └─119 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400,9…
│ │ └─container-getty@3.service 
│ │   └─121 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400,9…
│ ├─dbus.service 
│ │ └─100 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile…
│ └─systemd-logind.service 
│   └─103 /lib/systemd/systemd-logind
└─lxc.monitor.dev-db 
  └─618 [lxc monitor] /var/lib/lxc dev-db

Started with lxc-unpriv-start -n dev-db

Control group /:
-.slice
├─lxc.payload.dev-db-2 
│ ├─init.scope 
│ │ └─781 /sbin/init
│ └─system.slice 
│   ├─systemd-networkd.service 
│   │ └─877 /lib/systemd/systemd-networkd
│   ├─systemd-udevd.service 
│   │ └─875 /lib/systemd/systemd-udevd
│   ├─systemd-journald.service 
│   │ └─866 /lib/systemd/systemd-journald
│   ├─systemd-hostnamed.service 
│   │ └─897 /lib/systemd/systemd-hostnamed
│   ├─console-getty.service 
│   │ └─889 /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400…
│   ├─systemd-resolved.service 
│   │ └─881 /lib/systemd/systemd-resolved
│   ├─system-container\x2dgetty.slice 
│   │ ├─container-getty@0.service 
│   │ │ └─890 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400…
│   │ ├─container-getty@2.service 
│   │ │ └─892 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400…
│   │ ├─container-getty@1.service 
│   │ │ └─891 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400…
│   │ └─container-getty@3.service 
│   │   └─893 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400…
│   ├─dbus.service 
│   │ └─885 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfi…
│   └─systemd-logind.service 
│     └─887 /lib/systemd/systemd-logind
├─user.slice 
│ └─user-0.slice 
│   └─user@0.service 
│     └─init.scope 
│       ├─140 /lib/systemd/systemd --user
│       └─178 (sd-pam)
├─.lxc 
│ ├─446 /bin/bash
│ └─899 systemd-cgls
├─init.scope 
│ └─1 /sbin/init
├─system.slice 
│ ├─lxc-net.service 
│ │ └─260 dnsmasq --conf-file=/dev/null -u dnsmasq --strict-order --bind-interf…
│ ├─systemd-networkd.service 
│ │ └─87 /lib/systemd/systemd-networkd
│ ├─systemd-udevd.service 
│ │ └─77 /lib/systemd/systemd-udevd
│ ├─cron.service 
│ │ └─99 /usr/sbin/cron -f
│ ├─systemd-journald.service 
│ │ └─67 /lib/systemd/systemd-journald
│ ├─ssh.service 
│ │ └─128 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
│ ├─exim4.service 
│ │ └─435 /usr/sbin/exim4 -bd -q30m
│ ├─console-getty.service 
│ │ └─117 /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9…
│ ├─rdnssd.service 
│ │ ├─108 /sbin/rdnssd -u rdnssd -H /etc/rdnssd/merge-hook
│ │ └─109 /sbin/rdnssd -u rdnssd -H /etc/rdnssd/merge-hook
│ ├─systemd-resolved.service 
│ │ └─91 /lib/systemd/systemd-resolved
│ ├─system-container\x2dgetty.slice 
│ │ ├─container-getty@0.service 
│ │ │ └─118 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400,9…
│ │ ├─container-getty@2.service 
│ │ │ └─120 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400,9…
│ │ ├─container-getty@1.service 
│ │ │ └─119 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400,9…
│ │ └─container-getty@3.service 
│ │   └─121 /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400,9…
│ ├─dbus.service 
│ │ └─100 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile…
│ └─systemd-logind.service 
│   └─103 /lib/systemd/systemd-logind
└─lxc.monitor.dev-db 
  └─776 [lxc monitor] /var/lib/lxc dev-db

It seems to be related to the unprivileged (idmap) container. A privileged (no idmap) and unconfined (AA profile) setup works. An unconfined (AA profile) and unprivileged (idmap) setup shows the same error.

Edit: with the outer container (ansible) using a generated AA profile plus idmap, and the inner container unconfined and without AA profile, it also works.
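For reference, the working "privileged (no idmap) and unconfined" variant of the inner container would presumably differ from the dev-db config shown earlier only in the lines below. This is a reconstruction from the description, not the config that was actually used; in particular, commenting out the userns.conf include is an assumption, and the rootfs ownership would likely have to match the unshifted IDs as well.

```
# /var/lib/lxc/dev-db/config inside "ansible", changed lines only (assumed)

# Privileged: no ID mapping for the inner container
#lxc.include = /usr/share/lxc/config/userns.conf
#lxc.idmap = u 0 100000 65536
#lxc.idmap = g 0 100000 65536

# Unconfined instead of a generated AppArmor profile
lxc.apparmor.profile = unconfined
```

Note that this removes both isolation layers from the inner container, so it is more useful for narrowing down the failure than as a permanent setup.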