Container fails to start: newuidmap failed to write mapping "newuidmap: uid range [0-1000000000) -> [1000000-1001000000) not allowed"

Last time I had this problem it was solved by creating /etc/subuid and /etc/subgid files with an appropriate root entry. No such luck this time:

OS: Arch Linux
LXD: 4.19

I created /etc/subuid and /etc/subgid files with this content:

root:100000:65536
lxd:100000:65536

I also added the following lines to /etc/lxc/default.conf:

lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
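
For anyone reading along, each lxc.idmap line has four fields: the ID type (u or g), the first ID inside the container, the first ID on the host, and the number of IDs in the range. Here is a minimal Python sketch of how such a line maps a container ID to a host ID. The names IdMap, parse_idmap and to_host_id are made up for illustration and are not part of LXC or LXD:

```python
from typing import NamedTuple, Optional


class IdMap(NamedTuple):
    kind: str             # "u" for uid, "g" for gid
    container_start: int  # first ID as seen inside the container
    host_start: int       # first ID as seen on the host
    count: int            # number of consecutive IDs in the range


def parse_idmap(line: str) -> IdMap:
    """Parse a line such as 'lxc.idmap = u 0 100000 65536' (hypothetical helper)."""
    key, _, value = line.partition("=")
    if key.strip() != "lxc.idmap":
        raise ValueError(f"not an lxc.idmap line: {line!r}")
    kind, container_start, host_start, count = value.split()
    if kind not in ("u", "g"):
        raise ValueError(f"unexpected ID type: {kind!r}")
    return IdMap(kind, int(container_start), int(host_start), int(count))


def to_host_id(m: IdMap, container_id: int) -> Optional[int]:
    """Return the host ID for container_id, or None if it is outside the range."""
    offset = container_id - m.container_start
    if 0 <= offset < m.count:
        return m.host_start + offset
    return None


m = parse_idmap("lxc.idmap = u 0 100000 65536")
print(to_host_id(m, 0))      # container root -> 100000
print(to_host_id(m, 1000))   # -> 101000
print(to_host_id(m, 65536))  # outside the mapped range -> None
```

So with these lines, root inside the container would be uid 100000 on the host, and anything at or above container uid 65536 would have no mapping at all.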

which I didn’t do last time. I tried starting the container with and without these lines in /etc/lxc/default.conf and also tried placing them in /etc/default/lxc (not sure why I have both of these on my system). I did remember to restart lxd each time I changed the configuration.

However, with or without the lxc.idmap entries, when I try to start the container, the following errors ensue:

lxc samba-dc 20211020015746.465 ERROR    conf - conf.c:lxc_map_ids:3471 - newuidmap failed to write mapping "newuidmap: uid range [0-1000000000) -> [1000000-1001000000) not allowed": newuidmap 4259 0 1000000 1000000000
lxc samba-dc 20211020015746.465 ERROR    start - start.c:lxc_spawn:1774 - Failed to set up id mapping.
lxc samba-dc 20211020015746.465 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:868 - Received container state "ABORTING" instead of "RUNNING"
lxc samba-dc 20211020015746.465 ERROR    start - start.c:__lxc_start:2053 - Failed to spawn container "samba-dc"
lxc samba-dc 20211020015746.465 WARN     start - start.c:lxc_abort:1050 - No such process - Failed to send SIGKILL via pidfd 20 for process 4259
lxc 20211020015751.527 ERROR    af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:220 - Connection reset by peer - Failed to receive response
lxc 20211020015751.527 ERROR    commands - commands.c:lxc_cmd_rsp_recv_fds:129 - Failed to receive file descriptors
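
The useful part of the first error line is the newuidmap invocation at the end: the arguments are the target PID followed by one or more triplets of (ID inside the container, ID on the host, count). As a rough sketch, this Python snippet pulls those numbers out of the log line; parse_newuidmap_call is a name I made up, and the regular expression only assumes the log format shown above:

```python
import re

# First error line from the log above, copied verbatim.
LOG_LINE = (
    'lxc samba-dc 20211020015746.465 ERROR    conf - conf.c:lxc_map_ids:3471 - '
    'newuidmap failed to write mapping "newuidmap: uid range [0-1000000000) -> '
    '[1000000-1001000000) not allowed": newuidmap 4259 0 1000000 1000000000'
)


def parse_newuidmap_call(line):
    """Return (pid, [(container_start, host_start, count), ...]) from a log line."""
    match = re.search(r"newuidmap (\d+)((?:\s+\d+){3,})\s*$", line)
    if match is None:
        raise ValueError("no newuidmap invocation found at the end of the line")
    numbers = [int(n) for n in match.group(2).split()]
    if len(numbers) % 3 != 0:
        raise ValueError("mapping arguments must come in groups of three")
    triplets = [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]
    return int(match.group(1)), triplets


pid, triplets = parse_newuidmap_call(LOG_LINE)
for container_start, host_start, count in triplets:
    # Host range is half-open, matching the "[1000000-1001000000)" in the error.
    print(pid, container_start, host_start, host_start + count)
```

For the line above, this reports that PID 4259 was to get container uids starting at 0 mapped onto host uids 1000000 up to (but not including) 1001000000.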

So it looks like LXD is still trying to use its default uid mapping range despite the /etc/subuid, /etc/subgid, and /etc/lxc/default.conf entries. Either I have the lxc.idmap syntax wrong or I’m putting these settings in the wrong location. Should I be using raw.idmap in the profile instead?
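
One thing worth checking is whether the range in the error message actually fits inside the subuid entries above: the request starts at host uid 1000000 and spans 1000000000 IDs, while the root entry starts at 100000 and covers only 65536. The Python sketch below does that comparison. It assumes the requested range has to sit entirely inside a single subordinate range for the user, which is my understanding of how newuidmap checks it, not something I have confirmed; parse_subid and is_allowed are hypothetical helper names:

```python
def parse_subid(text):
    """Parse /etc/subuid-style text into {name: [(start, count), ...]}."""
    ranges = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, start, count = line.split(":")
        ranges.setdefault(name, []).append((int(start), int(count)))
    return ranges


def is_allowed(ranges, user, host_start, count):
    """True if [host_start, host_start + count) fits inside one of user's ranges.

    Assumption: the whole request must fall within a single entry.
    """
    end = host_start + count
    return any(
        start <= host_start and end <= start + size
        for start, size in ranges.get(user, [])
    )


# Contents of /etc/subuid as posted above.
SUBUID = "root:100000:65536\nlxd:100000:65536\n"
ranges = parse_subid(SUBUID)

# Range from the error: host uids [1000000, 1001000000).
print(is_allowed(ranges, "root", 1000000, 1000000000))  # False
# Range that the subuid entry does cover: host uids [100000, 165536).
print(is_allowed(ranges, "root", 100000, 65536))        # True
```

If that assumption holds, the request in the log could never be satisfied by a 65536-wide entry starting at 100000, whatever the lxc.idmap lines say.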

Also, systemctl restart lxd is extremely slow. I tried strace’ing the PID to see where it is getting stuck, but the strace shows nothing illuminating:

ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8

One detail I omitted from the description above is that I’m initializing/launching the container as an unprivileged user in the lxd group:

[pgoetz@gecko ~]$ whoami
pgoetz
$ lxc launch images:ubuntu/20.04 samba-dc

The system is configured to allow unprivileged users to create user namespaces:

[root@gecko default]# sysctl kernel.unprivileged_userns_clone
kernel.unprivileged_userns_clone = 1
[root@gecko default]# sysctl user.max_user_namespaces
user.max_user_namespaces = 127583

However, my guess, based on the information provided here, is that I’m unable to delegate a necessary cgroup as an unprivileged user. I’m not sure exactly what libpam-cgfs does, but this package doesn’t appear to be available on Arch, and this instruction from that page:

systemd-run --unit=myshell --user --scope -p "Delegate=yes" lxc-start <container-name>

isn’t sufficiently explained for me to feel comfortable using it. Given the complications, I can’t think of any really good reason why I should be launching containers as myself. Since the root user can launch unprivileged containers, the path of least resistance (without, I believe, compromising security) is to launch the container as root:

[root@gecko ~]# lxc launch images:ubuntu/20.04 samba-dc

That appears to just work:

[root@gecko ~]# lxc list
+----------+---------+----------------------+------+-----------+-----------+
|   NAME   |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS |
+----------+---------+----------------------+------+-----------+-----------+
| samba-dc | RUNNING | 192.168.1.170 (eth0) |      | CONTAINER | 0         |
+----------+---------+----------------------+------+-----------+-----------+

Rather than use lxdbr0, I initialized LXD to use an existing bridge (with a NIC bound to it) because I need this container to be public-facing. This is clearly working, as the IP address above was assigned by my DHCP server.

While the original problem isn’t technically solved, I’m going to mark this topic as solved. There are way too many moving parts involved in getting this to work with an unprivileged user launching containers, and I’m not seeing a single benefit of doing so. I’m going to repeat this installation in a production environment using the pre-installed 4.0/stable snap on Ubuntu 20.04. Presumably I’ll run into the same issues there and will similarly need to launch the container as root.

I wonder if these issues would be less of a headache with better utilization of Linux capabilities? Anyway, not a problem I can solve right now.