Help - nested lxd containers: lxd service does not start inside container

Hello,

Both the host and the (unprivileged) container run ubuntu:18.10. LXD (version 3.10) was installed via snap on both. LXD on the host starts fine, but the one in the container does not. The security.nesting flag has been set to true on the container. Here is the output when starting lxd in the container:

Error: Failed to connect to local LXD: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused

Here is the systemctl output for snap.lxd.daemon in the container:

Mar 03 05:39:19 a systemd[1]: Started Service for snap application lxd.daemon.
Mar 03 05:39:20 a lxd.daemon[810]: => Preparing the system
Mar 03 05:39:20 a lxd.daemon[810]: ==> Loading snap configuration
Mar 03 05:39:20 a lxd.daemon[810]: ==> Setting up mntns symlink (mnt:[4026532984])
Mar 03 05:39:20 a lxd.daemon[810]: ==> Setting up kmod wrapper
Mar 03 05:39:20 a lxd.daemon[810]: ==> Preparing /boot
Mar 03 05:39:20 a lxd.daemon[810]: ==> Preparing a clean copy of /run
Mar 03 05:39:20 a lxd.daemon[810]: ==> Preparing a clean copy of /etc
Mar 03 05:39:20 a lxd.daemon[810]: ==> Setting up ceph configuration
Mar 03 05:39:20 a lxd.daemon[810]: ==> Setting up LVM configuration
Mar 03 05:39:20 a lxd.daemon[810]: ==> Rotating logs
Mar 03 05:39:20 a lxd.daemon[810]: ==> Setting up ZFS (0.7)
Mar 03 05:39:20 a lxd.daemon[810]: ==> Escaping the systemd cgroups
Mar 03 05:39:20 a lxd.daemon[400]: mount namespace: 7
Mar 03 05:39:20 a lxd.daemon[400]: hierarchies:
Mar 03 05:39:20 a lxd.daemon[400]:   0: fd:   8: hugetlb
Mar 03 05:39:20 a lxd.daemon[400]:   1: fd:   9: blkio
Mar 03 05:39:20 a lxd.daemon[400]:   2: fd:  10: memory
Mar 03 05:39:20 a lxd.daemon[400]:   3: fd:  11: pids
Mar 03 05:39:20 a lxd.daemon[400]:   4: fd:  12: cpu,cpuacct
Mar 03 05:39:20 a lxd.daemon[400]:   5: fd:  13: freezer
Mar 03 05:39:20 a lxd.daemon[400]:   6: fd:  14: cpuset
Mar 03 05:39:20 a lxd.daemon[400]:   7: fd:  15: devices
Mar 03 05:39:20 a lxd.daemon[400]:   8: fd:  16: rdma
Mar 03 05:39:20 a lxd.daemon[400]:   9: fd:  17: net_cls,net_prio
Mar 03 05:39:20 a lxd.daemon[400]:  10: fd:  18: perf_event
Mar 03 05:39:20 a lxd.daemon[400]:  11: fd:  19: name=systemd
Mar 03 05:39:20 a lxd.daemon[400]:  12: fd:  20: unified
Mar 03 05:39:20 a lxd.daemon[400]: lxcfs.c: 105: do_reload: lxcfs: reloaded
Mar 03 05:39:20 a lxd.daemon[810]: => Re-using existing LXCFS
Mar 03 05:39:20 a lxd.daemon[810]: => Starting LXD
Mar 03 05:39:20 a lxd.daemon[810]: chgrp: changing group of '/var/snap/lxd/common/lxd/unix.socket': Invalid argument
Mar 03 05:39:20 a systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 05:39:20 a systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Mar 03 05:39:21 a systemd[1]: snap.lxd.daemon.service: Service RestartSec=100ms expired, scheduling restart.
Mar 03 05:39:21 a systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 5.
Mar 03 05:39:21 a systemd[1]: Stopped Service for snap application lxd.daemon.
Mar 03 05:39:21 a systemd[1]: snap.lxd.daemon.service: Start request repeated too quickly.
Mar 03 05:39:21 a systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Mar 03 05:39:21 a systemd[1]: Failed to start Service for snap application lxd.daemon.

I had a disk device mounted in the container and used ‘raw.idmap “both 1000 1000”’. I have also allocated large sub{uid,gid} blocks on the host. I have since disabled the disk device, but I then get other errors, such as ‘lxd init’ hanging.

I am not sure what could be wrong here. Any tips or ideas would be most welcome. Thanks

What’s the output of getent group lxd?
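
It may also be worth comparing that gid against the container's gid map. As far as I know, chgrp tends to report "Invalid argument" when the target gid has no mapping in the current user namespace, so this could help rule that in or out. Below is a minimal sketch, not a definitive diagnosis: gid_is_mapped is a hypothetical helper name, and it assumes the usual /proc/self/gid_map layout of "inside-id outside-id count" per line.

```shell
# gid_is_mapped GID [MAP_FILE]   (hypothetical helper, not part of LXD)
# Succeeds if GID falls inside one of the ranges listed in MAP_FILE
# (default: /proc/self/gid_map). Each line reads: <id-inside> <id-outside> <count>.
gid_is_mapped() {
    gid="$1"
    map_file="${2:-/proc/self/gid_map}"
    awk -v gid="$gid" '
        gid + 0 >= $1 + 0 && gid + 0 < $1 + $3 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$map_file"
}

# Look up the gid of the lxd group, then check it against the map.
lxd_gid="$(getent group lxd | cut -d: -f3)"
if [ -n "$lxd_gid" ] && gid_is_mapped "$lxd_gid"; then
    echo "gid $lxd_gid is mapped in this namespace"
else
    echo "gid ${lxd_gid:-<none>} is not mapped (or the lxd group is missing)"
fi
```

Run it inside the container. If the gid turns out to be mapped, the problem most likely lies elsewhere.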

In the container, the output of getent group lxd is lxd:x:1000:ubuntu. After some reinstalling and careful editing of the container's config properties, it finally started, so I am now able to launch LXD inside an unprivileged LXD container. I do not know what I did differently.