Unprivileged container delays host shutdown

Hi,

systemd seems quite unhappy when a host is shutting down and there is an unprivileged container still running:

A stop job is running for Session 2 of user zub (30s / 1min 30s)

The host does eventually shut down, but only after waiting 1.5 min for (presumably) lxc to shut down. Looking at the systemd journal, I don’t see much that seems relevant, except perhaps:

Feb 13 10:07:18 bug systemd-logind[969]: System is rebooting.
Feb 13 10:07:18 bug systemd[1]: Stopping Session 2 of user zub.
...
Feb 13 10:07:18 bug systemd[1]: lxc.service: Control process exited, code=exited, status=1/FAILURE
Feb 13 10:07:18 bug systemd[1]: lxc.service: Failed with result 'exit-code'.
Feb 13 10:07:18 bug systemd[1]: Stopped LXC Container Initialization and Autoboot Code.
...
Feb 13 10:07:18 bug systemd[1892]: var-lib-lxcfs.mount: Succeeded.
Feb 13 10:07:18 bug systemd[1]: Stopped LSB: NFC daemon.
Feb 13 10:07:18 bug systemd[1]: var-lib-lxcfs.mount: Succeeded.
Feb 13 10:07:18 bug systemd[1]: Unmounted /var/lib/lxcfs.
...
Feb 13 10:07:18 bug fusermount[5717]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument
...
Feb 13 10:07:18 bug systemd[1]: lxc-net.service: Succeeded.
Feb 13 10:07:18 bug systemd[1]: Stopped LXC network bridge setup.
Feb 13 10:07:18 bug systemd[1]: lxcfs.service: Succeeded.
Feb 13 10:07:18 bug systemd[1]: Stopped FUSE filesystem for LXC.
...
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Stopping timed out. Killing.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 4741 (lxc-start) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 4751 (systemd) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5765 (systemd-journal) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5767 (systemd-logind) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5770 (dbus-daemon) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5766 (systemd-network) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5771 (rsyslogd) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5774 (systemd-resolve) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5769 (agetty) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5763 (agetty) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5768 (agetty) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5764 (agetty) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5762 (agetty) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5781 (gdm3) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5785 (accounts-daemon) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5794 (polkitd) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5786 (gmain) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Killing process 5795 (gmain) with signal SIGKILL.
Feb 13 10:08:48 bug systemd[1]: session-2.scope: Failed with result 'timeout'.
Feb 13 10:08:48 bug systemd[1]: Stopped Session 2 of user zub.
Feb 13 10:08:48 bug systemd[1]: Stopping Login Service...
Feb 13 10:08:48 bug systemd[1]: Stopping User Manager for UID 1002...
...

I wonder why this is happening. I would expect systemd to notify lxc that it should shut down (perhaps via SIGHUP or so?), and I would then expect lxc to shut the container down.

If I run lxc-stop -n <container-name> to stop the container manually before shutting down the host, the container stops quickly, and the host then shuts down correctly as well.
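The manual workaround could be wrapped in a small shell helper that stops every running container before the host is shut down. This is only a sketch: the function name stop_running_containers and the LXC_LS / LXC_STOP override variables are made up for illustration, and it assumes that lxc-ls -1 --running prints one running container name per line.

```shell
#!/bin/sh
# Hypothetical helper, not part of lxc: stop all running containers
# of the current user one by one.
# LXC_LS and LXC_STOP are illustrative overrides (e.g. for a dry run).
LXC_LS=${LXC_LS:-lxc-ls}
LXC_STOP=${LXC_STOP:-lxc-stop}

stop_running_containers() {
    # Assumption: one running container name per output line.
    "$LXC_LS" -1 --running | while IFS= read -r name; do
        # Skip empty lines.
        [ -n "$name" ] || continue
        echo "stopping $name"
        # Redirect stdin so the stop command cannot consume the name list.
        "$LXC_STOP" -n "$name" </dev/null
    done
}
```

It would have to be called as the user who owns the containers, before the shutdown is triggered; it does not change how systemd treats the session scope.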

I’m on Debian testing and the containers I run are various versions of Ubuntu. I’m running lxc 3.0.4 and systemd 244.

I’m grateful for any ideas/pointers on where to look/explanations about what happens in this case.

Some more poking around shows that sending SIGHUP or SIGTERM to the lxc monitor process (which is the parent of all the processes inside the container, at least as far as I can tell) doesn’t stop the container.

If I send it one of the more magical signals such as SIGRTMIN+4, it does halt the container. This seems to indicate that the lxc monitor propagates the signal to the systemd running inside (which, according to the systemd man page, powers off on SIGRTMIN+4). Maybe it also propagates SIGHUP and SIGTERM, but these don’t shut down the container (and I don’t see them produce anything in the journal, although nothing shows up in the journal even if I run kill -HUP 1 inside the container).
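For reference, here is how systemd(1) describes the system manager (PID 1) reacting to the signals in question, written as a small lookup function. The function name pid1_signal_action is made up for this sketch, and the table only covers the signals relevant here; whether the lxc monitor actually forwards each of them is a separate question that this does not answer.

```shell
#!/bin/sh
# Hypothetical lookup: what systemd running as PID 1 does on a given
# signal, as documented in the SIGNALS section of systemd(1).
pid1_signal_action() {
    case "$1" in
        SIGTERM)    echo "re-executes the manager (like systemctl daemon-reexec)" ;;
        SIGHUP)     echo "reloads the configuration (like systemctl daemon-reload)" ;;
        SIGINT)     echo "starts ctrl-alt-del.target" ;;
        SIGPWR)     echo "starts sigpwr.target" ;;
        SIGRTMIN+3) echo "starts halt.target" ;;
        SIGRTMIN+4) echo "starts poweroff.target" ;;
        SIGRTMIN+5) echo "starts reboot.target" ;;
        *)          echo "not covered by this sketch" ;;
    esac
}

# Print the signals discussed in this thread.
for sig in SIGHUP SIGTERM SIGRTMIN+4; do
    echo "$sig: $(pid1_signal_action "$sig")"
done
```

If that reading is right, SIGHUP and SIGTERM would not be expected to stop a systemd-based container even when they do reach its init process, since neither of them starts a shutdown target.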

There are the halt, reboot and stop signals that can be configured in lxc (https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html#lbAG). And the lxc-stop man page says that, by default, lxc-stop sends the lxc.signal.halt signal to the container’s init process and only falls back to lxc.signal.stop if the container has not exited once the timeout expires.
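To see which signal a given container is configured to use, one could look the keys up in its config file. A rough sketch, assuming the documented defaults (SIGPWR for halt, SIGINT for reboot, SIGKILL for stop) apply when a key is absent; the function name lxc_signal_for is made up, and the sketch does not follow lxc.include directives, so values set in included files are missed.

```shell
#!/bin/sh
# Hypothetical helper: print the signal configured for lxc.signal.<key>
# in a container config file, or the documented default if it is unset.
# Limitation: files pulled in via lxc.include are not read.
lxc_signal_for() {
    key=$1    # halt, reboot or stop
    file=$2   # path to the container config file
    # Take the value of the last matching "lxc.signal.<key> = VALUE" line,
    # trimming surrounding whitespace.
    val=$(sed -n "s/^[[:space:]]*lxc\.signal\.${key}[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$file" | tail -n 1)
    if [ -z "$val" ]; then
        case "$key" in
            halt)   val=SIGPWR ;;
            reboot) val=SIGINT ;;
            stop)   val=SIGKILL ;;
            *)      return 1 ;;
        esac
    fi
    echo "$val"
}
```

For example, lxc_signal_for halt followed by the path to the container's config file would print SIGPWR unless the file overrides lxc.signal.halt.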

These are some of the pieces of the puzzle. What I’m still missing is: What does the lxc monitor do, when it receives a signal? And why doesn’t it translate e.g. SIGTERM into the lxc.signal.stop for the init process?