I have a container that won’t stop or be deleted and seems to be permanently stuck in shutting-down mode. lxc stop (with and without -f) does not work, and neither does lxc delete -f.
It seems some FUSE magic was happening inside the container, and LXD can’t gracefully shut the container down.
[FAILED] Failed unmounting /home/anon/mnt/drive.
[ OK ] Stopped Apply Kernel Variables.
[ OK ] Stopped target Network (Pre).
[ OK ] Stopped Initial cloud-init job (pre-networking).
[ OK ] Unmounted /var/lib/lxcfs.
[ OK ] Unmounted /run/user/1001.
[ OK ] Stopped target Swap.
[ OK ] Reached target Unmount All Filesystems.
[ OK ] Stopped target Local File Systems (Pre).
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Reached target Shutdown.
[ OK ] Reached target Final Step.
Starting Halt...
The container appears to have stopped, but it is still listed as running, and I see these processes when using lsof on the host.
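As a side note for anyone hitting the same thing: a quick heuristic for spotting the leftover processes on the host is to filter for anything in uninterruptible sleep (state D), since those are typically the ones blocked on the dead FUSE mount. This is a generic sketch, nothing LXD-specific:

```shell
# Heuristic: list host processes stuck in uninterruptible sleep (state D),
# which are usually the ones pinning the container's mounts.
ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /^D/ {print $1, $2, $3}'
```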
The whole instance log is 23MB, I can upload it somewhere.
I looked closer at lxd.log and I do see messages like these that are new to me:
109426:t=2020-12-16T16:40:23+0100 lvl=eror msg="Failed to retrieve network information via netlink" instance=anon instanceType=container pid=68729 project=default
109427:t=2020-12-16T16:40:23+0100 lvl=eror msg="Error calling 'lxd forknet'" err="Failed to run: /snap/lxd/current/bin/lxd forknet info -- 68729 -1: Failed setns to container network namespace: No such file or directory" instance=anon instanceType=container pid=68729 project=default
I’d rather not; there might be a lot of customer information in there. If it’s really needed, I will first edit out all customer info, but that will be a bit of a job.
nsenter -t 68729 -m should let you get into the mount namespace. I’m not sure it will be particularly helpful, though, as the FUSE filesystem looks gone already.
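For the record, spelled out as a full command (this assumes root on the host and that PID 68729, the container’s init from the log above, still exists):

```shell
# Enter the mount namespace of PID 68729 and list any FUSE mounts
# still visible there. Requires root; the PID is taken from the log above.
nsenter -t 68729 -m -- grep fuse /proc/mounts
```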
Can you maybe check for /sys/fs/cgroup/freezer/lxc.payload.anon/freezer.state to see if part of the killing logic may have frozen the container causing part of this issue?
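For reference, that check could look as follows. This is a sketch assuming a cgroup-v1 freezer hierarchy and the container name anon; on a cgroup-v2 host the path simply won’t exist:

```shell
# Hypothetical check, assuming cgroup v1 and a container named "anon".
state_file=/sys/fs/cgroup/freezer/lxc.payload.anon/freezer.state
if [ -r "$state_file" ]; then
    cat "$state_file"    # prints THAWED, FREEZING, or FROZEN
else
    echo "no freezer cgroup at $state_file (cgroup v2 host?)"
fi
```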
So I’m unsure what to blame here. My hope was that, regardless of what a customer did in a container, it wouldn’t affect other containers. However, it now seems that a messed-up FUSE mount can do just that, since it looks like I will have to reboot the node.
After killing the lxc monitor process I could at least start the container again, but I’m unsure whether this is a good idea, since there are still open file descriptors from all the processes in D or Z state. They still exist; I also still can’t unmount the rbd device, and thus still can’t delete the container.
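For anyone following along, the workaround was roughly this. A hedged sketch: the monitor process’s exact command line depends on the LXD packaging, so inspect the pgrep output and pick the right PID before killing anything:

```shell
# Sketch: list candidate lxc monitor processes with their full command lines.
# Inspect the output and identify the one belonging to the stuck container
# before killing anything; this is a last resort, not a clean fix.
pgrep -af 'lxc monitor' || echo "no lxc monitor process found"
# kill -9 <pid-of-the-monitor-for-the-stuck-container>   # last resort only
```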
In the end it seems like systemd might be to blame for waiting indefinitely. Should I raise this issue with them?
@brauner any idea? I know there are FUSE and freezer interactions that can lead to deadlocks, but this isn’t one of them; it feels like a potential kernel bug, which could maybe turn into a security bug.