Mount fails when starting container. ZFS filesystem already mounted [LXD 3.22]

stgraber · March 13, 2020, 12:28pm

This is an LXCFS bug, and we’re rushing out a fix for it now.

LXCFS bugfix: https://github.com/lxc/lxcfs/pull/364
Snap cherry-pick: https://github.com/lxc/lxd-pkg-snap/commit/e498b5f0c7b87a65a2155c06901b1f98d170afcb
Snap builds: https://code.launchpad.net/~ubuntu-lxc/+snap/lxd-latest-candidate/
Jenkins validation: https://jenkins.linuxcontainers.org/job/lxd-test-snap-latest-candidate/

As soon as we have a new build ready, Jenkins will auto-validate it; if that comes back green, we’ll immediately release to stable. As we’re still in the initial 24h rollout window and only a limited number of users will ever hit this (those with systems that have been up for a while), this should avoid the worst of it.

Unfortunately, systems that are already affected won’t be repaired automatically when the fix lands.

In all cases, when hitting this issue, you’ll want to check whether you have an lxcfs process running on your system. If you don’t, start by running systemctl reload snap.lxd.daemon to get one running again. This is required for newly started containers to behave and for restarted containers to go back to normal.
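
The check described above can be sketched as a small shell function. This is only an illustration: the function name ensure_lxcfs is made up for this example, and it assumes the process shows up under the exact name lxcfs and that pgrep is available on the host.

```shell
# Sketch only: ensure_lxcfs is a hypothetical helper name, not part of LXD.
# Assumes the LXCFS process is visible to pgrep under the exact name "lxcfs".
ensure_lxcfs() {
    if pgrep -x lxcfs >/dev/null; then
        echo "lxcfs is running"
    else
        echo "lxcfs not found, reloading snap.lxd.daemon"
        # Command taken from the post above; needs root privileges.
        systemctl reload snap.lxd.daemon
    fi
}
```

Run ensure_lxcfs as root on the host (not inside a container) before trying either of the recovery options.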

After that, you effectively have two ways out of this:

1. Run grep lxcfs /proc/mounts in every container and unmount (umount) all the matching paths. The container will then behave again, but without the CPU, memory, load-average and uptime values being reported properly: you’ll see the host values until the container is restarted.
2. Restart all the containers.
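
For the first option, a rough sketch of the grep-and-umount step, driven from the host through lxc exec, could look like the following. The function names are invented for this example, and it assumes the mount point is the second field of each matching /proc/mounts line; treat it as a starting point rather than a tested recovery script.

```shell
# Sketch only: both function names are hypothetical helpers.

# Read /proc/mounts-style lines on stdin and print the mount point
# (second field) of every line that mentions lxcfs.
lxcfs_mount_points() {
    awk '/lxcfs/ { print $2 }'
}

# Unmount every lxcfs-backed path inside the named container.
unmount_lxcfs_in_container() {
    container="$1"
    lxc exec "$container" -- cat /proc/mounts </dev/null |
        lxcfs_mount_points |
        while read -r path; do
            # </dev/null stops lxc exec from consuming the rest of the list.
            lxc exec "$container" -- umount "$path" </dev/null
        done
}
```

Calling unmount_lxcfs_in_container with a container name handles one container; repeat it for each affected one. The second option needs no helper: restarting each container, or all of them at once with lxc restart --all, brings back correct resource reporting.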

As a side note, we don’t actually expect anyone who’s been keeping up with critical kernel updates to be able to hit this issue (as they wouldn’t ever reach the uptime needed), so if you’re hitting this, we’d suggest also applying any pending kernel updates to your system and rebooting it while you’re at it. There are important security fixes in those kernels that you definitely want to benefit from.