LXD snap switching to socket activation

Introduction

On Monday (1st of October), we will be switching the LXD snap to using socket activation.
This should be a completely transparent change for most users and will mean that LXD will only be running on your system if you’re actually using it.

We’re making this change to avoid wasting valuable CPU time on systems which aren’t actually using LXD.
This is also a requirement as Ubuntu 18.10 ships LXD as a snap in all cloud images and we certainly don’t want it running there if the user isn’t using it.

Startup conditions

The LXD daemon will now start up when:

  • You first talk to it
  • You have auto-started containers
  • You are using LXD clustering
  • Your LXD daemon is bound to a network address

This matches the behavior we’ve long had with the LXD deb.
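
As a rough illustration, the conditions above can be read as a single decision: either the daemon has a reason to run without waiting for a client, or it stays stopped until something talks to its socket. The sketch below is only a model of that logic, not the snap's actual implementation. The `LxdState` class, its field names, and both function names are hypothetical, and treating the last three conditions as "start without waiting for a client" is an assumption drawn from the list.

```python
from dataclasses import dataclass


@dataclass
class LxdState:
    """Hypothetical summary of an LXD installation (illustrative names only)."""
    autostart_containers: int = 0  # number of containers set to auto-start
    clustered: bool = False        # True if LXD clustering is in use
    network_address: str = ""      # address the daemon is bound to, if any


def starts_without_client(state: LxdState) -> bool:
    """True if any of the non-client startup conditions from the list applies."""
    return (
        state.autostart_containers > 0
        or state.clustered
        or bool(state.network_address)
    )


def waits_for_first_request(state: LxdState) -> bool:
    """True if the daemon would only start once a client first talks to it."""
    return not starts_without_client(state)
```

For example, `waits_for_first_request(LxdState())` is true for an unused installation, while `starts_without_client(LxdState(clustered=True))` is true for a cluster member.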

Known issues

Unfortunately socket activation through snapd is currently broken on Fedora.
We have reported this issue here: https://forum.snapcraft.io/t/selinux-blocking-socket-activation-on-fedora/6931
So far no fix has been put in place, so unfortunately the switch to socket activation will break LXD for Fedora users who have SELinux enabled.

As far as I know, the only way around this currently is to turn off SELinux on your system.
If you have SELinux knowledge and would like to help fix this, please follow the link above and help snapd upstream and the packagers fix this properly.

Early testing

We have had socket activation enabled in our edge channels for the past few months and so are pretty confident that it works, at least on our test systems and for those users that are using the edge channel.

If you have a spare system or VM, it’d be very helpful if you could install the edge snap on there and confirm that things are working well for you:

  • snap install lxd --edge
  • lxc info
  • reboot
  • lxc info

The commands above are the bare minimum to confirm that things are working.
You may also want to check whether your containers come back online after reboot and play with clustering and other advanced features.
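
If you would rather script the `lxc info` check, for example to run it once before and once after the reboot, a minimal sketch could look like the following. It assumes the `lxc` client is on your `PATH` and that a non-zero exit code means the daemon did not answer. The `run_command` and `lxd_responds` helpers are hypothetical names introduced here, not part of LXD.

```python
import subprocess
from typing import Callable, Sequence


def run_command(argv: Sequence[str]) -> int:
    """Run a command, discard its output, and return its exit code."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


def lxd_responds(runner: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Return True if `lxc info` exits successfully.

    With socket activation, this first request is what should wake the daemon.
    """
    try:
        return runner(["lxc", "info"]) == 0
    except FileNotFoundError:
        # The lxc client is not installed or not on PATH.
        return False
```

Calling `lxd_responds()` after `snap install lxd --edge`, and again after the reboot, mirrors the manual steps above. The `runner` parameter exists only so the check can be exercised without a real LXD installation.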

Feedback

Should you run into issues, please get in touch with us through one of:

Thanks!

As this is a candidate for releasing on Monday, is there a reason you’re using edge and not candidate (or even beta) channel?

@sparkiegeek opening the beta channel is pretty costly for us in terms of both build time (~2 hours) and CI time, and we’re still using the candidate channel for other bugfixes that will be hitting stable over the weekend.

We will likely be pushing the commits to candidate on Sunday so we can have it build and pass CI ahead of the Monday release. That should give us an almost 24h window for those few users on candidate to report anything wrong by Monday morning, before we go and flip the switch in the afternoon.

According to store metrics, we have only around 100 users on candidate and I personally account for a good 20-25 of those! edge is also barely used but still accounts for over 3 times more users than candidate.

What we’d really need to deal with this smoothly is a way to control the ramp-up of a particular rev in stable. That way we could expose 1% of our stable users to it, then 5%, then 10%, … Right now, even with the random distribution of snap updates, we pretty much have a choice between exposing a few hundred users (edge/candidate) or several thousand at once, which isn’t ideal.
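
To make the idea concrete, here is a purely illustrative sketch of how such a percentage ramp-up is commonly done, by hashing a stable identifier into a bucket. Nothing here describes an existing snap store feature. The `in_rollout` function and its `device_id` and `revision` parameters are hypothetical.

```python
import hashlib


def in_rollout(device_id: str, revision: str, percent: float) -> bool:
    """Return True if this device falls within the first `percent` of devices.

    Hashing (revision, device_id) gives each device a stable bucket, so a
    device exposed at 1% is still exposed when the ramp moves to 5% or 10%.
    """
    digest = hashlib.sha256(f"{revision}:{device_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100
```

Raising `percent` from 1 to 5 to 10 only ever adds devices, which is the property that makes a gradual ramp-up safe to reason about.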

The candidate channel for the latest track now has socket activation enabled.
This will get promoted to stable tomorrow.

And after we did a bunch more upgrade tests from candidate to stable, this has now been released to the stable channel.
