Repeatable LXD installations

Hi everyone,

I am working with @yang on a project that aims to provide a simple automated way to install and use LXD using Puppet for automation, both at the server level and instance declaration level.

We will soon release all the source code and documentation to the community, including some improvements to LXDUI that make it usable for simple operations on standalone servers or LXD clusters.

One challenge that we are facing is the ability to target tried and tested LXD versions. If I understood correctly, snaps are only available for the past 2 versions, which means that in a few weeks we will probably not be able to install the specific version that has been tested and is known to work.

Even though upgrading to the latest version is good in the long run, upgrading on short cycles increases risk and unpredictability.

Question: can something be done to enable repeatable LXD installations over a reasonable time frame (e.g. 12 or 18 months)?

Perhaps keeping snaps for a longer time would be a simple way?

Thanks for the support that has been given to @yang on the multiple questions that were posted here.

Best regards

Can't you just target the LTS release channel?

During the system implementation we have used new features from other LXD versions that are not available on the LTS release version.

Targeting the LTS channel 4.0/stable is probably the best approach if you are looking for stability.

The feature release channel latest/stable includes new features, but can also contain non-backward compatible changes (although we do try and avoid breaking the API). Each individual release is only available for a short time in the specific snap channel so that people don’t end up installing an old and unsupported version (which is effectively what you’re proposing).

Would it be possible to provide a channel for previous feature release versions of LXD?

We had a practical instance in which we upgraded to version 4.7 to take advantage of VM memory shrinking but on later server installations had to fix an event listener script due to instance life cycle event changes in 4.9 for example.

You can install a specific current version and stay on it; see Pinned Feature Channels in the aforementioned guide about snap management. However, for the same reasons, we don’t allow new installs of old versions.

As things currently stand, we’d rather not keep those old channels open after we stop supporting those releases. This gets particularly problematic when we need to deal with security updates.

If there was a way to just hide a channel or to mark a channel as read-only and deprecated (showing a suitable warning/confirmation on install), then we could do that. But as snapd doesn’t offer such facilities at the moment, we intend to stick to only guaranteeing that the current and past feature releases will be available, along with your choice of LTS releases.

Hi everyone,

Thanks for taking the time to help us, once again. I have been reviewing the “Managing the LXD snap” document hoping to understand the best course of action.

If I understand correctly, in order to install a specific version and keep it forever we need:

  • to determine which is the latest version (let’s call it X)
  • to execute
sudo snap install lxd --channel=[X-1]/stable

By installing the release before the latest we ensure it will never be updated (if we did this with the latest it would get updated at some point according to the document).

If X=2.0 or 3.0 or 4.0 we would be able to do this as many times as necessary, but since we need lxd >= 4.9 we are unable to target a specific version for several installations that would get done in the course of months, because as new versions are released the older ones are deleted.

Am I correct?

Assuming I am correct in interpreting the document this would mean that the next opportunity to target a specific release would be when 5.1 is available. When 5.1 is available, doing:

sudo snap install lxd --channel=5.0/stable

would install 5.0 in such a way that it would not get updated without explicit action from us. But at that point, if a bug was found in 5.0 that would be fixed, say in 5.2, we would again be unable to target a specific version, until 6.1.

So, if my understanding is correct we need to forget the idea of targeting the same version for the different installations that we will do during the course of a year. We will be able to keep a certain version fixed on each installation with:

sudo snap install lxd --channel=[X-1]/stable

where X is the latest version, but we will need to do acceptance tests on version X-1 for every installation.

I kindly ask you to confirm if this is the correct interpretation of the document.

Thanks again.

Not quite.

There are two types of LXD releases: feature and LTS.
The feature releases are approximately once a month and are taken from the main branch and given a release number like 4.x (e.g. 4.16, 4.17 etc).
The LXD team also maintains several LTS series (currently 4.0, 3.0 and 2.0) for 5 years, each series receiving less frequent periodic releases (e.g. 4.0.1, 4.0.2 etc).

You can see these here:

These releases are then packaged into a snap and are made available from the snap store via “channels”.
These channels include the release, plus additional cherry-picks for fixes that occur between releases.

The following channels are available:

  • latest/stable - this is the latest feature release (4.x) plus any interim cherry-pick fixes and will automatically update to the next feature release.
  • 4.0/stable - this is the latest 4.0.x LTS release plus any interim cherry-pick fixes and will automatically update to the next LTS point release in the series.
  • 4.x/stable - this is only available for the last 2 feature releases (4.x), so currently that is 4.16/stable and 4.17/stable. These channels receive cherry-pick fixes until the next feature release is made, at which point they are never updated again (no security or bug fixes). However, releases older than the last 2 feature releases are not available for new installs. So you can’t, for instance, install 4.15/stable now, as that no longer exists for new installs. However, if you had previously installed using the 4.15/stable channel, your system would remain on that release until the channel was manually changed.

The channels available are in the drop down on the top right here Install LXD on Linux | Snap Store
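As a rough illustration of the three channel types above, here is a small shell sketch. lxd_install_cmd is a hypothetical helper (not part of snap or LXD) that only prints the install command for a given track, so nothing is installed by running it. The track names are the ones mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: lxd_install_cmd is a hypothetical helper, not part of snap or LXD.
# It prints the install command for a track so it can be reviewed or logged
# by a configuration-management tool before anything is actually run.
lxd_install_cmd() {
    track="$1"            # e.g. "latest", "4.0" or "4.17"
    risk="${2:-stable}"   # snap risk level; this thread only uses "stable"
    printf 'sudo snap install lxd --channel=%s/%s\n' "$track" "$risk"
}

lxd_install_cmd latest   # feature channel: follows each new feature release
lxd_install_cmd 4.0      # LTS channel: only 4.0.x point releases
lxd_install_cmd 4.17     # one feature release: frozen once 4.18 is out

# To see which channels the store currently offers, run:
#   snap info lxd
```

Printing the command instead of running it keeps the sketch safe to try on any machine.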

Because the LXD daemon is run as root it is important to keep it up to date, either via the LTS channel or via the feature channel. This is why we are keen not to have people pinned on unsupported releases.

Hi @tomp ,

Thank you again. If I understand you, according to your answer:

  • if we need to install 6 independent LXD systems in the course of a year (e.g. 1 every 2 months) we either go for 4.0/stable (which has stoppers) or all of them will have 4.X with a different X.
  • even if we went for 4.0/stable it would be automatically updated to 5.0, once released, putting production systems at risk

So either we have 4.0, which has stopper issues and will automatically upgrade to a newer LTS (big jump), or we have a collection of slightly different 4.X systems to maintain. Neither situation seems ideal.

From an IaC (Infrastructure as Code) perspective it is critical to have:

  1. repeatability: the ability to install identical systems over time
  2. configuration management: the ability to perform changes in a controlled manner (including planned cycles for security updates)
  3. traceability: the ability to audit those changes over time

This would be possible if a reasonable number of previous releases was kept in the archive but seems impossible otherwise.

We could say that the same happens for any Debian package, say, firefox, where a new system install updates to the latest (only if the user chooses to perform updates; otherwise the version on the ISO remains). But unlike the case of firefox, where each package update affects a single machine and those machines can be grouped in update batches (update group, get feedback, update another group, ...), updating lxd could affect hundreds of containers and VMs.

So the tradeoff between security risk and the server instability risk from a not well enough tested version, expressed in this sentence:

Because the LXD daemon is run as root it is important to keep it up to date, either via the LTS channel or via the feature channel. This is why we are keen not to have people pinned on unsupported releases.

is a bit hard for me to work out. OPS engineers know that they have to scan security notices and perform security updates in a controlled manner.

For example, if the LXD daemon is only accessible via SSH sessions, which are only possible from a private network, is the exposure worth the loss of repeatability, config management and traceability?

No that is not correct.

The 4.0/stable LTS channel will only track the 4.0.x LTS series of releases, it will not automatically switch to 5.0 LTS. This is similar to how apt will get security and bug fix updates in an LTS release of an OS.

The latest/stable channel will track the latest feature release.

The 4.x/stable feature release channels will be pinned at a specific version, but the channels themselves are only available for a short period of time (for the next 2 feature releases). We don’t really recommend using this, but it is there for those who want to install a feature release and handle their own updates but don’t want to use one of the recommended ways to pin to a specific snap version.

Have you considered using one of the other mechanisms snap provides for controlling rollout of versions?

Namely cohort pinning and the Snap Store Proxy, which are described in the aforementioned guide.

It sounds like one of those would achieve what you want by pinning to a snap revision and controlling when that gets deployed (which is subtly different from the original question about targeting a specific LXD version).

And of course you can also take the upstream release tarball and package LXD as you need, which will then put you in complete control of when it is applied (this is what some distributions do).
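For the cohort option, a minimal sketch could look like the following. It assumes that snap create-cohort prints a small YAML document containing a cohort-key: line (that layout is my assumption, so check it on your snapd version), and cohort_key_from_yaml is a hypothetical helper. The snap commands are left as comments so the block is safe to run anywhere; they may need sudo.

```shell
#!/bin/sh
# Sketch only. cohort_key_from_yaml is a hypothetical helper, and the YAML
# layout it parses is an assumption about what `snap create-cohort` prints.
cohort_key_from_yaml() {
    # Print the value of the first "cohort-key:" line read from stdin.
    sed -n 's/^[[:space:]]*cohort-key:[[:space:]]*//p' | head -n 1
}

# On the reference machine, create a cohort and keep its key:
#   snap create-cohort lxd | cohort_key_from_yaml > lxd.cohort
# On that machine and on each later install, join the same cohort:
#   sudo snap install lxd --channel=latest/stable --cohort="$(cat lxd.cohort)"
```

The idea is that machines installed with the same key see the same revision as each other; how long a key stays valid is something to verify in the snap documentation.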

Thanks for the clarifications regarding 4.0/stable not being updated to 5.0. I had misunderstood that.

From what I have seen, cohorts limit the number of updates, and the snap store proxy seems like another service to maintain... all this to work around the fact that snap performs automatic updates whether people want them or not :) Seems a bit contra-natura.

From what I know, OPS teams want to do the updates when the risk/benefit of the circumstances tells them to. That is what is done with apt, and the criteria depend on the specific projects and their tradeoffs.

I understand part of the problem comes from how snap works and is not at all lxd specific. All that lxd could do would be leaving more channels available.

We can as well build and package the upstream LXD and rebuild every time we need to do security updates. But that is a negative incentive to updating and a loss of efficiency. All we wanted was to be able to control the versions and the moments of updating in a responsible manner :)

From the ops point of view, the fact that you’re never forced to the next channel works fine. If you snap install lxd --channel=4.17 today, you can stay that way for years.

You just won’t be able to install NEW systems on 4.17 after 4.19 is released, but that doesn’t affect existing users that want to schedule their upgrades.
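To make that window concrete, here is a tiny sketch of the rule as described in this thread: only the latest feature release and the one before it have a channel open for new installs. channel_open_for_new_installs is a hypothetical helper, not something snapd exposes, and it only compares minor numbers within the 4.x series.

```shell
#!/bin/sh
# Sketch only: channel_open_for_new_installs is a hypothetical helper that
# encodes the rule described in this thread; snapd has no such command.
# Arguments are minor numbers of 4.x feature releases, e.g. 17 for 4.17.
channel_open_for_new_installs() {
    wanted="$1"
    latest="$2"
    # Only the latest feature release and the one before it keep a channel.
    [ "$wanted" -le "$latest" ] && [ "$wanted" -ge $((latest - 1)) ]
}

for latest in 17 18 19; do
    if channel_open_for_new_installs 17 "$latest"; then
        echo "latest=4.$latest: 4.17/stable can still be installed"
    else
        echo "latest=4.$latest: 4.17/stable is closed to new installs"
    fi
done
```

With 4.17 as the tested version, new installs work while 4.17 or 4.18 is the latest release and stop once 4.19 is out. Systems already installed on 4.17/stable are not affected.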

Hi @stgraber

I wish that was the case, but it is not what is written in the document. Please look at this image:

According to this document I would have to determine the version before the latest, which in this case would be 4.16.

So to be clear, if you install --channel=4.17/stable you will NEVER be moved to 4.18.
What you will be getting is typically 3-4 tiny bugfix updates which we do to fix regressions and important bugs following a LXD release.

Once 4.18 releases, you won’t ever get another update as we only fix bugs on the current release, not on prior ones.

Thanks for this clarification @stgraber .

One last question: what is the impact of a background update (either bugfix update or release update) of lxd on a production system being used?

Typically it’s a 30s or so downtime of the API. All instances will keep running so there’s no impact on the actual workloads.

Existing API calls like lxc exec will have up to 5min to complete or will be forcefully disconnected (5min is default timeout, it can be configured).
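Since a refresh means a short API outage, one option is to confine automatic refreshes to a maintenance window using snapd's refresh.timer system option. The sketch below only prints the command; refresh_timer_cmd is a hypothetical helper and the day and hours are made-up example values. Note that refresh.timer applies to every snap on the machine, not just lxd.

```shell
#!/bin/sh
# Sketch only: refresh_timer_cmd is a hypothetical helper. refresh.timer is a
# snapd system option; the day and hours used here are example values.
refresh_timer_cmd() {
    day="$1"      # e.g. "sat"
    window="$2"   # e.g. "03:00-05:00"
    printf 'sudo snap set system refresh.timer=%s,%s\n' "$day" "$window"
}

refresh_timer_cmd sat 03:00-05:00
# prints: sudo snap set system refresh.timer=sat,03:00-05:00
```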

Thank you Stéphane. One other thing popped up. When we target, for example, 4.17/stable and get automatically

3-4 tiny bugfix updates which we do to fix regressions and important bugs following a LXD release

is there a number that changes? (snap revision?)

The snap revision will go up indeed.
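For traceability, the revision can be recorded alongside the version and tracked channel. The sketch below parses snap list lxd output, assuming the usual column order (Name, Version, Rev, Tracking, ...). lxd_snap_state is a hypothetical helper and the revision in the sample text is invented.

```shell
#!/bin/sh
# Sketch only: lxd_snap_state is a hypothetical helper. It assumes the
# `snap list` column order Name, Version, Rev, Tracking, Publisher, Notes.
lxd_snap_state() {
    # Reads `snap list lxd` output on stdin; prints "<version> <rev> <tracking>".
    awk '$1 == "lxd" { print $2, $3, $4 }'
}

# Sample input with an invented revision number, so the sketch runs anywhere.
lxd_snap_state <<'EOF'
Name  Version  Rev    Tracking     Publisher  Notes
lxd   4.17     12345  4.17/stable  canonical  -
EOF
# prints: 4.17 12345 4.17/stable

# On a real host:
#   snap list lxd | lxd_snap_state
```

Logging that line after each refresh gives an audit trail of which revision each server was running and when.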