LXD 3.10 has been released

Introduction

The LXD team is very excited to announce the release of LXD 3.10!

This release introduces snapshot expiry which, combined with the automated snapshots introduced in LXD 3.8, makes for a nice way to have LXD generate and clean up snapshots in the background.

We also did some work on container import/export, now allowing the storage pool to be overridden during import.

This release also fixes a wide variety of bugs and brings a number of nice performance improvements around compression/decompression, as well as improved progress reporting, thanks to the ChromeOS team at Google.

Enjoy!

New features

Snapshot expiry

A new snapshots.expiry container configuration option now lets you define an expiry for newly created snapshots. Alternatively, a snapshot can now be edited directly to set the newly introduced Expiry field.

When a snapshot expires, it is automatically deleted. This feature is particularly useful when combined with automated snapshots.

Pool override on import

It is now possible to select what storage pool a container backup should be imported into. On the command line, this can be specified with --storage.

Bugs fixed

  • client: Properly reset listener on error
  • client: Strip trailing slashes in URLs
  • doc: Document btrfs resize
  • doc: Fixed typo in backup.md
  • global: Rename {Creation,LastUsed}Date to {Created,LastUsed}At
  • i18n: Fix duplicate language
  • i18n: Update translations from weblate
  • i18n: Update translation templates
  • lxc/image: Fix help
  • lxd/apparmor: Tweak default set of rules
  • lxd/backups: Don’t waste memory during unpack
  • lxd/backups: Fix fd leak
  • lxd/backups: Handle missing storage pool for backups properly
  • lxd/backups: Send progress info for export and import operations
  • lxd/cluster: Don’t prompt for internal config keys
  • lxd/containers: Always delete container on create error
  • lxd/containers: Call storage unmount on detach
  • lxd/containers: Fix disk limits at creation
  • lxd/containers: Fix error handling for auto-snap
  • lxd/containers: Fix lxc.mount.entry for musl
  • lxd/containers: Refuse refresh on running containers
  • lxd/images: Calculate sha256 as image is written
  • lxd/images: Change compressFile to take io.Reader and io.Writer
  • lxd/images: Send metadata in CreateImage error importing image
  • lxd/images: Send metadata in CreateImage error response
  • lxd/images: Tar and compress in a combined stream when packing an image
  • lxd/internal: Add internal command to trigger GC
  • lxd/migration: Fix race in abort
  • lxd/migration: Fix sender side errors handling
  • lxd/migration: Handle crashing rsync
  • lxd/storage/ceph: Create custom mountpoints if missing
  • lxd/storage/ceph: Fix validation of CEPH config
  • lxd/storage/ceph: Unmap on unmount
  • lxd/storage/ceph: Unmap volume after creation
  • lxd/storage/lvm: Use right VG name for exports
  • lxd/tasks: Fix possible segfaults in tasks
  • shared: Add support for a ProgressTracker during unpack
  • shared: Progress metadata as a map
  • shared: Properly handle uncompressed tarballs
  • shared/osarch: Add armhfp (centos)
  • storage: Add ioprogress.ProgressTracker field to storage
  • tests: Add more container snapshot tests
  • tests: Delete leftover container
  • tests: Extend backup import tests
  • tests: Fix bad test in clustering
  • tests: Fix bad test in container local pool handling
  • tests: Fix bad test in external_auth
  • tests: Fix bad test in security
  • tests: Fix bad test in sql
  • tests: Fix bad test in storage
  • tests: Fix container leak
  • tests: Fix negative tests in backup.sh
  • tests: Fix negative tests in basic.sh
  • tests: Fix negative tests in clustering.sh
  • tests: Fix negative tests in config.sh
  • tests: Fix negative tests in container_local_cross_pool_handling.sh
  • tests: Fix negative tests in database_update.sh
  • tests: Fix negative tests in devlxd.sh
  • tests: Fix negative tests in external_auth.sh
  • tests: Fix negative tests in idmap.sh
  • tests: Fix negative tests in incremental_copy.sh
  • tests: Fix negative tests in lxc-to-lxd.sh
  • tests: Fix negative tests in migration.sh
  • tests: Fix negative tests in pki.sh
  • tests: Fix negative tests in projects.sh
  • tests: Fix negative tests in remote.sh
  • tests: Fix negative tests in security.sh
  • tests: Fix negative tests in serverconfig.sh
  • tests: Fix negative tests in snapshots.sh
  • tests: Fix negative tests in sql.sh
  • tests: Fix negative tests in storage_driver_ceph.sh
  • tests: Fix negative tests in storage_local_volume_handling.sh
  • tests: Fix negative tests in storage_profiles.sh
  • tests: Fix negative tests in storage.sh
  • tests: Fix negative tests in storage_snapshots.sh
  • tests: Fix negative tests in storage_volume_attach.sh
  • tests: Fix negative tests in template.sh
  • tests: Fix volume list in cluster
  • tests: Fix volume list in projects
  • tests: Tweak fdleak test

Try it for yourself

This new LXD release is already available for you to try on our demo service.

Downloads

The release tarballs can be found on our download page.

Congratulations guys, great work!

Great! I’ve been looking forward to the completion of the automatic snapshotting and expiry feature. I must concede, however, that I don’t understand the documentation of the expiry feature given here: https://lxd.readthedocs.io/en/latest/api-extensions/#snapshot95expiry

In particular, the documentation states: "takes an expression in the form of 1M 2H 3d 4w 5m 6y (1 minute, 2 hours, 3 days, 4 weeks, 5 months, 6 weeks)" [the last one should probably read years, not weeks], but it doesn’t really say what the values mean.

At first glance I expected it to mean keep all snapshots for the last 1 minute, then cull them to keep only 2 per hour (or 2 hourly snapshots?), then, as they grow older, keep 3 per day, etc. So in essence I expected this feature to work similarly to backintime’s smart remove. But that’s not really what the quoted text above says and I’m not sure that this approach is compatible with setting an expiry time right when the snapshot is taken.

A further example would really be appreciated. Let’s say I configure LXD to do some crazy snapshotting, like one snapshot every minute. What exactly would it mean to set the expiry variable to the given example string “1M 2H 3d 4w 5m 6y”? How many (and which) snapshots will continuously be available if I let such a configuration run indefinitely?

Anyway, a big thank you to the whole team! :slight_smile:

Hmm, never mind, I dug into the source code and found that the lifetime of a snapshot is set to the sum of the components of the expiry string, e.g. “5d 3w” simply means that the snapshots have a lifetime of 5 days + 3 weeks = 26 days from creation time. So I guess in most cases only a single time parameter (M, H, d, etc.) will be used rather than a combination.

If you put that string in verbatim, then the snapshots will expire in about 6.5 years.
You would normally put something like “2d” (expire when 2 days old) or “12H 1w” (expire when 1 week and 12 hours old).

It’s not clear what name a snapshot gets when it is created automatically. And ain’t nobody got time to read the source. 8-]
Let’s do it live.

$ snap info lxd
name:      lxd
summary:   System container manager and API
publisher: Canonical✓
contact:   https://github.com/lxc/lxd/issues
license:   unset
description: |
  LXD is a system container manager.
  
  With LXD you can run hundreds of containers of a variety of Linux
  distributions, apply resource limits, pass in directories, USB devices
  or GPUs and setup any network and storage you want.
  
  Pre-made images are available for Ubuntu, Alpine Linux, ArchLinux,
  CentOS, Debian, Fedora, Gentoo, OpenSUSE and more.
  
  LXD is network aware and all interactions go through a simple REST API,
  making it possible to remotely interact with containers on remote
  systems, copying and moving them as you wish.
  
  Want to go big? LXD also has built-in clustering support,
  letting you turn dozens of servers into one big LXD server.
  
  LXD containers are lightweight, secure by default and a great
  alternative to running Linux virtual machines.
  
  Supported options for the LXD snap (snap set lxd [<key>=<value>...]):
   - criu.enable: Enable experimental live-migration support [default=false]
   - daemon.debug: Increases logging to debug level [default=false]
   - daemon.group: Group of users that can interact with LXD [default=lxd]
   - ceph.builtin: Use snap-specific ceph configuration [default=false]
   - openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
  
  LXD documentation can be found at: https://lxd.readthedocs.io
commands:
  - lxd.benchmark
  - lxd.buginfo
  - lxd.check-kernel
  - lxd.lxc
  - lxd
  - lxd.migrate
services:
  lxd.activate: oneshot, enabled, inactive
  lxd.daemon:   simple, enabled, active
snap-id:      J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking:     stable
refresh-date: 22 days ago, at 14:06 UTC
channels:
  stable:        3.9         2019-01-18  (9919) 54MB -
  candidate:     3.10        2019-02-08 (10059) 54MB -
  beta:          ↑                                   
  edge:          git-fe0844d 2019-02-08 (10071) 54MB -
  3.0/stable:    3.0.3       2018-11-26  (9663) 53MB -
  3.0/candidate: 3.0.3       2019-01-19  (9942) 53MB -
  3.0/beta:      ↑                                   
  3.0/edge:      git-18c9b88 2019-01-19  (9940) 53MB -
  2.0/stable:    2.0.11      2018-07-30  (8023) 28MB -
  2.0/candidate: 2.0.11      2018-07-27  (8023) 28MB -
  2.0/beta:      ↑                                   
  2.0/edge:      git-c7c4cc8 2018-10-19  (9257) 26MB -
installed:       3.9                     (9919) 54MB -

We are running on the stable channel, which has LXD 3.9, while LXD 3.10 is on the candidate channel.
Let’s switch to the candidate channel and refresh in order to get LXD 3.10 installed. Also, let’s make a mental note to switch back to the stable channel by Monday, when LXD 3.10 makes it into stable.

$ snap switch lxd  --channel=candidate
"lxd" switched to the "candidate" channel
$ snap refresh
Download snap "lxd" (10059) from channel "candidate"                           |
Stop snap "lxd" services                                                       \
Copy snap "lxd" data                                                           /
Setup snap "lxd" (10059) security profiles                                     /
Start snap "lxd" (10059) services                                              \
lxd (candidate) 3.10 from Canonical✓ refreshed
$ 

Sanity check now:

$ lxd --version
3.10
$ lxc --version
3.10

Nice, we are good to go.

First, let’s test getting LXD to create snapshots automatically.
Here is the documentation:

snapshot_scheduling

This adds support for snapshot scheduling. It introduces three new configuration keys: snapshots.schedule, snapshots.schedule.stopped, and snapshots.pattern. Snapshots can be created automatically up to every minute.

It says that the timer resolution is one minute. That is, we can schedule snapshots at intervals of one minute or longer, but not more often than once per minute.

$ lxc launch ubuntu:18.04 mycontainer
$ lxc config set mycontainer snapshots.schedule "1 0 0 0 0 0"
Error: Invalid config: Schedule must be of the form: <minute> <hour> <day-of-month> <month> <day-of-week>
$ lxc config set mycontainer snapshots.schedule "* * * * *"

The above schedule means that a snapshot should be created every minute.
See crontab.guru on how to specify different crontab-style schedules.

Meanwhile, a few minutes have passed.

$ lxc info mycontainer
...
Snapshots:
  snap0 (taken at 2019/02/08 22:32 UTC) (stateless)
  snap1 (taken at 2019/02/08 22:33 UTC) (stateless)
  snap2 (taken at 2019/02/08 22:34 UTC) (stateless)
  snap3 (taken at 2019/02/08 22:35 UTC) (stateless)

So, we got per-minute creation of snapshots. The default snapshots.pattern is apparently snap%d.

Let’s now try to set a name pattern for the snapshots. If you do not put a %d in the pattern, then the first snapshot is named exactly after the pattern, the next gets your pattern string plus -0, and so on.

$ lxc config set mycontainer snapshots.pattern "mysnapshot-%d"
$ lxc info mycontainer
...
Snapshots:
  snap0 (taken at 2019/02/08 22:32 UTC) (stateless)
  snap1 (taken at 2019/02/08 22:33 UTC) (stateless)
  snap2 (taken at 2019/02/08 22:34 UTC) (stateless)
  snap3 (taken at 2019/02/08 22:35 UTC) (stateless)
  snap4 (taken at 2019/02/08 22:36 UTC) (stateless)
  snap5 (taken at 2019/02/08 22:37 UTC) (stateless)
  snap6 (taken at 2019/02/08 22:38 UTC) (stateless)
  snap7 (taken at 2019/02/08 22:39 UTC) (stateless)
  snap8 (taken at 2019/02/08 22:40 UTC) (stateless)
  snap9 (taken at 2019/02/08 22:41 UTC) (stateless)
  snap10 (taken at 2019/02/08 22:42 UTC) (stateless)
  snap11 (taken at 2019/02/08 22:43 UTC) (stateless)
  mysnapshot-0 (taken at 2019/02/08 22:44 UTC) (stateless)
  mysnapshot-1 (taken at 2019/02/08 22:45 UTC) (stateless)
$ 

A snapshot does not get overwritten; if the name already exists, LXD moves on to the next number.
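
The naming behaviour observed above can be sketched in Go. This is only a model of what we just saw, not LXD’s implementation; the function name is made up:

```go
package main

import (
	"fmt"
	"strings"
)

// nextSnapshotName sketches the observed naming rules: a %d in
// snapshots.pattern is filled with the lowest unused counter, while a
// pattern without %d is tried as-is first, then with -0, -1, ...
// appended. Illustrative only, not LXD's actual code.
func nextSnapshotName(pattern string, existing map[string]bool) string {
	if strings.Contains(pattern, "%d") {
		for i := 0; ; i++ {
			if name := fmt.Sprintf(pattern, i); !existing[name] {
				return name
			}
		}
	}
	if !existing[pattern] {
		return pattern
	}
	for i := 0; ; i++ {
		if name := fmt.Sprintf("%s-%d", pattern, i); !existing[name] {
			return name
		}
	}
}

func main() {
	existing := map[string]bool{"snap0": true, "snap1": true}
	fmt.Println(nextSnapshotName("snap%d", existing))     // snap2
	fmt.Println(nextSnapshotName("mysnapshot", existing)) // mysnapshot
	existing["mysnapshot"] = true
	fmt.Println(nextSnapshotName("mysnapshot", existing)) // mysnapshot-0
}
```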

Let’s do the expiry now. When a snapshot becomes 3 minutes old, it should expire.

$ lxc config set mycontainer snapshots.expiry "3M"
$ 

Let’s wait for a new snapshot to get created. Then,

$ lxc config show mycontainer/mysnapshot-11
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20190131)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20190131"
  image.version: "18.04"
  snapshots.expiry: 3M
  snapshots.pattern: mysnapshot-%d
  snapshots.schedule: '* * * * *'
  volatile.base_image: b7c4dbea897f09f29474c8597c511b57c3b9c0d6f98dc42f257c64e76fea8c92
  volatile.eth0.hwaddr: 00:16:3e:08:69:5c
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- default
expires_at: 0001-01-01T00:00:00Z

So, something is wrong: expires_at is set to the default NULL-style value (never expire).
I peeked into the source code but did not manage to figure out what’s going on.
Here’s the relevant source (lxd/container_snapshot.go):

        expiry, err := shared.GetSnapshotExpiry(time.Now(), c.LocalConfig()["snapshots.expiry"])
        if err != nil {
                return BadRequest(err)
        }

        snapshot := func(op *operation) error {
                args := db.ContainerArgs{
                        Project:      c.Project(),
                        Architecture: c.Architecture(),
                        Config:       c.LocalConfig(),
                        Ctype:        db.CTypeSnapshot,
                        Devices:      c.LocalDevices(),
                        Ephemeral:    c.IsEphemeral(),
                        Name:         fullName,
                        Profiles:     c.Profiles(),
                        Stateful:     req.Stateful,
                        ExpiryDate:   expiry,
                }

expiry gets the value of the snapshots.expiry config (like “5M” for five minutes).
But later, in ExpiryDate: expiry, it is placed verbatim into ExpiryDate.
expiry is not a datetime, it’s a duration.
I think ExpiryDate should be something of the sort addDurationToDateTime(getCurrentDateTime(), expiry).

@stgraber: can you have a look above at what I am doing wrong, since the expiry does not work?

Wow, Simos, thanks for your great answer (and investigation)!

@monstermunchkin can you look at what @simos found?

Thanks for this. I’ve sent a PR which fixes this bug: https://github.com/lxc/lxd/pull/5481

The problem was that I forgot to add the expiry code for scheduled snapshots.

Will merge that one, refresh the snap with cherry-picks and then release 3.10 to stable later today.

Released to stable for snap users.

For those that have been using the candidate 3.10 of LXD, you can switch back to the stable channel.
Let’s see.

$ snap info lxd
name:      lxd
summary:   System container manager and API
publisher: Canonical✓
contact:   https://github.com/lxc/lxd/issues
license:   unset
description: |
  **LXD is a system container manager**
  
  With LXD you can run hundreds of containers of a variety of Linux
  distributions, apply resource limits, pass in directories, USB devices
  or GPUs and setup any network and storage you want.
  
  LXD containers are lightweight, secure by default and a great
  alternative to running Linux virtual machines.
  
  
  **Run any Linux distribution you want**
  
  Pre-made images are available for Ubuntu, Alpine Linux, ArchLinux,
  CentOS, Debian, Fedora, Gentoo, OpenSUSE and more.
  
  A full list of available images can be [found
  here](https://images.linuxcontainers.org)
  
  Can't find the distribution you want? It's easy to make your own images
  too, either using our `distrobuilder` tool or by assembling your own image
  tarball by hand.
  
  
  **Containers at scale**
  
  LXD is network aware and all interactions go through a simple REST API,
  making it possible to remotely interact with containers on remote
  systems, copying and moving them as you wish.
  
  Want to go big? LXD also has built-in clustering support,
  letting you turn dozens of servers into one big LXD server.
  
  
  **Configuration options**
  
  Supported options for the LXD snap (`snap set lxd KEY=VALUE`):
   - criu.enable: Enable experimental live-migration support [default=false]
   - daemon.debug: Increases logging to debug level [default=false]
   - daemon.group: Group of users that can interact with LXD [default=lxd]
   - ceph.builtin: Use snap-specific ceph configuration [default=false]
   - openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
  
  [Documentation](https://lxd.readthedocs.io)
commands:
  - lxd.benchmark
  - lxd.buginfo
  - lxd.check-kernel
  - lxd.lxc
  - lxd
  - lxd.migrate
services:
  lxd.activate: oneshot, enabled, inactive
  lxd.daemon:   simple, enabled, active
snap-id:      J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking:     candidate
refresh-date: yesterday at 15:26 EET
channels:
  stable:        3.10        2019-02-11 (10102) 54MB -
  candidate:     3.10        2019-02-11 (10102) 54MB -
  beta:          ↑                                   
  edge:          git-b3c10e3 2019-02-13 (10128) 54MB -
  3.0/stable:    3.0.3       2018-11-26  (9663) 53MB -
  3.0/candidate: 3.0.3       2019-01-19  (9942) 53MB -
  3.0/beta:      ↑                                   
  3.0/edge:      git-c0f142d 2019-02-13 (10118) 53MB -
  2.0/stable:    2.0.11      2018-07-30  (8023) 28MB -
  2.0/candidate: 2.0.11      2018-07-27  (8023) 28MB -
  2.0/beta:      ↑                                   
  2.0/edge:      git-c7c4cc8 2018-10-19  (9257) 26MB -
installed:       3.10                   (10102) 54MB -

We are currently tracking the candidate channel.
Here is again what we get from that channel and the stable channel:

channels:
  stable:        3.10        2019-02-11 (10102) 54MB -
  candidate:     3.10        2019-02-11 (10102) 54MB -

It’s the same version and also the same build, 10102.
That means that switching from candidate to stable will be painless; it will not even have to restart LXD because it’s essentially the same thing. Let’s switch.

$ snap switch lxd --channel=stable
"lxd" switched to the "stable" channel

Did I forget to kill the mycontainer test container that I created a few days ago?
The one that would take a snapshot every minute and, at the same time, remove (due to expiration) any snapshots that are two minutes old?
Yes, I did forget about it.
Here is how it looks.

$ lxc info mycontainer
...
  mysnapshot-175 (taken at 2019/02/12 13:22 UTC) (stateless)
  mysnapshot-176 (taken at 2019/02/12 13:23 UTC) (stateless)
  mysnapshot-177 (taken at 2019/02/12 13:24 UTC) (stateless)
  mysnapshot-178 (taken at 2019/02/12 13:25 UTC) (stateless)
  mysnapshot-179 (taken at 2019/02/12 13:26 UTC) (stateless)
  mysnapshot-872 (taken at 2019/02/13 10:15 UTC) (stateless)
  mysnapshot-873 (taken at 2019/02/13 10:16 UTC) (stateless)

Then, a minute later.

$ lxc info mycontainer
...
  mysnapshot-175 (taken at 2019/02/12 13:22 UTC) (stateless)
  mysnapshot-176 (taken at 2019/02/12 13:23 UTC) (stateless)
  mysnapshot-177 (taken at 2019/02/12 13:24 UTC) (stateless)
  mysnapshot-178 (taken at 2019/02/12 13:25 UTC) (stateless)
  mysnapshot-179 (taken at 2019/02/12 13:26 UTC) (stateless)
  mysnapshot-873 (taken at 2019/02/13 10:16 UTC) (stateless)
  mysnapshot-874 (taken at 2019/02/13 10:17 UTC) (stateless)

So, what happened?

When a snapshot is created automatically, it gets an expiry datetime. The previous snap was not setting the expiry properly and would store the special fixed NULL-style value 0001-01-01T00:00:00Z, which is interpreted as “do not expire”. Those mysnapshot-1xx snapshots were created automatically by the old snap and got the fixed NULL-style expiry, so they will never expire.

However, once the updated snap package was released, newly created snapshots got a proper +2 minute expiration date and were deleted automatically. That’s why we see only two recent snapshots.

We also learn that if a snapshot is set to expire in two minutes and is exactly two minutes old, it is not deleted yet. The snapshot has to be strictly older than its expiry time, not merely equal to it (probably with a resolution of one minute).
