LXD 3.12 has been released

Introduction

The LXD team is very excited to announce the release of LXD 3.12!

This is one of the more feature packed releases and if you are a cluster user, there should be a lot to be happy about!

We have taken a look through all LXD commands and how they work against clusters, improved our APIs where they were lacking and tweaked the commands to give cluster operators a better experience.

But cluster improvements are far from the only thing improved with this LXD releases.

We’ve also finally got shiftfs support! This feature we’ve been planning for well over a year is finally there when combined with a suitable kernel. With this, LXD containers don’t need any slow shifting on initial startup, reducing the filesystem delta and making container creation so much faster!

Lastly, resource reporting was significantly improved, both in the API and the CLI. We now have more details about the CPU topology, especially NUMA for multi-socket systems and are also now exposing GPU configuration.

Enjoy!

New features

Cluster: Aggregated DHCP leases

LXD managed networks that span multiple cluster members now show a unified view of their DHCP leases, showing hostname, MAC, address and the cluster member’s name for each lease.

root@edfu:~# lxc network list-leases lxdfan0
+----------+-------------------+--------------+---------+----------+
| HOSTNAME |    MAC ADDRESS    |  IP ADDRESS  |  TYPE   | LOCATION |
+----------+-------------------+--------------+---------+----------+
| a1       | 00:16:3e:2b:de:8c | 240.31.0.206 | DYNAMIC | edfu     |
+----------+-------------------+--------------+---------+----------+
| a2       | 00:16:3e:01:99:58 | 240.34.0.124 | DYNAMIC | djanet   |
+----------+-------------------+--------------+---------+----------+
| a3       | 00:16:3e:b4:8b:94 | 240.36.0.96  | DYNAMIC | nuturo   |
+----------+-------------------+--------------+---------+----------+
| a4       | 00:16:3e:52:13:2b | 240.31.0.212 | DYNAMIC | edfu     |
+----------+-------------------+--------------+---------+----------+
| a5       | 00:16:3e:45:54:80 | 240.34.0.68  | DYNAMIC | djanet   |
+----------+-------------------+--------------+---------+----------+
| a6       | 00:16:3e:d1:81:e3 | 240.36.0.90  | DYNAMIC | nuturo   |
+----------+-------------------+--------------+---------+----------+

Cluster: Events now show location

Event messages are now all marked with the name of the originating cluster member as their location.

location: edfu
metadata:
  class: task
  created_at: "2019-04-05T04:13:21.212580932Z"
  description: Creating container
  err: ""
  id: 0c8e4a7d-ef7b-41a0-b949-7030f9aa6827
  location: edfu
  may_cancel: false
  metadata: null
  resources:
    containers:
    - /1.0/containers/a10
  status: Running
  status_code: 103
  updated_at: "2019-04-05T04:13:21.212580932Z"
timestamp: "2019-04-05T04:13:21.223834434Z"
type: operation

Additionally LXD will now only forward log messages of importance WARN or higher to other members, keeping the INFO and DBUG messages local to reduce network chatter. This behavior can be changed by starting the LXD daemon in debug mode, at which point all log levels will be broadcasted again.

Cluster: Operations now show location

Another area that now benefits from clear tracking of cluster members is operations, as can be seen in lxc operation list:

root@edfu:~# lxc operation list
+--------------------------------------+-----------+-------------------+---------+------------+----------------------+----------+
|                  ID                  |   TYPE    |    DESCRIPTION    | STATUS  | CANCELABLE |       CREATED        | LOCATION |
+--------------------------------------+-----------+-------------------+---------+------------+----------------------+----------+
| 36c11142-52d8-4c1e-a342-63657096cdec | WEBSOCKET | Executing command | RUNNING | NO         | 2019/04/05 04:19 UTC | edfu     |
+--------------------------------------+-----------+-------------------+---------+------------+----------------------+----------+
| 701175cf-df82-4ef5-8078-a25d83b770b3 | WEBSOCKET | Executing command | RUNNING | NO         | 2019/04/05 04:19 UTC | djanet   |
+--------------------------------------+-----------+-------------------+---------+------------+----------------------+----------+

This now makes it clear what cluster member is busy doing what and should simplify making sure that a system isn’t actively used before performing maintenance on it.

Cluster: Support for --target in more commands

The following commands have now grown support for --target:

  • lxc config edit/get/show/set/unset
  • lxc info [–resources]
  • lxc network info
  • lxc storage info

This makes it possible to configure some member-specific daemon configuration options, query cluster member runtime information and system resources, get detailed network statistics and storage usage.

Shiftfs support

This is a feature we’ve been looking forward to for years and that we are really excited to finally see coming to completion. shiftfs allows for an unprivileged container experience that doesn’t need any shifting of the filesystem, instead having the kernel do it on the fly.

This requires kernel support through the shiftfs filesystem which is currently a custom patchset that will be present in the Ubuntu 19.04 kernel.

LXD automatically detects support for this and will transparently start using it whenever possible.

Kernel features now exported over API

For some time now, LXD has been detecting a number of optional kernel features on startup and would print an overview then. That same information is now exposed over the API and visible in lxc info.

  kernel_features:
    netnsid_getifaddrs: "true"
    shiftfs: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"

Improved CPU reporting

The server resources API now exposes CPU sockets and NUMA node information, making it easier to do CPU pinning for containers.

root@djanet:~# lxc info --resources --target edfu
CPUs:
  Socket 0:
    Vendor: GenuineIntel
    Name: Intel(R) Xeon(R) CPU           E5430  @ 2.66GHz
    Cores: 4
    Threads: 4
    Frequency: 1999Mhz (max: 2336Mhz)
    NUMA node: 0
  Socket 1:
    Vendor: GenuineIntel
    Name: Intel(R) Xeon(R) CPU           E5430  @ 2.66GHz
    Cores: 4
    Threads: 4
    Frequency: 1999Mhz (max: 2336Mhz)
    NUMA node: 1

Memory:
  Free: 18.37GB
  Used: 557.76MB
  Total: 18.93GB

GPU:
  Vendor: ASPEED Technology, Inc. (1a03)
  Product: ASPEED Graphics Family (2000)
  PCI address: 0000:06:03.0
  Driver: ast (4.15.0-47-generic)
  NUMA node: 0

The output of lxc info --resources has also been tweaked to adapt to the hardware present on the system.

GPU reporting

As you may have noticed in the previous listing, GPUs are now present in the system resources output. Additional information can also be seen for NVIDIA cards:

root@vm10:~# lxc info --resources
CPU:
  Vendor: GenuineIntel
  Name: Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz
  Cores: 2
  Threads: 4
  Frequency: 2400Mhz
  NUMA node: 0

Memory:
  Free: 8.14GB
  Used: 225.81MB
  Total: 8.36GB

GPUs:
  Card 0:
    Vendor: NVIDIA Corporation (10de)
    Product: GK208B [GeForce GT 730] (1287)
    PCI address: 0000:00:07.0
    Driver: nvidia (418.56)
    NUMA node: 0
    NVIDIA information:
      Architecture: 3.5
      Brand: GeForce
      Model: GeForce GT 730
      CUDA Version: 10.1
      NVRM Version: 418.56
      UUID: GPU-6ddadebd-dafe-2db9-f10f-125719770fd3
  Card 1:
    Vendor: NVIDIA Corporation (10de)
    Product: GK208B [GeForce GT 730] (1287)
    PCI address: 0000:00:09.0
    Driver: nvidia (418.56)
    NUMA node: 0
    NVIDIA information:
      Architecture: 3.5
      Brand: GeForce
      Model: GeForce GT 730
      CUDA Version: 10.1
      NVRM Version: 418.56
      UUID: GPU-253db1df-f725-a174-99d4-a8933288c39e

Snapshot expiry now visible in lxc info

On top of showing when a snapshot was taken, snapshots that have an expiry will now show their expiry in the listing too.

root@djanet:~# lxc info a1
Name: a1
Location: edfu
Remote: unix://
Architecture: x86_64
Created: 2019/04/05 04:07 UTC
Status: Stopped
Type: persistent
Profiles: default
Snapshots:
  snap0 (taken at 2019/04/05 04:20 UTC) (expires at 2019/04/05 05:20 UTC) (stateless)
  snap1 (taken at 2019/04/05 04:50 UTC) (expires at 2019/04/05 05:50 UTC) (stateless)
  snap2 (taken at 2019/04/05 04:55 UTC) (expires at 2019/04/05 05:55 UTC) (stateless)
  snap3 (taken at 2019/04/05 04:52 UTC) (stateless)
  snap4 (taken at 2019/04/05 05:00 UTC) (expires at 2019/04/05 06:00 UTC) (stateless)

Bugs fixed

  • client: Optimize copies on same nodes
  • client: Properly generate events URL
  • doc: Fix typo in api-extensions.md
  • doc: Inform about ZFS pool default compression
  • doc: Introduce volatile.idmap.current
  • doc: Fix typo in faq.md
  • doc: Tweak markdown format in storage.md
  • doc: Update documentation for snapshots.pattern
  • i18n: Update translations from weblate
  • i18n: Update translation templates
  • lxc: Use shared.IsSnapshot
  • lxc/action: skip containers with intended state
  • lxc/config: Use shared.IsSnapshot
  • lxc/launch: Show start progress
  • lxd: Don’t leak netlink fds
  • lxd: Drop initialShiftRootfs and always shift on start
  • lxd/backups: Attempt to delete storage on failure
  • lxd/backups: Cleanup on failure
  • lxd/backups: Re-order checks for backup.yaml
  • lxd/cluster: Export Snapshot function
  • lxd/cluster: Initialize candid on join
  • lxd/cluster: Limit log message forwarding
  • lxd/containers: Cleanup shifting
  • lxd/containers: Cleanup template application
  • lxd/containers: Export container location
  • lxd/containers: Fix crash on refresh of non-existing
  • lxd/containers: Fix owner/mode of container path
  • lxd/containers: Handle mid-remap containers
  • lxd/containers: Properly handle tar shifting
  • lxd/containers: Stop proxy before storage
  • lxd/containers: Use LXC hook version 1
  • lxd/devices: Cleanup GPU structs
  • lxd/devices: Track vendor/product names and driver
  • lxd/images: Don’t keep an in-memory simplestreams cache
  • lxd/internal: Expose raft-snapshot
  • lxd/internal: Have GC endpoint release memory
  • lxd/main_forkproxy: Fix epoll
  • lxd/migration: Shift CRIU files to current map
  • lxd/migration: Fix handling of missing profiles
  • lxd/networks: Bring mtu device up
  • lxd/patches: Fix names of pool volume LVs
  • lxd/resources: Fix bad CPU reporting
  • lxd/response: Simplify SmartError
  • lxd/storage: Make use of shared.IsSnapshot
  • lxd/storage: Remove setUnprivUserACL
  • lxd/storage: Rename ShiftIfNecessary to resetContainerDiskIdmap
  • lxd/storage: Rename shiftRootfs to initialShiftRootfs
  • lxd/storage: Add helper function to get volume snapshots
  • lxd/storage: Fix copying and moving volume snapshots
  • lxd/storage/btrfs: Fix volume copy with snapshots
  • lxd/storage/ceph: Always unmap after use
  • lxd/storage/ceph: Fix copying existing volume snap
  • lxd/storage/ceph: Fix volume copy with snapshots
  • lxd/storage/ceph: Only freeze if needed
  • lxd/storage/dir: Fix volume copy with snapshots
  • lxd/storage/lvm: Fix LV naming
  • lxd/storage/lvm: Fix volume copy with snapshots
  • lxd/storage/lvm: Pass nouuid for xfs backups
  • lxd/storage/zfs: Fix volume copy with snapshots
  • lxd/storage/zfs: Run rename in clean mntns
  • lxd/tasks: Avoid races on startup
  • lxd-p2c: Workaround for broken /proc/self/exe
  • shared: Switch ParseNumberFromFile to simple read
  • shared/api: Drop StoragePool from Resources struct
  • shared/api: Sort ServerEnvironment struct
  • shared/idmap: Use separate uid and gid entries
  • shared/osarch: Add Plamo x86 arch
  • shared/simplestreams: Align JSON struct for images.json
  • shared/simplestreams: Align JSON struct for index.json
  • shared/utils: Do not chown terminal master fd
  • tests: Add volume copy tests
  • tests: Allow up to 15s for container reboot
  • tests: Fix race condition in proxy test
  • tests: Make proxy tests work with shiftfs
  • tests: Make security tests work with shiftfs
  • tests: Remove dead code
  • tests: Update resources test

Try it for yourself

This new LXD release is already available for you to try on our demo service.

Downloads

The release tarballs can be found on our download page.

7 Likes

Snap is available in the candidate channel and is scheduled for promotion to stable on Monday.

2 Likes

What a massive update, congrats!

I don’t understand what shiftfs is, can you explain again what it does? What is “shifting on the filesystem”?

Unprivileged containers (default in LXD) run shifted that means that root in the container (uid 0) is something like uid 165536 on the host and so on for all the uid and gid of the container.

That traditionally needed to be reflected on the filesystem, so on initial startup and whenever the map changed (setting security.privileged or changing security.idmap.isolated), we’d need to go and change every single uid and gid on the filesystem the next time the container started.

On an empty container on a fast SSD, this takes about a second and a half, on a slow mechanical disk, this can take tens of seconds and running this on a container with a lot of files (backup servers or the like) can take hours.

shiftfs lets us avoid all that by acting as a shifting uid/gid overlay filesystem. The data on disk remains owned unshifted (uid 0 in the container is uid 0 on disk) and we instead let the kernel do the translation every time a file is created or its owner is modified.

This completely eliminates the need for the initial shift on startup, saving about a second and a half on SSD systems and tens of seconds on slower storage, it also avoids the costly shifting on some configuration changes and also accelerates publishing a container as an image for the same reason.

Later it will also enable us to have shared disks between containers that are using security.idmap.isolated, something which simply isn’t possible at all without shiftfs translation between multiple maps for us on the fly.

6 Likes

Awesome, thanks for explaining.

Good job! I didn’t see this in the release notes, but I still got one node on candidate which I thought I had changed but I guess not.

So, lxc commands actually give output without spamming logs and locking up completely with version mismatch.

On my release nodes they just show an error state when doing lxc list for example and tab completion of container names is kinda busted but eventually responds as well. Hats off to some great progress. :grinning:

In the resources api did you intentionally swap the cpu sockets vendor and name results ?

Clearly it was wrong before but now it presents a challenge because before using the vendor looked like;

Screenshot%20from%202019-04-06%2011-53-56

And now it looks like;

Screenshot%20from%202019-04-06%2011-54-34

But because it appers no api end point returns the current version of lxd I cant use a different key depending on the version? Could a lxd version string be returned from the /1.0/ end point ?

Indeed this got fixed in this release. If you need to detect a fixed LXD, look for the resources_cpu_socket api extension as the fix landed at the same time as that.

1 Like

3.12 has been released to stable

A reminder that a snap package is automatically updated withing 24 hours from the availability of an update. The timing is somewhat random in order to spread the load of the Snap Store from those updating.
If however you want to get the snap update now, you can run snap refresh lxd.
By doing so, you get lxd updated now (if it did not happen to be updated automatically already).

Here is such an update of LXD. I pressed Enter a few times in order to capture some of the text messages that show the process of the update. Normally, when you perform the update, you will end up seeing at the end only the message lxd 3.12 from Canonical✓ refreshed.

$ snap refresh lxd
Download snap "lxd" (10508) from channel "stable"                                                                                                                                                 3% 7.63MB/s 2.52s
Stop snap "lxd" services                                                                                                                                                                                          
Setup snap "lxd" (10508) security profiles                                                                                                                                                                        
Consider re-refresh of "lxd"                                                                                                                                                                                      
Start snap "lxd" (10508) services                                                                                                                                                                                 
Consider re-refresh of "lxd"                                                                                                                                                                                      
lxd 3.12 from Canonical✓ refreshed
2 Likes