Weekly status #195


Weekly status for the week of the 19th of April to the 25th of April.

Introduction

The past week has seen mostly bug fixes and stability improvements for LXD; however, LXC has gained initial support for idmapped mounts, and distrobuilder now supports OpenWrt on ARM.

The LXD team is hiring

Canonical Ltd. is expanding its investment into LXD with a total of 5 additional roles.
The primary focus of this effort is around scalability and clustering as well as developing compelling solutions using LXD for our customers.

All LXD positions are 100% remote with some travel for internal events and conferences.

LXD

This past week has seen fixes for a couple of regressions in 4.13:

  • When using ipv{n}.address=none with bridge networks, the ipv{n}.firewall setting was no longer automatically disabled as it had been in 4.12. The original behaviour has been restored: the firewall is implicitly disabled when the bridge has no IP address (with the exception of bridge.mode=fan, where that setting itself automatically generates ipv4.address, and thus implicitly enables ipv4.firewall if not specified).
  • Custom volume scheduled snapshots had been accidentally disabled; they have now been re-enabled.
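
A minimal sketch of the restored bridge firewall behaviour (the network names are illustrative):

```
# Create a bridge with no IP addresses; LXD now implicitly disables
# the per-protocol firewall again, matching the 4.12 behaviour.
lxc network create lxdbr-test ipv4.address=none ipv6.address=none

# The fan mode exception: ipv4.address is generated automatically,
# so ipv4.firewall is implicitly enabled unless specified otherwise.
lxc network create lxdbr-fan bridge.mode=fan
```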

Whilst fixing the scheduled custom volume snapshot feature, we also found and fixed some other semi-related issues:

  • We now ensure that snapshots of local custom volumes are always attempted on the local cluster member. Previously this wasn’t the case, and the wrong cluster member could be used to attempt the snapshot, which would either fail or only snapshot that member’s local volume.
  • The automatically generated snapshot names (snap0, snap1, etc.) now take custom volume locality into consideration, such that if two different local volumes of the same name exist on different cluster members, the snapshot names are incremented correctly for each volume rather than also considering the snapshot names of the non-local volumes.
  • It was noticed that, when deciding which cluster member should initiate a volume snapshot for remote filesystems (ceph/cephfs), the method used to get a list of online cluster members was actually triggering a cluster-wide heartbeat. This was unnecessary, as the current member list, along with each member’s last-heartbeat time, is stored in each member’s local database, so there was no need to go to the expense of triggering a cluster-wide heartbeat. We also found a few other places where this was happening unnecessarily (such as lxc cluster list), and these have also been optimized to use the local member state record.
  • The lxc storage volume set and lxc storage volume snapshot commands now accept the --target flag so you can specify a particular cluster member’s local volume.
  • If deletion of a remote storage pool (ceph/cephfs) was attempted while not all of the cluster members were online, it was possible to end up in a situation where the storage pool itself was removed from the storage system without all of the per-member local cleanup occurring. This would leave orphaned database records and directories in LXD, making it impossible to remove the pool using lxc storage delete. We now check that all cluster members are online before allowing a storage pool to be removed.
  • An issue with specifying multiple snapshot schedules, caused by confusion between comma-delimited schedules and comma-delimited cron fields, has been fixed. We now differentiate between "," and ", " (comma then space): the former is only used within a cron expression, while the latter separates multiple cron schedules.
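
A brief illustration of the new --target flag and the schedule separator (pool, volume, and member names are hypothetical):

```
# Snapshot a specific cluster member's local copy of a custom volume:
lxc storage volume snapshot default myvol --target node2

# Set a config key on that member's local volume:
lxc storage volume set default myvol snapshots.expiry 7d --target node2

# Multiple cron schedules are separated by ", " (comma then space);
# commas without a following space remain part of a single cron expression:
lxc storage volume set default myvol snapshots.schedule "0 0 * * *, 0 12 * * *"
```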

A user experience improvement has been added to the routed NIC type to prevent accidentally specifying auto for the ipv{n}.gateway setting on more than one NIC concurrently. Previously this resulted in a liblxc error; now we validate the configuration before starting the container and give a clear error message.
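
As a sketch of the validated configuration (instance, device, and address values are illustrative), only one routed NIC per instance may use the automatic gateway mode:

```
# First routed NIC using the automatic gateway:
lxc config device add c1 eth0 nic nictype=routed parent=eth0 \
    ipv4.address=192.0.2.10 ipv4.gateway=auto

# Adding a second NIC with ipv4.gateway=auto now fails validation
# with a clear error before the container is started.
```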

For virtual machines, we have switched to using the -spice flag rather than the associated config file section for enabling remote screen output, due to changes in the upstream QEMU project’s requirements. We have also moved to using the query-cpus-fast QMP command for the same reason.
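
For illustration, the new invocation style looks roughly like this (the socket path is hypothetical):

```
# QEMU command-line flag replacing the config file section:
-spice unix=on,addr=/tmp/spice.sock,disable-ticketing=on

# QMP command now used to enumerate vCPUs:
{ "execute": "query-cpus-fast" }
```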

The groundwork for forthcoming cluster certificate changes has been laid: we now always generate a fresh certificate for cluster API usage when bootstrapping a new cluster, rather than reusing the first member’s server certificate. This matters because in the future each cluster member’s server certificate will be used to authenticate intra-cluster communication, so it’s important that each member’s certificate is only used for that member and not for the cluster-wide API endpoints.

LXC

Initial support for idmapped mounts has been added using the lxc.rootfs.options = idmap=/path/to/user/ns/to/idmap/to configuration option.
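
A minimal sketch of a container configuration using the new option (the rootfs and user namespace paths are examples only):

```
lxc.rootfs.path = dir:/var/lib/lxc/c1/rootfs
lxc.rootfs.options = idmap=/proc/1234/ns/user
```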

Distrobuilder

We now support OpenWrt on ARM architectures.

A fix for systemd capabilities inside containers has been added, to avoid issues with distributions whose systemd units enable features that do not work properly inside containers.

This creates a system wide unit that specifies:

ProtectProc=default
ProtectControlGroups=no
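
As a sketch, such a system-wide drop-in could look like the following (the file path is an assumption, not necessarily what distrobuilder uses):

```
# /etc/systemd/system/service.d/10-lxc.conf
[Service]
ProtectProc=default
ProtectControlGroups=no
```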

Dqlite (RAFT library)

An issue that was causing intermittent latency spikes in response times has been fixed by continuing to allow requests during snapshots.

Dqlite (database)

Several issues with Unix sockets on macOS have been fixed.

Dqlite (Go bindings)

We now build with TLS support on macOS.

YouTube channel

We’ve started a YouTube channel with live streams covering LXD releases and its use in the wider ecosystem.

You may want to give it a watch and/or subscribe for more content in the coming weeks.

Contribute to LXD

Ever wanted to contribute to LXD but not sure where to start?
We’ve recently gone through some effort to properly tag issues suitable for new contributors on GitHub: Easy issues for new contributors

Upcoming events

  • Nothing to report this week

Ongoing projects

The list below is feature or refactoring work which will span several weeks/months and can’t be tied directly to a single GitHub issue or pull request.

  • Distrobuilder Windows support
  • Virtual networks in LXD
  • Various kernel work
  • Stable release work for LXC, LXCFS and LXD

Upstream changes

The items listed below are highlights of the work which happened upstream over the past week and which will be included in the next release.

LXD

LXC

LXCFS

  • Nothing to report this week

Distrobuilder

Dqlite (RAFT library)

Dqlite (database)

Dqlite (Go bindings)

Distribution work

This section is used to track the work done in downstream Linux distributions to ship the latest LXC, LXD and LXCFS as well as work to get various software to work properly inside containers.

Ubuntu

  • Nothing to report this week

Snap