Weekly status for the week of the 19th of April to the 25th of April.
Introduction
The past week has seen mostly bug fixes and stability improvements for LXD. LXC, however, has gained initial support for idmapped mounts, and distrobuilder now supports OpenWrt on ARM.
The LXD team is hiring
Canonical Ltd. is expanding its investment into LXD with a total of 5 additional roles.
The primary focus of this effort is around scalability and clustering as well as developing compelling solutions using LXD for our customers.
All LXD positions are 100% remote with some travel for internal events and conferences.
LXD
This past week has seen fixes for a couple of regressions in 4.13:
- When using ipv{n}.address=none with bridge networks, the ipv{n}.firewall setting was not automatically disabled, as it was in 4.12. This has now been restored to its original behaviour of implicitly disabling the firewall when the bridge has no IP address (with the exception of bridge.mode=fan, where that setting itself automatically generates ipv4.address, and thus implicitly enables ipv4.firewall if not specified).
- Custom volume scheduled snapshots had been accidentally disabled. This has now been re-enabled.
Whilst fixing the scheduled custom volume snapshot feature, we also found and fixed some other semi-related issues:
- We now ensure that snapshots of local custom volumes are always attempted on the local cluster member. This wasn’t the case before, and could result in the wrong cluster member being used to attempt the snapshot, which would either fail or only snapshot its own local volume.
- The automatically generated snapshot names (snap0, snap1, etc.) now take custom volume locality into consideration, such that if two different local volumes of the same name exist on different cluster members, the snapshot names are incremented correctly for each volume rather than considering the snapshot names of the non-local volumes.
- It was noticed that when deciding which cluster member to initiate the volume snapshot on for remote filesystems (ceph/cephfs), the method being used to get a list of online cluster members was actually triggering a cluster heartbeat. This was unnecessary, as the current member list along with the last-heartbeat time is stored on each member in its local database, so there was no need to go to the expense of triggering a cluster-wide heartbeat. We also found a few other places where this was happening unnecessarily (such as lxc cluster list) and these have also been optimized to use the local member state record.
- The lxc storage volume set and lxc storage volume snapshot commands now accept the --target flag so you can specify a particular cluster member’s local volume.
- If an attempt was made to delete a remote storage pool (ceph/cephfs) while not all of the cluster members were online, it was possible to end up in a situation where the storage pool itself was removed from the storage system without all of the per-member local cleanup occurring. This would then leave orphaned database records and directories in LXD that were impossible to remove using lxc storage delete. We now check that all cluster members are online before allowing a storage pool to be removed.
- An issue with specifying multiple snapshot schedules has been fixed. It was caused by comma-delimited schedules being confused with comma-delimited cron specifications. We now differentiate between “,” (comma) and “, ” (comma then space), the former only being used within a cron specification, and the latter being used to separate multiple cron schedules.
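Two of the fixes above lend themselves to a short sketch: splitting multiple schedules on comma-then-space, and picking the next snapshot name per cluster member. This is a minimal illustration and not LXD's implementation. The function names, the (member, snapshot name) data layout, and the "highest index plus one" naming rule are all assumptions made for the example.

```python
import re
from typing import Iterable, List, Tuple


def split_schedules(value: str) -> List[str]:
    """Split a schedule setting into individual cron specifications.

    Only comma-then-space separates schedules; a bare comma stays
    inside the cron specification it belongs to.
    """
    return [part.strip() for part in value.split(", ") if part.strip()]


def next_snapshot_name(
    existing: Iterable[Tuple[str, str]], member: str, prefix: str = "snap"
) -> str:
    """Pick the next automatic snapshot name for a volume on `member`.

    `existing` holds (cluster member, snapshot name) pairs for volumes
    sharing the same name. Snapshots on other members are ignored.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    highest = -1
    for owner, name in existing:
        if owner != member:
            continue  # non-local volume, must not influence the counter
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"{prefix}{highest + 1}"
```

For example, split_schedules("0 0,12 * * *, 30 6 * * 1") yields two schedules, the first of which keeps its internal comma. With snapshots snap0 and snap1 on one member and only snap0 on another, the second member's next name is snap1 rather than snap2.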
A user experience improvement has been added to the routed NIC type to prevent accidentally specifying auto for the ipv{n}.gateway setting on more than one NIC concurrently. Previously this resulted in a liblxc error, but now we validate it before starting the container and give a clear error message.
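The kind of pre-start check described here might look roughly like the following. It is a sketch under stated assumptions, not the validation LXD performs: the function name and error text are hypothetical, devices are modelled as plain dictionaries, and only an explicitly set auto value is counted (how defaults are handled is left out).

```python
from typing import Dict


def check_routed_gateways(devices: Dict[str, Dict[str, str]]) -> None:
    """Raise ValueError if more than one routed NIC uses gateway "auto".

    `devices` maps device names to their config. Hypothetical helper.
    """
    for family in ("ipv4", "ipv6"):
        key = f"{family}.gateway"
        auto_nics = [
            name
            for name, dev in devices.items()
            if dev.get("type") == "nic"
            and dev.get("nictype") == "routed"
            and dev.get(key) == "auto"  # simplification: explicit values only
        ]
        if len(auto_nics) > 1:
            raise ValueError(
                f"Only one routed NIC can use {key}=auto, "
                f"found: {', '.join(sorted(auto_nics))}"
            )
```

Failing early with a message that names the offending devices is what turns an opaque start-up failure into something the user can act on.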
For virtual machines, we have switched to using the -spice flag rather than the associated config file section for enabling remote screen output, due to changes in the upstream QEMU project’s requirements. We have also moved to using the query-cpus-fast QMP command for the same reason.
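For readers unfamiliar with QMP, the exchange for query-cpus-fast is plain JSON. The sketch below only builds the request and reads a reply; it does not open a socket, the helper names are hypothetical, and the sample reply is made-up data showing just the cpu-index and thread-id fields. It says nothing about how LXD itself issues the command.

```python
import json
from typing import Any, Dict, Optional


def qmp_message(command: str, arguments: Optional[Dict[str, Any]] = None) -> bytes:
    """Encode a QMP command as a newline-terminated JSON message."""
    message: Dict[str, Any] = {"execute": command}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message).encode() + b"\n"


def vcpu_threads(reply: Dict[str, Any]) -> Dict[int, int]:
    """Map each vCPU index to its host thread ID from a query-cpus-fast reply."""
    return {cpu["cpu-index"]: cpu["thread-id"] for cpu in reply["return"]}


# Illustrative reply only; real replies carry additional fields.
sample_reply = {
    "return": [
        {"cpu-index": 0, "thread-id": 25627},
        {"cpu-index": 1, "thread-id": 25628},
    ]
}
```

On a real QMP socket, the qmp_capabilities command has to be sent after the greeting before any other command is accepted.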
The groundwork for forthcoming cluster certificate changes has been laid. We now always generate a fresh certificate for cluster API usage when bootstrapping a new cluster, rather than reusing the first member’s server certificate. This is important because in the future we will use each cluster member’s server certificate for authenticating intra-cluster communication, so it’s important that each member’s certificate is only used for its own purposes and not for the cluster-wide API endpoints.
LXC
Initial support for idmapped mounts has been added using the lxc.rootfs.options = idmap=/path/to/user/ns/to/idmap/to configuration option.
Distrobuilder
We now support OpenWrt on ARM architectures.
A fix for systemd capabilities inside containers has been added. It avoids issues with distributions whose systemd units enable features that do not work properly inside containers.
This creates a system wide unit that specifies:
ProtectProc=default
ProtectControlGroups=no
Dqlite (RAFT library)
An issue that was causing intermittent latency spikes in response times has been fixed by continuing to allow requests during snapshots.
Dqlite (database)
Several issues with Unix sockets on macOS have been fixed.
Dqlite (Go bindings)
We now build with TLS support on macOS.
YouTube channel
We’ve started a YouTube channel with live streams covering LXD releases and its use in the wider ecosystem.
You may want to give it a watch and/or subscribe for more content in the coming weeks.
Contribute to LXD
Ever wanted to contribute to LXD but not sure where to start?
We’ve recently gone through some effort to properly tag issues suitable for new contributors on GitHub: Easy issues for new contributors
Upcoming events
- Nothing to report this week
Ongoing projects
The list below covers feature or refactoring work which will span several weeks/months and can’t be tied directly to a single GitHub issue or pull request.
- Distrobuilder Windows support
- Virtual networks in LXD
- Various kernel work
- Stable release work for LXC, LXCFS and LXD
Upstream changes
The items listed below are highlights of the work which happened upstream over the past week and which will be included in the next release.
LXD
- lxd/lxd: Prevent multiple routed NIC devices from using “auto” gateway mode
- Cluster: Generate a new cluster certificate when bootstrapping a cluster
- lxc/remote: Only update URL in set-url
- lxd/instance/drivers: Don’t overwrite template triggers
- Images: Removes unnecessary imagesDownloadingLock mutex
- lxc: Fix help for string arguments
- Replace cluster.List with GetNodes
- Storage: Fix auto custom volume snapshots
- Rename /internal/image-refresh to /internal/testing/image-refresh
- Assorted bugfixes
- vm/qemu: configure spice using -spice parameter
- Cluster: Introduce server trusted certificate type
- tests: Removes use of -v flag for nc inside busybox
- Network: Don’t attempt to setup bridge ipv6 firewall when no ipv6.address
- lxd/swagger: Add NotFound response
- lxd/snapshots: Fix multiple schedules
- lxd/images: Ignore intervals on manual refreshes
LXC
- Initial support for idmapped mounts
- ci: an attempt to run the tests under ASan/UBsan
- Revert “ci: get around https://github.com/lxc/lxc/issues/3796”
- ci: make use of --enable-sanitizers instead of CFLAGS
- include fixes for Bionic
- getsubopt: use correct include
- mntopt fixes
- seccomp: init and destroy notifier.cookie
- dir: fix rootfs mounting
- configure: fix function detection
LXCFS
- Nothing to report this week
Distrobuilder
Dqlite (RAFT library)
Dqlite (database)
Dqlite (Go bindings)
Distribution work
This section is used to track the work done in downstream Linux distributions to ship the latest LXC, LXD and LXCFS as well as work to get various software to work properly inside containers.
Ubuntu
- Nothing to report this week