Weekly status for the week of the 10th of May to the 16th of May.
Introduction
The past week has seen a focus on cluster heartbeat stability improvements in LXD and Dqlite, the continuation of idmapped mount support and fuzz testing in LXC, and the addition of AlmaLinux support in Distrobuilder.
The LXD team is hiring
Canonical Ltd. is expanding its investment into LXD with a total of 5 additional roles.
The primary focus of this effort is around scalability and clustering as well as developing compelling solutions using LXD for our customers.
All LXD positions are 100% remote with some travel for internal events and conferences.
LXD
This past week continued the work on improving the reliability and predictability of the LXD cluster heartbeat system. A change made a few weeks ago removed several code paths that were unintentionally generating cluster-wide heartbeats to get fresh state each time an operation was run (such as lxc cluster ls). This was undesirable as it would cause unexpected load on the cluster. However, after the change we found that some of our automated tests were intermittently failing during cluster member handover testing. The deep-dive to fix these tests resulted in the following changes to LXD cluster heartbeats:
- To ensure that member status can be reliably detected after cluster.offline_threshold * 2 when the cluster.offline_threshold setting is changed, we now dynamically set the heartbeat interval to cluster.offline_threshold / 2.
- When a cluster raft member is removed we now initiate an immediate re-balance of roles rather than waiting for the next heartbeat. This prevents stale data being left in the local raft_nodes table.
- When a cluster member joins it sends a heartbeat to all existing members notifying them of its existence. Unfortunately, when a leader-initiated heartbeat round was already in progress, the updated member information was not used, and so the round ended up distributing stale member information to the rest of the cluster. This has been fixed by detecting when a member notification heartbeat arrives during a leader-initiated heartbeat round and re-running that round. This ensures fresh member data is distributed.
Also cluster related, you can now specify the cluster_certificate_path setting in the cluster pre-seed information, allowing the certificate to be read from a file rather than requiring it to be embedded inside the pre-seed info.
An update to our production setup notes has been added covering container name leakage. Please see https://linuxcontainers.org/lxd/docs/master/production-setup#prevent-container-name-leakage.
The ability to control how long LXD waits when performing a clean shutdown has been added via the core.shutdown_timeout setting.
A fix for an issue that prevented container deletion after a failed start due to the shiftfs mounts being left over has been added.
A regression in the bridged network VXLAN tunnel setup that was introduced with the new ip package has been fixed.
Also network related, a fix has been added to support non-UUID OVS system-id settings when using OVN networks.
The /dev/lxd device that is available inside containers now exposes the hostname of the cluster member that it is running on in the Location field. This will be particularly useful for guests that need to pass this information on to further clusters (like Kubernetes), so that they can avoid putting all of the control services on the same physical host and breaking high availability.
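As a rough sketch of how a guest might read this value, the Go program below queries the devlxd socket from inside a container. It assumes the socket lives at /dev/lxd/sock, that the information is returned by GET /1.0, and that the field is serialized as location in the JSON response; the type and function names are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
)

// devLXDInfo holds the fields of interest from the devlxd response.
// The JSON key names are assumptions, not confirmed by this post.
type devLXDInfo struct {
	APIVersion string `json:"api_version"`
	Location   string `json:"location"`
}

// parseInfo decodes a devlxd JSON response body.
func parseInfo(data []byte) (devLXDInfo, error) {
	var info devLXDInfo
	err := json.Unmarshal(data, &info)
	return info, err
}

// fetchInfo performs an HTTP GET over the unix socket at socketPath.
func fetchInfo(socketPath string) (devLXDInfo, error) {
	client := &http.Client{
		Transport: &http.Transport{
			// Ignore the network and address from the URL and always
			// dial the unix socket instead.
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}

	// The host part of the URL is a placeholder; only the path matters.
	resp, err := client.Get("http://devlxd/1.0")
	if err != nil {
		return devLXDInfo{}, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return devLXDInfo{}, err
	}
	return parseInfo(body)
}

func main() {
	info, err := fetchInfo("/dev/lxd/sock")
	if err != nil {
		fmt.Fprintln(os.Stderr, "devlxd not reachable:", err)
		return
	}
	fmt.Println("running on cluster member:", info.Location)
}
```

A workload could then hand this value to a scheduler as a placement hint, for example to spread control services across different physical hosts.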
On the storage front, the ceph driver has been updated to always return volume usage estimates, even if the instance isn’t mounted.
On the VM front, handling of the LXD_OVMF_PATH setting when it points to a symlink has been fixed so that AppArmor correctly uses the dereferenced path.
Also on the VM front, initial work has been added to use the QMP protocol to add NIC devices. This is part of a larger project to move away from using the Qemu config file, as the Qemu project has soft deprecated it. Using QMP to add the NIC devices was the first step towards this and also lays the groundwork for being able to hotplug NIC devices into a running VM. However, the work to use QMP pre-boot to add NICs is looking in doubt, as it has been found that Qemu apparently doesn’t respect the bootindex setting when NICs are added via QMP (even if done before the VM is actually started). Hopefully we can find a resolution for this, otherwise we will need to use two different methods to add NICs: one at boot time and one at hotplug time.
LXC
A couple of regressions have been fixed and improvements to idmapped mount support have been added.
LXCFS
Option parsing has been reworked to align with how LXC handles this.
Distrobuilder
Initial support for AlmaLinux has been added.
Dqlite (RAFT library)
Added a fix that skips pre-vote logic when disrupt_leader is set. Without this fix, a leadership transfer could fail when a HeartBeat arrives at a node that has just become Candidate as the result of a raft_transfer.
YouTube channel
We’ve started a YouTube channel with live streams covering LXD releases and its use in the wider ecosystem.
You may want to give it a watch and/or subscribe for more content in the coming weeks.
Contribute to LXD
Ever wanted to contribute to LXD but not sure where to start?
We’ve recently gone through some effort to properly tag issues suitable for new contributors on GitHub: Easy issues for new contributors
Upcoming events
- Nothing to report this week
Ongoing projects
The list below covers feature or refactoring work which will span several weeks/months and can’t be tied directly to a single GitHub issue or pull request.
- Distrobuilder Windows support
- Virtual networks in LXD
- Various kernel work
- Stable release work for LXC, LXCFS and LXD
Upstream changes
The items listed below are highlights of the work which happened upstream over the past week and which will be included in the next release.
LXD
- Cluster: Make heartbeat interval half of offline threshold
- Cluster certificate path
- Cluster heartbeat: Logging improvements
- LXC: Indicate that ctrl+c can be used to abort interactive changes that fail validation
- Swagger coverage for backups
- Network: Handle non-uuid OVS system-ids for OVN networks
- Cluster: Cancel ongoing heartbeat and restart round when notification heartbeat comes from non-leader member
- Network: Ensure VXLAN tunnel interface is brought up on bridge network setup
- lxd/instances: Unmount shiftfs on startup failures
- Cluster: Rebalance roles when removing raft member
- Shutdown timeout
- Complete swagger coverage of instances
- lxd/storage/ceph: Always return VolumeUsage
- doc/production-setup: Cover name leakage
- lxd/apparmor/instance: Deref OVMF path
- VM: Switch to QMP for adding NICs
- VM: Remove old pid file on start if exists
- Cluster: Fix heartbeatInterval()
- lxd/instance/qemu: queues is uint64
- Location devlxd api
- lxd/instance/qemu: Support for security.devlxd default (true) value
- doc/environment: Documents LXD_CONF and LXD_GLOBAL_CONF env vars
LXC
- oss-fuzz: add basic cgroup_init()/cgroup_exit() fuzzing
- tests: fix lxc-test-arch-parse for make dist
- confile: convert AppArmor and SELinux confile parsing from errors to …
- cgroups: clean up cgroup_ops on initialization error
- conf: fix containers without rootfs
- start: move idmapped mount setup later
LXCFS
Distrobuilder
Dqlite (RAFT library)
Dqlite (database)
Dqlite (Go bindings)
- Nothing to report this week
Distribution work
This section is used to track the work done in downstream Linux distributions to ship the latest LXC, LXD and LXCFS as well as work to get various software to work properly inside containers.
Ubuntu
- Nothing to report this week
Snap
- lxd: Cherry-pick upstream bugfixes