Weekly status for the week of the 1st of March to the 7th of March.
Introduction
The highlight of the past week was the release of LXD 4.12, which includes most of the changes from the past week. Please take a look at the release notes for more information.
LXD
Most of the past week has been spent fixing bugs and making improvements ready for the 4.12 release. However, several notable new features have also been added:
- Added support for using #external and #internal subjects in network ACL rules for source and destination fields to match on external traffic (that which goes via the network’s router port) and internal traffic (that which goes via another instance NIC connected to the network) respectively.
- Added support for using multiple GPUs with SR-IOV.
- Added support for a new restricted.cluster.target config key to projects which prevents the user from using the --target flag. This prevents them from specifying what cluster member to place a workload on or the ability to move a workload between members.
In terms of improvements and changes the following are noteworthy:
- LXD no longer allows the use of % characters in NIC interface names. Previously this allowed the kernel to randomly generate a NIC interface name. However, this prevented LXD from knowing what interface name was generated, and so it is no longer allowed.
- Related to that, we now validate that duplicate NIC interface names are not configured for an instance. Previously this would have caused a lower-level liblxc error preventing container start, but it now fails with a clearer error message.
- When using SR-IOV NICs, we select a free Virtual Function (VF) for passing into the instance. We have changed the criteria for selecting a free VF and now no longer consider a VF interface that is UP or has IPs configured on it as eligible for selection (as it could indicate that another application outside of LXD is using the VF interface).
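The NIC name checks and the stricter VF selection described above can be sketched roughly as follows. This is an illustrative Python sketch, not LXD's actual implementation: the VF record and the functions validate_nic_names and pick_free_vf are hypothetical names, the real code inspects kernel state rather than in-memory records, and any other eligibility checks LXD performs are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VF:
    """Hypothetical snapshot of a Virtual Function's host-side state."""
    name: str
    is_up: bool = False
    addresses: List[str] = field(default_factory=list)


def validate_nic_names(names: List[str]) -> None:
    """Reject '%' in interface names and duplicate names within one instance."""
    seen = set()
    for name in names:
        if "%" in name:
            raise ValueError(f"Interface name {name!r} must not contain '%'")
        if name in seen:
            raise ValueError(f"Duplicate interface name {name!r}")
        seen.add(name)


def pick_free_vf(vfs: List[VF]) -> Optional[VF]:
    """Return the first VF that is down and has no IPs, or None if there is none."""
    for vf in vfs:
        if vf.is_up or vf.addresses:
            # An UP link or a configured IP may mean another application
            # outside of LXD is using this VF, so it is skipped.
            continue
        return vf
    return None
```

For example, given VFs where vf0 is UP, vf1 has an address configured and vf2 is down with no addresses, pick_free_vf would skip the first two and return vf2.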
Work has also continued on migrating our API docs to automatic generation using Swagger.
We have also fixed several bugs:
- Removing and re-adding a cluster member could cause problems because the cluster member ID was reused in the LXD database, which the underlying Raft library was not expecting. This has been fixed by setting the id column to AUTOINCREMENT so that IDs are not reused if a cluster member is removed and a new one added.
- When using network ACLs with OVN inside a non-default project, a bug in the usage detection logic caused ACLs referenced by other ACLs to be incorrectly detected as not in use and removed from OVN.
- An issue with vsock (used to communicate with lxd-agent running in VM instances) that caused intermittent connection timeouts and excessive memory usage has been worked around by retrying the connection attempt, which avoids triggering the underlying issue in the vsock driver.
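To illustrate the ID reuse behind the cluster member fix above, here is a small sketch using Python's built-in sqlite3 module. The nodes table and id column names come from the change itself; the address column, the sample addresses, and the readd_member_id helper are assumptions made for the demonstration, and the real LXD schema has more columns.

```python
import sqlite3
from typing import Tuple


def readd_member_id(id_column: str) -> Tuple[int, int]:
    """Add two members, remove the second, add a third.

    Returns (removed_id, new_id) so the caller can see whether the ID was reused.
    """
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE nodes (id {id_column}, address TEXT NOT NULL)")
    db.execute("INSERT INTO nodes (address) VALUES ('10.0.0.1:8443')")
    removed = db.execute(
        "INSERT INTO nodes (address) VALUES ('10.0.0.2:8443')"
    ).lastrowid
    db.execute("DELETE FROM nodes WHERE id = ?", (removed,))
    new = db.execute(
        "INSERT INTO nodes (address) VALUES ('10.0.0.3:8443')"
    ).lastrowid
    db.close()
    return removed, new


# Without AUTOINCREMENT the removed member's ID is handed out again.
print(readd_member_id("INTEGER PRIMARY KEY"))                # (2, 2)
# With AUTOINCREMENT the new member gets an ID that was never used.
print(readd_member_id("INTEGER PRIMARY KEY AUTOINCREMENT"))  # (2, 3)
```

Without AUTOINCREMENT, SQLite picks one more than the largest ID currently in the table, so removing the most recently added member frees its ID for the next one. AUTOINCREMENT instead tracks the largest ID ever used.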
LXC
There have been several bug fixes and improvements in the past week:
- Reverts a change that failed if there were no writable cgroup hierarchies.
- Handle CLONE_PIDFD on arm64.
- Improve feature detection headers for go-lxc.
Dqlite (RAFT library)
An issue related to the cluster member ID reuse in LXD (discussed above) has also been fixed in Dqlite. It now handles the case of a cached raft_id that was subsequently reused, where previously a connection would be attempted to the wrong address. This happened when LXD removed a cluster member from the configuration and later added a new one with the same ID and a different address. A stale cluster member address is now detected and refreshed.
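The idea of detecting and refreshing a stale address can be sketched as follows. This is only a conceptual Python illustration, not Dqlite's code (Dqlite is written in C): the AddressCache class, its resolve method, and the dictionary standing in for the cluster configuration are all hypothetical.

```python
from typing import Dict


class AddressCache:
    """Hypothetical cache mapping a raft ID to the address last dialed for it."""

    def __init__(self) -> None:
        self._cached: Dict[int, str] = {}
        self.refreshes = 0  # number of stale entries replaced so far

    def resolve(self, raft_id: int, configuration: Dict[int, str]) -> str:
        """Return the address to dial, refreshing the cache if it is stale."""
        current = configuration[raft_id]
        cached = self._cached.get(raft_id)
        if cached is not None and cached != current:
            # The ID was reused by a member with a different address.
            self.refreshes += 1
        self._cached[raft_id] = current
        return current
```

The point of the sketch is that the cached entry is compared against the current configuration before use, so a reused ID no longer leads to a connection attempt against the removed member's address.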
YouTube channel
We’ve started a YouTube channel with live streams covering LXD releases and its use in the wider ecosystem.
You may want to give it a watch and/or subscribe for more content in the coming weeks.
Contribute to LXD
Ever wanted to contribute to LXD but not sure where to start?
We’ve recently gone through some effort to properly tag issues suitable for new contributors on GitHub: Easy issues for new contributors
Upcoming events
- Nothing to report this week
Ongoing projects
The list below covers feature or refactoring work which will span several weeks/months and can’t be tied directly to a single GitHub issue or pull request.
- Distrobuilder Windows support
- Virtual networks in LXD
- Various kernel work
- Stable release work for LXC, LXCFS and LXD
Upstream changes
The items listed below are highlights of the work which happened upstream over the past week and which will be included in the next release.
LXD
- Improve remote image updates in clusters
- Network: Adds OVN #internal/#external port subjects
- doc/README: Drop readthedocs
- Network: Further restrict SR-IOV free VF rules to require VF to be down and have no IPs configured
- lxc/remote: Tweak output
- Extend swagger coverage (events, profiles, projects, networks)
- Support multiple GPUs for SR-IOV
- Network: Validate NIC interface name and prevent duplicates
- More swagger endpoints
- Projects restricted cluster target
- Network: Don’t fail when missing vfListPath in sriovGetFreeVFInterface
- lxd/vsock: Better handle errors
- docs: typo on JSON schema
- lxd/vsock: Retry timeouts once
- lxd/db: Set nodes.id to auto-increment for new clusters
- lxd/images: Properly spread replicated images
- Patches: Adds db_nodes_autoinc patch
- FIx issues with ceph.rbd.features
- Storage: Ceph utils util.SplitNTrimSpace usage
- Network: Use OVN TCP flag constants for OVN ACL baseline rules
- LXD: Switch to GetStableRandomGenerator helper function where FNV-1a stable random numbers are generated
- shared/api/netork/acl: Adds missing example doc fields
- Network: Fix UsedBy with project profiles
- shared/api: Mark most ACL rule fields omitempty
- test/suites: Fix sed command
- Fix typo in doc/projects.md, replace images with backups
- Fix a typo in rest-api.md for renaming a network ACL
LXC
- cgroup: do not fail if there are no writable heirarchies
- attach_options: header improvements
- start: handle CLONE_PIDFD on arm64
LXCFS
- Nothing to report this week
Distrobuilder
- Nothing to report this week
Dqlite (RAFT library)
Dqlite (database)
- Nothing to report this week
Dqlite (Go bindings)
- Nothing to report this week
Distribution work
This section is used to track the work done in downstream Linux distributions to ship the latest LXC, LXD and LXCFS as well as work to get various software to work properly inside containers.
Ubuntu
- Nothing to report this week
Snap
- libtpms: Bump to 0.7.7
- libnvidia-container: Bump to 1.3.3
- openvswitch: Bump to 2.15.0
- zfs: Bump to 2.0.3
- ovn: Bump to 20.12.0
- edk2: Bump to 202011
- lxd: Bump to 4.12
- lxd: Cherry-pick upstream bugfixes