Weekly status for the week of the 13th of December to the 19th of December.
Introduction
This past week was predominantly a bug fix and improvement week, as we ironed out some issues in the LXD 4.21 release from the week before and continued with our roadmap work. This included landing one roadmap item: agentless VM metrics support.
LXD
New features:
- Adds the ability to disable LXD’s collection of metrics and state info from the VM lxd-agent process using the security.agent.metrics=false instance setting. This is useful for environments where the guest cannot be trusted and where having potentially inflated values from QEMU is preferable to the more detailed values the agent would otherwise provide.
Improvements:
- Don’t depend on the existence of the dnsmasq.pid file to decide whether dnsmasq is configured to run and therefore whether LXD should write the static DHCP allocation file. This avoids scenarios where a static IP address allocation config file is not written because dnsmasq hasn’t started for some reason.
- Schema change to make all description fields non-nullable. This avoids some historical inconsistencies in our schema where some tables have nullable description fields.
- Removes a panic statement when upgrading cluster members without roles fails due to an unexpected missing ID.
- Removes unused table views.
- Switch to gobgp v3 and drop old protobuf package.
- Don’t allow use of lxd cluster edit when the server is not clustered.
- Enable keepalive and TCP user timeouts on migration connections, so that if there is a network problem the migration connections won’t hang indefinitely.
- Allow users to reference specific arguments in lxc alias.
- Change notification heartbeats to be full-state, and tighten up what each server member will accept in terms of heartbeat types (depending on whether they are leader or not). This lays the groundwork for cluster member role change notifications (needed for the event hub project).
- Cancel the cluster heartbeat quicker if a notification heartbeat arrives during an ongoing round. Previously the ongoing heartbeat wouldn’t cancel until the spread wait had completed, meaning that additional server members would be informed of potentially stale data before the heartbeat ended prematurely.
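To illustrate the keepalive and TCP user timeout item above, here is a minimal Python sketch that enables both on a TCP socket on Linux. It is not LXD's code (LXD is written in Go); the function names and every timing value are assumptions chosen for the example.

```python
import socket


def enable_liveness_checks(sock, idle_s=30, interval_s=10, probes=3,
                           user_timeout_ms=120_000):
    """Make a TCP socket notice a dead peer instead of hanging indefinitely.

    All defaults are illustrative assumptions, not the values LXD uses.
    """
    # Turn on keepalive probes and tune when and how often they are sent.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    # Drop the connection if sent data stays unacknowledged this long (ms).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT,
                    user_timeout_ms)


def current_settings(sock):
    """Read the options back so the configuration can be inspected."""
    return {
        "keepalive": sock.getsockopt(socket.SOL_SOCKET,
                                     socket.SO_KEEPALIVE) != 0,
        "idle_s": sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE),
        "interval_s": sock.getsockopt(socket.IPPROTO_TCP,
                                      socket.TCP_KEEPINTVL),
        "probes": sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT),
        "user_timeout_ms": sock.getsockopt(socket.IPPROTO_TCP,
                                           socket.TCP_USER_TIMEOUT),
    }


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_liveness_checks(s)
    print(current_settings(s))
    s.close()
```

The two mechanisms are complementary: keepalive probes detect a dead peer on an otherwise idle connection, while the user timeout bounds how long transmitted data may remain unacknowledged.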
Bug fixes:
- Several fixes and improvements to cluster member group handling.
- Fixed a regression with the routed NIC device type. Don’t attempt to add an automatic default gateway if there are no IPs specified for the particular IP family.
- Fixed a regression with the disk device type that prevented passing in a unix socket file. Since adding support for restricted disk paths, LXD had started opening the source path to get the file descriptor to pass to the instance. This was incorrectly using the open or openat2 syscall without passing the O_PATH flag, which is what allows a unix socket file to be opened just to get a file descriptor.
- On LXD startup, don’t attempt to clean partially unpacked image downloads if the storage pool supports shared storage (ceph), as this was causing unpack problems when using the same storage volume on multiple machines concurrently.
- Fixed a bug with lxc ls when using the --all-projects flag when multiple instances with the same name existed in different projects. Some of them would not be shown because the list data structure was incorrectly keyed on instance name.
- Return an error if no raft role is found in LXD cluster recover.
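The unix socket regression in the disk device item above comes down to a Linux behaviour that can be reproduced in a few lines. The following Python sketch is an illustration only, not LXD's Go code; the helper names and the temporary socket path are assumptions made for the example, and it requires Linux.

```python
import os
import socket
import stat
import tempfile


def open_for_passing(path):
    """Return a descriptor that refers to path without opening its contents."""
    return os.open(path, os.O_PATH | os.O_CLOEXEC)


def compare_open_modes():
    """Open a unix socket file with and without O_PATH.

    Returns (errno of the plain open or None, whether the O_PATH
    descriptor refers to a socket).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)  # creates the socket file on disk
        try:
            # A regular open of a socket file is refused by the kernel.
            try:
                fd = os.open(path, os.O_RDONLY)
            except OSError as err:
                plain_errno = err.errno
            else:
                os.close(fd)
                plain_errno = None

            # With O_PATH only a reference to the file is obtained.
            fd = open_for_passing(path)
            try:
                is_socket = stat.S_ISSOCK(os.fstat(fd).st_mode)
            finally:
                os.close(fd)
        finally:
            server.close()
    return plain_errno, is_socket


if __name__ == "__main__":
    print(compare_open_modes())
```

A descriptor obtained with O_PATH cannot be read from or written to, but it can still be inspected with fstat and handed to another process, which is all that is needed when the goal is just to pass the file descriptor along.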
LXCFS
- tree-wide: use PRIu64 to print uint64_t.
Distrobuilder
- Updated alt image to support armhf.
Dqlite (RAFT library)
- Fixed an issue to ensure there are entries to append in replication, which is a potential fix for an issue preventing a member from joining as a stand-by.
LXD Charm
- Remove lxd_trust_add’s autoremove argument as it is unused.
YouTube channel
We’ve started a YouTube channel with live streams covering LXD releases and its use in the wider ecosystem.
You may want to give it a watch and/or subscribe for more content in the coming weeks.
Contribute to LXD
Ever wanted to contribute to LXD but not sure where to start?
We’ve recently gone through some effort to properly tag issues suitable for new contributors on GitHub: Easy issues for new contributors
Upcoming events
- Nothing to report this week
Ongoing projects
The list below covers feature or refactoring work that will span several weeks/months and can’t be tied directly to a single GitHub issue or pull request.
- Distrobuilder Windows support
- Virtual networks in LXD
- Various kernel work
- Stable release work for LXC, LXCFS and LXD
Upstream changes
The items listed below are highlights of the work which happened upstream over the past week and which will be included in the next release.
LXD
- Clean up index page
- Generator: Use api.NewURL for URL generation
- API get Instance with more in-depth information
- Agent-less VM metrics
- doc: use customized Furo theme
- Cluster: Remove panic in UpgradeMembersWithoutRole
- Makes description columns non-nullable.
- gitignore: Ignore potential binaries
- NIC: Don’t depend on existance of dnsmasq.pid file to write static DHCP allocation file
- lxc/utils: Make byName sort all columns
- Fix cluster group handling on instance creation
- lxd/db/instance/profiles: Add missing error to stmt.Exec
- Cluster: Only take clusterMembershipMutex on leader
- lxd/db/cluster: Removes unused database views.
- Add TLS over Unix Socket support
- Cluster: Heartbeat system rework to allow for full-state member change notifications
- NIC: Don’t add auto gateway when IP family not in use
- Lxc list wrong project names for same name instances
- Disk: Fix support for bind mounting unix sockets as source by opening with O_PATH
- Cluster: Heartbeat and event tweaks
- lxd: Uses api.NewURL and sets project when querying other nodes.
- Update to gobgp v3 and drop old protobuf
- lxd/images: Don’t cleanup unknown images from shared volume
- doc: fix link in README
- Cluster: Logging consistency improvements and removes unnecessary call to EventsUpdateListeners
- Checks that the host node is clustered before editing.
- Migration: Enable TCP_USER_TIMEOUT and TCP keep alives on migration connections
- lxc/alias: Allows users to reference specific arguments.
- Cluster: If heartbeat context is cancelled during spread sleep then exit quicker
- Cluster: Event listener socket cleanup and logging improvements
- lxd/cluster/recover: Return separate error if no raft role found
LXC
- Nothing to report this week
LXCFS
Distrobuilder
Dqlite (RAFT library)
Dqlite (database)
- Nothing to report this week
Dqlite (Go bindings)
- Nothing to report this week
LXD Charm
Distribution work
This section is used to track the work done in downstream Linux distributions to ship the latest LXC, LXD and LXCFS as well as work to get various software to work properly inside containers.
Ubuntu
- Nothing to report this week
Snap
- qemu: Bumped to 6.2.0
- lxd: Cherry-pick upstream bugfixes