LXD snap transitioning to core20 and losing i386 support

Hello,

This is a quick announcement that we have begun the transition of the LXD snap from core18 (Ubuntu 18.04) over to core20 (Ubuntu 20.04).

Currently this only affects the latest/edge channel. The next stage will be to set up latest/beta, which will match latest/candidate but use a core20 base. Then, after a few more weeks of testing, we’ll flip latest/candidate and latest/stable over.

A few weeks after that, the 4.0/candidate and 4.0/stable channels will follow suit.

Most users will not see any difference other than one LXD refresh taking a bit longer than usual while the new core is downloaded. Some will see things working a bit better thanks to newer versions of the bundled utilities, and the snap itself will shrink a bit as it no longer has to carry as many of them.

The one main downside of this transition is the loss of i386 support as Ubuntu 20.04 doesn’t have an i386 version anymore. Users of LXD on i386 represent 0.0063% of our userbase at the moment so we don’t expect this to be much of a problem.

Note that this does not affect the ability to run i386 containers on an amd64 host; this will keep working as usual.

Please use this post to report any issue you’re seeing as this gets rolled out!

4.0/edge is being rebuilt with core20 now too.

latest/candidate is now on core20. Looking at releasing this to stable tomorrow.

This change is now in latest/stable after we made a couple of minor adjustments.

There have been a few issues caused by the switch to core20 which I thought would be worth documenting in case anybody is affected:

Core20 prevented LXD from detecting and using the xtables firewall driver

LXD supports two firewall drivers: xtables (iptables, ip6tables, ebtables and arptables) and nftables.
LXD will detect the presence of the firewall commands available, as well as the rule sets already in use, and will pick what it thinks is the most appropriate firewall driver to use.

LXD relies on the full-featured “legacy” xtables tool set for its xtables driver; the “shim” xtables tools provided by nftables do not support the full feature set of the legacy tools. This means that if LXD detects that the xtables tools are in fact nftables shims, it will consider the xtables driver unusable and will switch to the nftables driver.

The LXD snap bundles both the xtables tools and the nftables tool, which means LXD will pick which driver to use based on the kernel version and the rule sets currently active on the system, rather than on the tooling available.

Unfortunately the core20 snap has a mixture of legacy and nftables shim xtables commands. The iptables and ip6tables commands are “legacy”, but the ebtables and arptables commands are nftables shims.
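To check which flavour each command is on a given system, the version string can be inspected. This is a minimal sketch, not LXD's actual detection code: it assumes each tool's --version output carries a "(legacy)" or "(nf_tables)" tag, as iptables 1.8 and later print, and xtables_variant is a hypothetical helper name.

```shell
# Hypothetical helper: classify an xtables tool from its --version output.
# Assumption: the output carries the "(legacy)" or "(nf_tables)" tag that
# iptables 1.8 and later print; older releases will report "unknown".
xtables_variant() {
    case "$1" in
        *"(nf_tables)"*) echo "nft-shim" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}

# Usage (run in the environment whose tools you want to inspect):
#   for tool in iptables ip6tables ebtables arptables; do
#       printf '%s: %s\n' "$tool" "$(xtables_variant "$("$tool" --version 2>&1)")"
#   done
```

A mixed result, such as legacy for iptables and ip6tables but nft-shim for ebtables and arptables, is the situation described above.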

This caused LXD to consider the xtables toolset incomplete and to prefer the nftables driver even if there were active xtables rule sets in use. This could then cause issues if the xtables rule sets had DROP rules for unmatched traffic, as LXD would not add its own allow rules into those rule sets, leading to connectivity issues for LXD instances.

A fix has been added to the snap to override the core20 xtables tooling to ensure legacy tooling is available.

We also needed to switch to an internal mutex for concurrent locking of the ebtables command that is part of the xtables driver, as it relied on being able to create a lock at /var/lib/ebtables/lock, which was neither modifiable nor accessible from inside the snap.

We also needed to reintroduce /etc/ethertypes files to satisfy ebtables:

Core20 contains a newer version of dnsmasq that broke some users with raw.dnsmasq custom network settings

The core20 snap contains a newer version of dnsmasq compared to the old core18 snap. This newer version caused some network setups to break due to the use of custom settings via the raw.dnsmasq network setting.

Specifically the use of auth-zone in raw.dnsmasq caused the following dnsmasq start up error:

dnsmasq: --auth-server required when an auth zone is defined.

The fix is to remove or correct the use of this setting using:

lxc network set <network> raw.dnsmasq=<corrected value>
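If the auth-zone option is simply not needed, one way to build the corrected value is to drop that line from the current setting. This is a minimal sketch, assuming raw.dnsmasq holds one dnsmasq option per line; strip_auth_zone is a hypothetical helper and lxdbr0 is only an example network name. The alternative fix, adding a matching auth-server option, depends on your DNS setup and is not shown.

```shell
# Hypothetical helper: drop auth-zone lines from a raw.dnsmasq value on stdin.
# grep exits with 1 when no lines remain; that is not an error here.
strip_auth_zone() {
    grep -v -E '^[[:space:]]*auth-zone([=[:space:]]|$)' || [ $? -eq 1 ]
}

# Usage (lxdbr0 is an example network name):
#   current=$(lxc network get lxdbr0 raw.dnsmasq)
#   lxc network set lxdbr0 raw.dnsmasq "$(printf '%s\n' "$current" | strip_auth_zone)"
```

If nothing is left after filtering, lxc network unset lxdbr0 raw.dnsmasq clears the setting instead.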

However, LXD did not make the specific error easy to identify, as it did not log the stderr output of the failed dnsmasq start.

This has now been improved in:

And the same scenario will now generate an error in the logs like:

The dnsmasq process exited prematurely   project=default driver=bridge network=lxdbr0 stderr="dnsmasq: --auth-server required when an auth zone is defined." err="Process exited with non-zero value 1"
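To pull just the dnsmasq message out of such log lines, a small filter can help. This is a sketch that assumes the log line format shown above and that the daemon's output reaches the journal under the snap.lxd.daemon unit; dnsmasq_start_errors is a hypothetical helper name.

```shell
# Hypothetical helper: print the stderr="..." text from LXD log lines (on
# stdin) that report a premature dnsmasq exit. Assumes the format shown above.
dnsmasq_start_errors() {
    sed -n 's/.*The dnsmasq process exited prematurely.*stderr="\([^"]*\)".*/\1/p'
}

# Usage:
#   journalctl -u snap.lxd.daemon | dnsmasq_start_errors
```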

Core20 contains a newer version of lvm tools that do not scan logical volumes for nested physical volumes

The core20 snap contains a newer version of the lvm2 tools compared to the core18 snap. There was a change in behaviour between those versions that impacted users who relied on LVM scanning their logical volumes for nested physical volumes and volume groups.

In newer versions of the lvm2 tools this feature is disabled by default to improve performance, whereas in the older version it was always on.

If you rely on such a setup then you can instruct the snap package to use the lvm tools on the host and not use the bundled tools by doing:

snap set lxd lvm.external=true
systemctl reload snap.lxd.daemon

If your host OS is running Ubuntu 20.04 or later and using nested volume groups then you may find that you were inadvertently depending on the old core18 snap’s behaviour of scanning all logical volumes.

If this is the case, you will find that even running snap set lxd lvm.external=true doesn’t fix the issue.

In this case, edit /etc/lvm/lvm.conf on the host, change scan_lvs = 0 to scan_lvs = 1, and then restart your system.

You should then see your missing volume group in the sudo vgs command output and LXD should be operational again.
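The lvm.conf change can also be previewed with sed before touching the file. This is a minimal sketch: enable_scan_lvs is a hypothetical helper, and it assumes the setting appears on its own line as scan_lvs = 0, possibly commented out with a leading #. It writes the modified config to stdout and leaves the original file alone.

```shell
# Hypothetical helper: print the given lvm.conf with scan_lvs switched to 1.
# Assumption: the setting sits on its own line, optionally commented out.
# The input file is not modified.
enable_scan_lvs() {
    sed 's/^\([[:space:]]*\)#\{0,1\}[[:space:]]*scan_lvs[[:space:]]*=[[:space:]]*0[[:space:]]*$/\1scan_lvs = 1/' "$1"
}

# Usage: review the change first, then install it as root and reboot.
#   enable_scan_lvs /etc/lvm/lvm.conf | diff /etc/lvm/lvm.conf -
#   enable_scan_lvs /etc/lvm/lvm.conf > /tmp/lvm.conf.new
#   sudo cp /tmp/lvm.conf.new /etc/lvm/lvm.conf
```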

Core20 contains a newer version of the unsquashfs tool that is stricter when unpacking devices in an unprivileged environment

The newer version of unsquashfs is more strict when unpacking devices and would return a non-zero exit code which LXD would consider a failure.

This has been fixed by inspecting the stderr output and checking that the error is only about unpacking devices, which cannot succeed in an unprivileged environment.
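The idea can be sketched in shell as follows. This is not LXD's actual code: only_device_errors is a hypothetical helper, and the marker text is an assumption about how unsquashfs words its device-creation errors, so adjust it to match the output you actually see.

```shell
# Assumption: unsquashfs device-creation errors contain this text.
UNSQUASHFS_DEVICE_MARKER="because you're not superuser"

# Hypothetical helper: succeed only if the stderr text on stdin has at least
# one device-creation error and no other non-blank lines.
only_device_errors() {
    awk -v marker="$UNSQUASHFS_DEVICE_MARKER" '
        /^[ \t]*$/        { next }          # skip blank lines
        index($0, marker) { hits++; next }  # device-creation error
                          { other++ }       # any other error line
        END { exit (hits > 0 && other == 0) ? 0 : 1 }
    '
}

# Usage: tolerate a non-zero exit only when every error is about devices.
#   err=$(unsquashfs -f -d rootfs image.squashfs 2>&1 >/dev/null) || {
#       printf '%s\n' "$err" | only_device_errors || exit 1
#   }
```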
