Disable snap auto refresh immediately!

Sorry if this thread sounds condescending, but I urge everyone running LXD in a production environment to disable snap auto-refresh immediately! Due to some bugs in LXCFS, the recent snap update from LXD 3.14 to 3.15 has taken many hundreds of production containers offline. Personally, I have 700+ running containers affected by the LXCFS bug, and I see a number of other users having similar issues. To make matters worse, there is no way to completely disable the auto-refresh process. This means your system is at risk from any future update as long as you allow snap updates to occur.

That said, there are a few workarounds to keep the snap tool from auto-updating. Simply run the following commands on ALL your LXD servers:

snap set system proxy.http="http://127.0.0.1:1111"
snap set system proxy.https="http://127.0.0.1:1111"
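
To confirm that the proxy workaround is in place, or to undo it before a deliberate, tested upgrade, something like the following sketch may help. It assumes a snapd release that provides `snap unset`; the `run` helper, the `APPLY` variable, and the function names are illustrative and not part of snap. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: inspect or revert the bogus-proxy workaround.
# Commands are only printed unless APPLY=1 is set in the environment.
set -eu

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Show the proxy settings snapd currently uses.
show_proxy() {
    run snap get system proxy.http
    run snap get system proxy.https
}

# Remove the workaround so snapd can reach the store again
# (assumption: "snap unset" is available in the installed snapd).
clear_proxy() {
    run snap unset system proxy.http proxy.https
}

show_proxy
```

Calling `clear_proxy` with `APPLY=1` set would re-enable store access, so it is probably best done only once the new revision has been vetted on a test server.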

–or–

You can run these IPTables commands:

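 # Caution: -F flushes every existing rule in the filter table; skip it if the host already has firewall rules.
 iptables -F
 # The hostname is resolved once, when the rule is added; re-add the rule if the store's IP addresses change.
 iptables -A OUTPUT -d api.snapcraft.io -j DROP
 # iptables-save only prints the rules to stdout; redirect the output to a file if you want them to persist.
 iptables-save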

Please note: I have been a big supporter of LXD for more than two years and appreciate and support all the LXD work done by Stephane, Christian, and team. They have created a remarkable tool that gives us an excellent container environment for free. However, we should always vet any software updates on test servers before going into production. Relying on someone else’s test tools to validate production software is a huge risk, especially when our livelihoods depend on uptime and stability.

The inability to completely disable snap auto updates is the root cause of all these issues (https://forum.snapcraft.io/t/disabling-automatic-refresh-for-snap-from-store/707/250). It seems the snap developers don’t really care about production environments…

I completely agree, the ability to test upgrades is absolutely critical to production use of LXD. There’s no way for the dev team to test every configuration prior to release, so it’s up to end users to test, which is totally reasonable, but there’s no straightforward mechanism to do so with snap.

I totally understand; I had some outages because of snap's auto-update feature.
And thank you for posting the workaround to stop the auto-updating.

You can try to automate snap in offline mode. Make sure your host is not connected to the internet. In my opinion, this is definitely recommended for hypervisors and therefore also for LXD hosts.

Download snap somewhere else with:
snap download lxd

Copy the files to your LXD node and install the snap:
snap ack <package.assert>
snap install <package.snap>
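
A minimal sketch of those steps, assuming `snap download` wrote the `.snap` file and its `.assert` file side by side with the same base name. The revision number `1234`, the `run` helper, the `APPLY` variable, and the function names are made up for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the ack + install half of the offline flow.
# Commands are only printed unless APPLY=1 is set in the environment.
set -eu

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Derive the assertion file name that sits next to a downloaded .snap file.
assert_for() {
    printf '%s\n' "${1%.snap}.assert"
}

# Acknowledge the assertion first, then install the snap it describes.
install_offline() {
    run snap ack "$(assert_for "$1")"
    run snap install "$1"
}

# Dry run with a hypothetical file name.
install_offline ./lxd_1234.snap
```

Acknowledging the assertion before installing matters: without it, snapd cannot verify the file and would refuse a plain `snap install`.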

I think this is a very safe and useful way to patch your hosts with snap. I haven’t tested or automated it yet, but it seems like a good method for LXD hosting with customer containers.

Edit: You can also use your own (internet-connected) repo container for Linux updates, or Landscape/Ansible for automating the whole process.

While the approach above works to cut off all access to the snap store, it will also keep you from getting any emergency security update we may roll out. Those happen reasonably often and don’t always get advertised unless they are directly related to LXC, LXD or LXCFS.

For example, a security fix in OpenSSL will require a new core package, so you will need to keep track of potential security issues and manually apply those updates as needed.

The recommended way to handle snaps in production is:

  • Set your refresh window
  • To easily control the snap revisions that you roll out in your production environment, consider setting up a snap store proxy. This allows you to pin specific snaps to specific revisions. Once you’re satisfied that a new revision works for you, you can bump it in the proxy and all your systems will refresh to it (at the scheduled time).

Details on the snap proxy can be found here: https://docs.ubuntu.com/snap-store-proxy/en/
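
As a rough illustration of the first point, the refresh window can be set through snapd's `refresh.timer` system option and checked with `snap refresh --time`. The Saturday 02:00 to 04:00 window below is only an example value; the `run` helper, the `APPLY` variable, and `set_refresh_window` are illustrative names. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: restrict automatic refreshes to a maintenance window.
# Commands are only printed unless APPLY=1 is set in the environment.
set -eu

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Set the window, then ask snapd when the next refresh is scheduled.
set_refresh_window() {
    run snap set system "refresh.timer=$1"
    run snap refresh --time
}

# Example value only: Saturdays between 02:00 and 04:00.
set_refresh_window "sat,02:00-04:00"
```

This only controls when refreshes happen, not which revision gets installed; pinning revisions is what the store proxy is for.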

The download+ack+install approach from @TomvB will also work fine, though you may need to prevent store connectivity on top of that as it’s likely that the assertion will effectively tell snapd about what channel this came from, causing it to attempt to handle regular refreshes (unlike a fully sideloaded snap which is missing that assertion data).

On our side, one thing I’m looking at setting up and automating is a set of extra tracks.
If we had the setup I have in mind right now, it would look like:

  • 2.0
  • 3.0
  • 3.14 (new)
  • 3.15 (new)
  • latest

The 3.14 track would contain whatever was in latest at the time 3.15 released.
The 3.15 track would mirror latest until 3.16 releases.

When 3.16 releases, we’d get rid of the 3.14 track. So you’d just have 3.15 and 3.16 at that point.

Because track changes aren’t automatic, those deciding to use 3.14 will have to manually refresh to 3.15, … If you don’t do anything, you end up staying behind on an unsupported release.

We’d only keep the current and previous release open so that we don’t end up with users attempting to install very old releases by following outdated howtos and the like.

The 3.15/stable and 3.15/candidate channels are now populated and will keep in sync with their latest equivalent until 3.16 gets released. Those switching to that will get the normal bugfixes during the 3.15 lifetime but will then get stuck on 3.15 unless they manually request a refresh onto 3.16.
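
For anyone who wants to try this, moving a host onto a track is a matter of refreshing with an explicit channel. The sketch below assumes the snap is named `lxd` and that the stable risk level is wanted; the `run` helper, the `APPLY` variable, and `switch_track` are illustrative names. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch: move the lxd snap onto a specific track.
# Commands are only printed unless APPLY=1 is set in the environment.
set -eu

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Refresh lxd to the stable channel of the given track, e.g. 3.15.
switch_track() {
    run snap refresh lxd "--channel=$1/stable"
}

switch_track 3.15
```

Afterwards, `snap info lxd` should show which channel the host is tracking. The later move to 3.16 would be the same call with a different track name.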

Thanks Stephane. As usual, you guys are always on top of these sorts of issues. Too bad the underlying tool (snap) is the root of all the problems.

The latest reply to a similar topic in the forum displayed an alternative solution in the form of --dangerous;

Agreed. Unfortunately, the 4.1 upgrade screwed me up bad. I was hoping that a reboot would make a difference, but no such luck. LXD in production is like Russian roulette every time you reboot or upgrade.