How to reinstall LXD host OS without losing the containers?

Note: The recommended way to do this in LXD 5.0 is to use lxd recover.
See Backing up a LXD server - LXD documentation

I want to rebuild an LXD host by reinstalling the OS, and keep the containers as they are.
The reason for reinstalling is that I want to repartition the root filesystem (the host is an SSD VPS), so that I can create a new ZFS filesystem on the local SSD storage that is currently part of the root filesystem.

The containers are in a separate, detachable ZFS filesystem which I will not touch. I will detach it from the system while I reinstall the OS, and then reattach it.
Other than the repartitioning, the OS will be the same: Ubuntu 18.04 + the LXD snap.
All my container configuration is done via profiles.

What is the best way to do this? Can I somehow copy and restore all of /snap? Or export and restore the lxd snap? Or just its data? By the way, can I put /snap on a ZFS filesystem, which I can back up and restore at any time?

This system has only a handful of small containers, but I am also interested in a solution that scales to many containers.

I’m thinking of trying something like this (a rough command sketch follows the list):

  • Detach the ZFS storage pool
  • Reinstall the OS
  • Create a temporary ZFS storage pool at the same location as before
  • lxd init with the same parameters as before, including the network address
  • Create a dummy container for each old container, with the same guest OS as before
  • Stop all containers
  • Detach the temporary ZFS storage pool and reattach the original one in its place
  • Recreate the LXD profiles and apply them to the containers
  • Restart the containers
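
A rough sketch of the pool swap, assuming a snap LXD install and a container pool named tank (the pool name is a placeholder, not my actual setup):

$ sudo systemctl stop snap.lxd.daemon
$ sudo zpool export tank

# ...reinstall the OS, install the lxd snap, run lxd init with a temporary
# pool of the same name, create the dummy containers, stop all containers...

$ sudo systemctl stop snap.lxd.daemon
$ sudo zpool destroy tank    # the temporary pool only; double-check before destroying
$ sudo zpool import tank     # the original pool, from the reattached device
$ sudo systemctl start snap.lxd.daemon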

This will not copy the containers, although it does create temporary dummy containers.
I have tried a variation of this where, instead of swapping the whole ZFS storage pool, I replaced a single container’s root filesystem using “zfs receive” with a stream I had previously created with “zfs send” from a container on another system.
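
For reference, that per-container variation looked roughly like this; the dataset names and stream file path are illustrative (snap LXD normally keeps container datasets under <pool>/containers/<name>), and both LXD and the container were stopped:

# on the source system: snapshot the container's root dataset and turn it into a stream
$ sudo zfs snapshot tank/containers/c1@migrate
$ sudo zfs send tank/containers/c1@migrate > /backup/c1.zfs

# on the target system: replace the dummy container's dataset with the received one
$ sudo zfs destroy -r tank/containers/c1
$ sudo zfs receive tank/containers/c1 < /backup/c1.zfs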

  • Stop LXD (systemctl stop snap.lxd.daemon)
  • Backup /var/snap/lxd
  • Reinstall your machine
  • Install the snap again
  • Stop it again (systemctl stop snap.lxd.daemon)
  • Wipe /var/snap/lxd and restore your backup
  • Start LXD again (systemctl start snap.lxd.daemon)
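
A hedged sketch of those steps, assuming GNU tar and a backup location that survives the reinstall (/mnt/external below is just a placeholder):

$ sudo systemctl stop snap.lxd.daemon
$ sudo tar --numeric-owner -cpzf /mnt/external/lxd-backup.tar.gz -C /var/snap lxd

# ...reinstall the machine and run: sudo snap install lxd

$ sudo systemctl stop snap.lxd.daemon
$ sudo rm -rf /var/snap/lxd
$ sudo tar --numeric-owner -xpzf /mnt/external/lxd-backup.tar.gz -C /var/snap
$ sudo systemctl start snap.lxd.daemon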

You can put /var/snap on a ZFS dataset, that’s fine.
/snap isn’t very useful to keep around as that’s mostly a bunch of mountpoints of the read-only snap contents, the data is what you really care about and that’s in /var/snap.
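
If you do want the data on ZFS, a one-line sketch (pool and dataset names are placeholders; create the dataset before installing any snaps, or move the existing contents onto it first):

$ sudo zfs create -o mountpoint=/var/snap tank/var-snap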

@stgraber Can you please share the same procedure, but for non-snap LXD?

Same thing but use “/var/lib/lxd” instead and the systemd unit is “lxd”.

I ended up using “snap save lxd”, as described here: https://snapcraft.io/docs/snapshots

Same steps as above, except that instead of backing up and restoring /var/snap/lxd, use this (rough commands are sketched after the list):

  • snap save lxd
  • copy the snapshot file from /var/lib/snapd/snapshots to the new machine (create the directory first, as it may not exist)
  • snap restore {id}
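
In rough terms (the set id 1 is just an example; use whatever id snap save prints):

# on the old machine, with LXD stopped
$ sudo systemctl stop snap.lxd.daemon
$ sudo snap save lxd                # note the snapshot set id it prints
$ ls /var/lib/snapd/snapshots/      # the saved snapshot is a .zip file in here

# copy that .zip into /var/lib/snapd/snapshots/ on the new machine, then:
$ sudo mkdir -p /var/lib/snapd/snapshots
$ sudo snap restore 1               # "1" being the set id from snap save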

In both cases, when I tried “lxc list” after restore, I got the error:
Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory

It got fixed by rebooting.

Yeah, that error is caused by socket activation; you can do:

  • systemctl restart snap.lxd.daemon.unix.socket

Which would re-create the socket file and get you back online without a reboot.

@stgraber hope you are doing well. My mistake was to use the lxd/lxc version that comes with Ubuntu 18.04:

ii lxd 3.0.3-0ubuntu1~18.04.1 amd64 Container hypervisor based on LXC - daemon
ii lxd-client 3.0.3-0ubuntu1~18.04.1 amd64 Container hypervisor based on LXC - client

Instead of using the snap one:

$ sudo snap find lxd
Name Version Publisher Notes Summary
lxd 3.16 canonical✓ - System container manager and API
lxd-demo-server 0+git.6d54658 stgraber - Online software demo sessions using LXD
nova ocata james-page - OpenStack Compute Service (nova)
satellite 0.1.2 alanzanattadev - Advanced scalable Open source intelligence platform
nova-hypervisor ocata james-page - OpenStack Compute Service - KVM Hypervisor (nova)

Now I have a good number of containers:

+------+---------+------+------+------------+-----------+
| NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+------+---------+------+------+------------+-----------+
| c1   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c10  | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c11  | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c12  | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c13  | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c2   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c3   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c4   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c5   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c6   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c7   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c8   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
| c9   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+

And this profile:

lxc profile list
+---------+---------+
|  NAME   | USED BY |
+---------+---------+
| default | 13      |
+---------+---------+

Additional details:

lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  root:
    path: /
    pool: cstorage
    type: disk
name: default
used_by:
- /1.0/containers/c1
- /1.0/containers/c2
- /1.0/containers/c3
- /1.0/containers/c4
- /1.0/containers/c5
- /1.0/containers/c6
- /1.0/containers/c7
- /1.0/containers/c8
- /1.0/containers/c9
- /1.0/containers/c10
- /1.0/containers/c11
- /1.0/containers/c12
- /1.0/containers/c13

lxc storage list
+----------+-------------+--------+-------------------------------------+---------+
|   NAME   | DESCRIPTION | DRIVER |               SOURCE                | USED BY |
+----------+-------------+--------+-------------------------------------+---------+
| cstorage |             | dir    | /var/lib/lxd/storage-pools/cstorage | 14      |
+----------+-------------+--------+-------------------------------------+---------+

Can I migrate this version to the snap version without destroying the environment?

Thank you in advance,

The lxd.migrate script does exactly this. As with any migration or other critical operation, it should only be done after taking a verified, good backup first.
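
For reference, the in-place deb-to-snap move looks roughly like this (lxd.migrate is interactive and asks for confirmation before touching anything):

$ sudo snap install lxd
$ sudo lxd.migrate    # moves the data from /var/lib/lxd into the snap's data directory
                      # and offers to remove the old deb packages when it is done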

Hello @gpatel-fr, I will proceed and test the migration with lxd.migrate from one LXD server (Ubuntu 18.04, LXD 3.0) to another (Ubuntu 18.10, LXD 3.16).

Let’s see how it goes.

I guess I should back up the containers.

Thank you very much,

Err, I misunderstood you; lxd.migrate is an in-place migration. From one LXD server to another, you can just copy the containers (lxc copy; see the documentation for setting up a LXD instance as a server, and lxc remote add).
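
A minimal sketch of that server-to-server copy, with a placeholder remote name and trust password, and with the containers stopped:

# on the target server: expose the API and set a trust password
$ lxc config set core.https_address "[::]:8443"
$ lxc config set core.trust_password some-password

# on the source server: add the target as a remote and copy each container
$ lxc remote add newhost <target-address>    # verify the certificate, enter the trust password
$ lxc copy c1 newhost:c1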