Disaster recovery on home office server (ZFS backend)

Hello,

Been going around in circles for a few days now so hopefully someone can help point me in the right direction.

I had to do a clean install of my home office server and had never backed up my LXD folder directly, so I lost the database. LXD then knew nothing of my containers, which were all fine in a ZFS storage pool.

I tried to get LXD working with the existing dataset but it would not have that (which I now know is expected).

I then created a new dataset in the same pool and managed to get LXD up and running. Obviously without the old containers.

I then read I needed to mount the containers and use lxd import. I could not get this to work; it failed with an error that the directory was empty. I then understood from that failure that the dataset needed to be mounted inside LXD's mount namespace.

Which I was then able to do and thought I had a winner… but then got an obscure message which I cannot find any threads for:

Error: The instance [container name] seems to exist on multiple storage pools

So I’m stumped and out of my depth. Any help greatly appreciated.

Thanks in advance.

The storage pool name must match that of the source.
You mention you had LXD create a new storage pool in a different dataset. If you have this named the same as your old pool, that would probably be the issue.

Short version is that you need:

  • /var/snap/lxd/common/lxd/storage-pools/POOL/containers/NAME mounted
  • POOL must match the pool name listed in backup.yaml at /var/snap/lxd/common/lxd/storage-pools/POOL/containers/NAME
  • You can’t already have a container named NAME in the database
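The first two requirements can be checked before calling lxd import. Below is a minimal sketch, assuming backup.yaml has a top-level pool: section with a two-space-indented name: key (as in the excerpts posted in this thread). The function names are made up for illustration, and LXD_DIR is used here only as an override for the base path. The check has to run where the mount is visible, so from inside the same mount namespace.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of LXD.

# Print the pool name recorded in a backup.yaml file.
backup_pool_name() {
    awk '
        /^pool:/        { in_pool = 1; next }   # enter the top-level "pool:" section
        /^[^[:space:]]/ { in_pool = 0 }         # any other top-level key ends it
        in_pool && /^  name:/ { print $2; exit }
    ' "$1"
}

# check_import_ready POOL NAME
# Succeeds only if backup.yaml is visible at the expected path and names POOL.
check_import_ready() {
    pool=$1
    name=$2
    dir="${LXD_DIR:-/var/snap/lxd/common/lxd}/storage-pools/$pool/containers/$name"
    if [ ! -f "$dir/backup.yaml" ]; then
        echo "backup.yaml not visible at $dir (is the dataset mounted?)"
        return 1
    fi
    found=$(backup_pool_name "$dir/backup.yaml")
    if [ "$found" != "$pool" ]; then
        echo "pool mismatch: backup.yaml says '$found', expected '$pool'"
        return 1
    fi
    echo "ok: $name on pool $pool"
}

# Example: check_import_ready datasilo1 default-minimal
# The third requirement could be checked with something like:
#   lxc list --format csv -c n | grep -qxF default-minimal
```

If the pool name does not match, the usual fix is to rename the pool or dataset back to what backup.yaml expects rather than to edit backup.yaml.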

Thanks, I really appreciate the time. I’m still going around in circles, most likely due to something that is not obvious to me but is in fact really simple. So apologies in advance.

I tried changing the backup.yaml info and deleted the folder in the old mount point, which got rid of the error, but then I got a new one:

$ lxd import default-minimal
Error: Checking snapshots: Failed to run: zfs get -H -r -o name name datasilo1/lxd2/containers/default-minimal: cannot open 'datasilo1/lxd2/containers/default-minimal': dataset does not exist
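The dataset named in that error appears to be the pool's ZFS source followed by containers/ and the instance name. A small sketch for checking whether that dataset exists before retrying the import; both function names are hypothetical, and the dataset list is read from stdin so the check can be fed from zfs list -H -o name.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative.

# expected_dataset POOL_SOURCE NAME
# Build the dataset path that the error message suggests LXD looks for.
expected_dataset() {
    printf '%s/containers/%s\n' "$1" "$2"
}

# dataset_present DATASET
# Read dataset names on stdin, succeed if DATASET is among them (exact match).
dataset_present() {
    grep -qxF -- "$1"
}

# Example:
#   zfs list -H -o name | dataset_present "$(expected_dataset datasilo1/lxd2 default-minimal)"
```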

I have added my pool structure for both the old and new datasets below. As part of the steps I did previously, I changed the name of the “old” dataset from datasilo1/lxd to datasilo1/lxd-old, and for the current LXD set-up I created a new dataset, datasilo1/lxd2.

There are two containers I want to try to save: “apps-services” and “default-minimal”

Below the zfs list are the relevant bits from the original backup.yaml.

$ zfs list | grep lxd

datasilo1/lxd-old                                                                                  13.2G  3.50T      128K  none
datasilo1/lxd-old/containers                                                                       13.0G  3.50T      128K  none
datasilo1/lxd-old/containers/apps-services                                                         12.7G  3.50T     12.9G  /var/snap/lxd/common/lxd/storage-pools/datasilo1/containers/apps-services
datasilo1/lxd-old/containers/default-minimal                                                        255M  3.50T      397M  /var/snap/lxd/common/lxd/storage-pools/datasilo1/containers/default-minimal
datasilo1/lxd-old/custom                                                                            128K  3.50T      128K  none
datasilo1/lxd-old/deleted                                                                           237M  3.50T      128K  none
datasilo1/lxd-old/deleted/containers                                                                128K  3.50T      128K  none
datasilo1/lxd-old/deleted/custom                                                                    128K  3.50T      128K  none
datasilo1/lxd-old/deleted/images                                                                    236M  3.50T      128K  none
datasilo1/lxd-old/deleted/images/87b2e5de9de0c2337045b609e6475ba30763f106c6e4c9aa666e6a960cd1820f   236M  3.50T      236M  none
datasilo1/lxd-old/deleted/virtual-machines                                                          128K  3.50T      128K  none
datasilo1/lxd-old/images                                                                            128K  3.50T      128K  none
datasilo1/lxd-old/snapshots                                                                         128K  3.50T      128K  none
datasilo1/lxd-old/virtual-machines                                                                  128K  3.50T      128K  none
datasilo1/lxd2                                                                                      681M  3.50T      128K  none
datasilo1/lxd2/containers                                                                          91.6M  3.50T      128K  none
datasilo1/lxd2/containers/test                                                                     91.5M  3.50T      376M  /var/snap/lxd/common/lxd/storage-pools/default/containers/test
datasilo1/lxd2/custom                                                                               128K  3.50T      128K  none
datasilo1/lxd2/deleted                                                                              295M  3.50T      128K  none
datasilo1/lxd2/deleted/containers                                                                   128K  3.50T      128K  none
datasilo1/lxd2/deleted/custom                                                                       128K  3.50T      128K  none
datasilo1/lxd2/deleted/images                                                                       294M  3.50T      128K  none
datasilo1/lxd2/deleted/images/566410cdfa395c7ae22489a9998e7626d95252451e6d2d3db30e52a77a4b3d83      294M  3.50T      294M  /var/snap/lxd/common/lxd/storage-pools/default/images/566410cdfa395c7ae22489a9998e7626d95252451e6d2d3db30e52a77a4b3d83
datasilo1/lxd2/deleted/virtual-machines                                                             128K  3.50T      128K  none
datasilo1/lxd2/images                                                                               294M  3.50T      128K  none
datasilo1/lxd2/images/cb3bea6bc536e3cb6ebbfd37cdd46203c4abe84bdc095aa5f418e9bc6e7b2327              294M  3.50T      294M  /var/snap/lxd/common/lxd/storage-pools/default/images/cb3bea6bc536e3cb6ebbfd37cdd46203c4abe84bdc095aa5f418e9bc6e7b2327
datasilo1/lxd2/virtual-machines                                                                     128K  3.50T      128K  none

Current edited config for default-minimal:

pool:
  config:  
    source: datasilo1/lxd-old
    volatile.initial_source: datasilo1/lxd-old
    zfs.pool_name: datasilo1/lxd-old   
  description: "" 
  name: default
  driver: zfs
  used_by: []
  status: Created   
  locations:
  - none

Original config for default-minimal:

pool:
  config:  
    source: datasilo1/lxd
    volatile.initial_source: datasilo1/lxd
    zfs.pool_name: datasilo1/lxd
  description: "" 
  name: datasilo1
  driver: zfs
  used_by: []
  status: Created   
  locations:
  - none

As I mentioned before, I’m a bit out of my depth and appreciate the effort and the time in helping me. Thank you.

Hard lesson learnt about those backups…

Brendan

Assuming you don’t care about your test container, your best bet since your current LXD is unconfigured would be to:

  • undo any changes you may have done to backup.yaml files
  • snap remove lxd
  • snap install lxd
  • zfs destroy -r datasilo1/lxd2
  • zfs rename datasilo1/lxd-old datasilo1/lxd2
  • snap install lxd
  • nsenter + mkdir + mount of the two datasets
  • lxd import apps-services
  • lxd import default-minimal

And then add the rest of your LXD config like networking, profiles, …

That’s how disaster recovery is meant to work. It’s expecting the pools and datasets to have the exact same name they used to, trying to mangle things into importing in another pool will just lead to trouble.
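The steps above can be sketched as a dry-run script. This is a rough outline under stated assumptions, not a tested procedure: the run and recover names are made up, /run/snapd/ns/lxd.mnt is assumed to be the snap's mount namespace, and zfs mount inside that namespace is an assumption that depends on the dataset's mountpoint property and on which tools are visible there. The lxc info call is only there to start the daemon so that its directory tree exists. Nothing is executed unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run sketch of the recovery sequence. By default every command is only
# printed; set APPLY=1 to execute (destructive, so review the output first).

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# recover SCRATCH RENAMED ORIGINAL POOL NAME...
#   SCRATCH   dataset of the throwaway pool to destroy
#   RENAMED   current name of the old dataset
#   ORIGINAL  name the old dataset had before the reinstall
#   POOL      LXD pool name recorded in backup.yaml
recover() {
    scratch=$1
    renamed=$2
    original=$3
    pool=$4
    shift 4
    ns=/run/snapd/ns/lxd.mnt   # assumed location of the snap's mount namespace
    base=/var/snap/lxd/common/lxd/storage-pools/$pool/containers

    run snap remove lxd
    run zfs destroy -r "$scratch"
    run zfs rename "$renamed" "$original"
    run snap install lxd
    run lxc info               # starts the daemon so its directory tree exists
    for name in "$@"; do
        run nsenter --mount="$ns" mkdir -p "$base/$name"
        run nsenter --mount="$ns" zfs mount "$original/containers/$name"   # assumed mount command
        run lxd import "$name"
    done
}

# Example (prints the plan only):
#   recover datasilo1/lxd2 datasilo1/lxd-old datasilo1/lxd datasilo1 apps-services default-minimal
```

ORIGINAL should be whatever the dataset was called before the reinstall, since the import expects the old names.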

Thank you.

But I’m not following. Not sure if I am missing the obvious here, so please bear with me. My progress and issues are below:

  • undo any changes you may have done to backup.yaml files [completed]
  • snap remove lxd [completed]
  • snap install lxd [completed]
  • zfs destroy -r datasilo1/lxd2 [completed]
  • zfs rename datasilo1/lxd-old datasilo1/lxd2 [not done]
    • Should this not be “zfs rename datasilo1/lxd-old datasilo1/lxd” as that was what the original was?
  • snap install lxd <<- stuck here as this is already installed
  • nsenter + mkdir + mount of the two datasets
    • there are no directories under /var/snap/lxd/common/lxd/ so not sure how to proceed.
  • lxd import apps-services
  • lxd import default-minimal

Additional query:

  • When do I run “lxd init”?

Ah sure, yeah, rename to whatever the dataset was before, I was just making a guess based on the output you pasted so far.

Try running lxc info; this should cause things to appear under /var/snap/lxd/common/lxd.

When doing disaster recovery, you never run lxd init.

A quick thank you! I’m away for a bit but will try this when I get back next week.

Gosh. Still very out of my depth here. So thank you for your time, it’s much appreciated.

I have managed to import my ‘default-minimal’ container and it can boot. All seems clean and working save some config issues (see below).

Not able to import ‘apps-services’ container. It complains about:

$ lxd import apps-services
Error: Create instance: Requested profile 'lanprofile' doesn't exist

Now reading your comment, “add the rest of your LXD config like networking, profiles” etc I think this is expected.

My Google skills are letting me down though. I cannot find a clear example of how to re-create a default set of profiles. Without this, even my basic default-minimal container is without a network/IP.

Can someone point me in the right direction please?

From there I believe I’ll be able to customize a profile ‘lanprofile’ as needed for my ‘apps-services’ container, which needs a local LAN IP and to be accessible to the local network. That container’s original networking config is as follows:

eth0:
  name: eth0
  nictype: macvlan
  parent: eno1
  type: nic

Thanks All.

lxc profile create NAME
lxc profile edit NAME

For each of the expected profiles.
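As an illustration for the lanprofile case mentioned earlier in the thread, here is a minimal sketch that generates the profile YAML from the eth0 device settings quoted above and feeds it to lxc profile edit on stdin. The macvlan_profile_yaml function name and the description text are made up; the device keys are taken from the container's posted config.

```shell
#!/bin/sh
# Hypothetical helper: print a profile definition with a macvlan NIC.
# macvlan_profile_yaml PARENT_INTERFACE
macvlan_profile_yaml() {
    parent=$1
    cat <<EOF
config: {}
description: LAN profile recreated after recovery
devices:
  eth0:
    name: eth0
    nictype: macvlan
    parent: $parent
    type: nic
EOF
}

# Example usage (not run here):
#   lxc profile create lanprofile
#   macvlan_profile_yaml eno1 | lxc profile edit lanprofile
```

Once the profile exists, the lxd import of the container that references it can be retried.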

Just a short note to say thank you for your time and assistance.

The last piece of the puzzle was recreating those profiles which I did as you instructed.

I then took the network information from the container’s backup.yaml file and added that to the new profile I created.

Restarted the LXD daemon and bamp! Off we go :)

Still some minor tweaks to do, but generally everything working again as expected.

Now I’m off to go back up my config!!!

Thank you.

Great to hear that the process (as rough as it still is) worked for you and that you got your data back.