[Help] Container present on ZFS but not showing up in list

despens · August 2, 2023, 4:00pm

I just restored a LXD setup from a full disk backup. The main Ubuntu operating system was stored on a regular ext4 formatted boot disk, and the default LXC storage pool was located on a zfs disk.

The backup schedules of the two disks were out of sync, so the zfs disk contains all the latest container data, but the boot disk was snapshotted less frequently. Now I see that LXC is not seeing one container that was created after the boot disk snapshot.

The container is visible as a zfs dataset:

$ sudo zfs list
[...many lines...]
lxc-storage/lxc-zfs/containers/lost-container
[...many lines...]

Is there a way to make lost-container known again to LXC?

All other containers from the backup function perfectly.

The system is running LXD 4.0.9 from snap.

Any help would be much appreciated.

stgraber · August 2, 2023, 4:56pm

LXD is no longer a LinuxContainers project and you should therefore go to Canonical’s forum to get LXD support these days. See https://linuxcontainers.org/lxd

In this instance though, I’d recommend you upgrade to LXD 5.0.x (5.0/stable track) which will then get you access to the newer lxd recover command. That command is capable of scanning storage pools for instances and volumes to recover and so should be able to help you out here.

If you must stick with LXD 4.0.x, lxd import may be able to handle it, but it’s quite a bit more manual and finicky than lxd recover.

despens · August 2, 2023, 7:32pm

Thank you Stephane. I am painfully aware that Canonical has removed the LXD project from Linuxcontainers, but I was hoping to still run into the usual friendly people here. On Canonical’s forum you need to rise in the ranks before you can even post… Enough complaining, I’ll try LXD 5 and will report back here.

despens · September 28, 2023, 2:26pm

OK, I set up a new server with LXD 5 just for the purpose of the recovery. The production server still runs LXD 4. I cloned the zfs storage pool and attached it to the “rescue server”.

The lxd recover command worked flawlessly and the missing containers are available and working on the rescue server.

However I am unable to transfer them to the production server because of new configuration keys in LXD 5 that are not present in LXD 4. (Related thread)

It looks like this:

# lxc copy rescue:container-1 container-1
Error: Failed instance creation: Error transferring instance data: Failed creating instance snapshot record "container-1/snapshot-1": Unknown configuration key: volatile.uuid.generation

On the rescue server I deleted this key volatile.uuid.generation from the container config, and verified with lxc config show that it is indeed not present anymore. But the error message on the import side remains. Probably because this is a new default key that gets automatically set? Is there a way to import this container back in LXD 4?

despens · October 3, 2023, 12:49pm

I was able to locate the configuration key in the exported tarball, file backup/index.yaml:

config:
  container:
    config:
      volatile.cloud-init.instance-id: 171c6842-b6d0-4d8b-99d1-166b2ff66645

So I extracted index.yaml, deleted the offending line, and updated the tarball with the new version of the file:

$ tar xvf container.tar backup/index.yaml
$ nano backup/index.yaml
$ touch backup/index.yaml
$ tar rf container.tar --owner root --group root backup/index.yaml

Verified the new version is in there:

$ tar tvf container.tar | grep index.yaml
-rw-r--r-- root/root     11310 2023-10-03 05:59 backup/index.yaml
-rw-r--r-- root/root     11158 2023-10-03 12:40 backup/index.yaml

Yet on import I get the same error message:

$ lxc import container.tar
Error: Failed importing backup: Failed creating instance record: Unknown configuration key: volatile.cloud-init.instance-id

despens · October 4, 2023, 12:12pm

Moving a container from an LXD 5 to an LXD 4 host

It is not possible to transfer a container from an LXD 5 to an LXD 4 host with the regular lxc copy function, because LXD 5 introduced some new metadata keys that LXD 4 rejects.

Instead, create a backup tarball of the container you want to transfer:

lxc export containername --compression none

Compression is disabled because editing a tarball is only possible when it is not compressed.

Next, move the tarball to the target LXD 4 host.

The offending metadata keys are contained in the tarball as yaml files. The tarball will typically contain all the snapshots associated with the container, and each snapshot might hold very similar metadata. You need to make sure to remove the offending metadata keys from every configuration file.

Extract the metadata yaml files from the tarball:

tar xvf containername.tar --wildcards '*/index.yaml' '*/backup.yaml'

This will create a new directory tree called backup. Examine all yaml files contained therein and remove lines with the following keys:

volatile.uuid.generation
volatile.cloud-init.instance-id
volatile.last_state.ready

There might be more incompatible keys; these are the 3 I found during my tests with a handful of containers.
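
If there are many snapshots, editing every yaml file by hand gets tedious. A minimal sketch of doing the same removal in one pass is shown here. The function name strip_lxd5_keys is made up for illustration, and the sketch assumes GNU find and GNU sed and that each of these keys sits on a single "key: value" line, so check the files afterwards:

```shell
# Hypothetical helper (not part of LXD): delete the lines holding the three
# known-incompatible keys from every yaml file below a directory.
# Assumes GNU sed (-i, -E) and one "key: value" pair per line.
strip_lxd5_keys() {
    dir="${1:-backup}"
    keys='volatile\.uuid\.generation|volatile\.cloud-init\.instance-id|volatile\.last_state\.ready'
    # Match optional indentation, one of the keys, then the colon.
    find "$dir" -type f -name '*.yaml' \
        -exec sed -i -E "/^[[:space:]]*(${keys}):/d" {} +
}
```

Called as strip_lxd5_keys backup from the directory where the files were extracted, this edits the files in place. Keeping a copy of the untouched tarball is a good idea.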

When you’re done with the edits, append the metadata files to the tarball:

tar rf containername.tar --owner root --group root backup
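
Since appending leaves the old copies of the files in the tarball next to the new ones, it can be reassuring to check what an extraction would actually produce. The following is only a sketch: remaining_lxd5_keys is a hypothetical name, and it relies on GNU tar extracting duplicate members in archive order, so that the appended copy overwrites the original one:

```shell
# Hypothetical check (not part of LXD): extract the yaml metadata into a
# scratch directory and print any lines that still hold a known-incompatible key.
# Returns 0 if such lines were found, 1 if the metadata looks clean.
remaining_lxd5_keys() {
    tarball="$1"
    scratch="$(mktemp -d)" || return 2
    # Later (appended) members overwrite earlier ones during extraction.
    tar xf "$tarball" -C "$scratch" --wildcards '*/index.yaml' '*/backup.yaml'
    grep -rnE 'volatile\.(uuid\.generation|cloud-init\.instance-id|last_state\.ready):' "$scratch"
    status=$?
    rm -rf "$scratch"
    return "$status"
}
```

Running remaining_lxd5_keys containername.tar should print nothing once all the edits have made it into the tarball.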

Now the tarball is ready to import:

lxc import containername.tar

Should you get an error message like

Error: Failed importing backup: Failed creating instance record: Unknown configuration key: some.metadata.key

you can remove this key from all yaml files and update the tarball again. Perhaps leave a comment here if you discover a new incompatible key.
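
To find where a newly reported key lives, something along these lines may help. Both function names are made up for illustration; the first one only assumes the error text has the form shown above, the second one assumes the yaml files were extracted into a directory such as backup:

```shell
# Hypothetical helper: read lxc output on stdin and print the key name
# from an "Unknown configuration key: ..." error line.
unknown_key_from_error() {
    sed -n 's/^.*Unknown configuration key: \([^[:space:]]*\).*$/\1/p'
}

# Hypothetical helper: list the extracted files that mention the given key.
files_with_key() {
    key="$1"
    dir="${2:-backup}"
    # -F treats the key as a fixed string, so the dots are not wildcards.
    grep -rlF -- "${key}:" "$dir"
}
```

For example, lxc import containername.tar 2>&1 | unknown_key_from_error prints just the key, and files_with_key some.metadata.key backup lists the yaml files that still need editing.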