Recover LXD storage pools and containers

harlanb2002 · August 13, 2020, 5:43am

I’ve searched and can’t find a way to get this done. Environment: Ubuntu 18.04 with lxd/lxd 3.0.0, installed with aptitude both originally and now. I created a storage pool of type dir that points to a directory on a ZFS pool, for redundancy in case of drive failure. Those drives are fine; it was the system drive that had the problems, and it has been replaced. I was able to copy the /var/lib/lxd directory, but putting it on the new system didn’t help. When I try to do "lxd import" it fails with the usual error of ‘Error: The container “mail” does not seem to exist on any storage pool’. I haven’t been able to figure out how to get the new installation of lxd to point the storage pool definition at the directory on the ZFS pool. Once I get that working, I’m going to try to import the containers.

Any help would be appreciated!

Thank You for your assistance.

Harlan…

stgraber · August 13, 2020, 1:33pm

You need to make sure that there is data at /var/lib/lxd/storage-pools/POOL/containers/mail/backup.yaml in this case.

POOL must match the name of the pool on the original system.
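
A quick way to check this before running lxd import is sketched below. It is only a minimal sketch: the function name has_backup_yaml is made up for illustration and is not part of LXD, and the /var/lib/lxd location in the example call assumes the apt-packaged layout described in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): succeed only if the container's
# backup.yaml exists and is non-empty under the given LXD directory.
# Arguments: LXD directory, pool name, container name.
has_backup_yaml() {
    lxd_dir=$1
    pool=$2
    container=$3
    # -s is true when the file exists and has a size greater than zero.
    [ -s "$lxd_dir/storage-pools/$pool/containers/$container/backup.yaml" ]
}

# Example call (replace POOL with the pool name from the original system):
#   has_backup_yaml /var/lib/lxd POOL mail && echo "backup.yaml found"
```

If the check fails, the pool directory is most likely empty or not mounted where LXD expects it.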

harlanb2002 · August 13, 2020, 2:35pm

Hi Stephane,
Thanks for responding. Another question: When I used the command lxc storage create data2dir dir source=/data/lxd_containers2 (I think I have the syntax messed up, but I’m sure you get the idea), how does that link from /var/lib/lxd/storage-pools/data2dir (which is just an empty directory on the original system) to the directory I provide? Because the volume is a ZFS pool, I’m not able to use the mount command.

Thanks,

Harlan…

stgraber · August 13, 2020, 4:03pm

It’s a bind-mount which LXD sets up, so you’ll want to do:

mount -o bind /data/lxd_containers2 /var/lib/lxd/storage-pools/data2dir
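
To confirm that the bind mount is actually in place, something like the following can help. It is a sketch under assumptions: is_mount_point is a hypothetical helper name, it reads the kernel's mountinfo format (where the fifth field is the mount point), and it does not handle paths containing spaces, which are octal-escaped in that file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if TARGET is listed as a mount point in a
# mountinfo-format file. The second argument is optional and defaults to
# /proc/self/mountinfo; it exists mainly so the check can be tried on a copy.
is_mount_point() {
    target=$1
    mountinfo=${2:-/proc/self/mountinfo}
    # Field 5 of each mountinfo line is the mount point.
    awk -v t="$target" '$5 == t { found = 1 } END { exit !found }' "$mountinfo"
}

# Example call with the path from this thread:
#   is_mount_point /var/lib/lxd/storage-pools/data2dir && echo "pool is mounted"
```

Where util-linux is installed, mountpoint -q on the same path is a ready-made alternative.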

harlanb2002 · August 13, 2020, 4:48pm

I ran the mount command you sent. However, lxd import and lxc ls just hang and appear to be doing nothing. Any suggestions on what else I am missing?

harlanb2002 · August 13, 2020, 4:54pm

I noticed this in the lxd.log file:
t=2020-08-13T11:44:18-0500 lvl=info msg="Initializing local database"
t=2020-08-13T11:44:18-0500 lvl=eror msg="Failed to start the daemon: Both legacy and new local database files exists"
t=2020-08-13T11:44:18-0500 lvl=info msg="Starting shutdown sequence"
t=2020-08-13T11:44:18-0500 lvl=info msg="Saving simplestreams cache"
t=2020-08-13T11:44:18-0500 lvl=info msg="Saved simplestreams cache"

ps -elf |grep -i lxd

4 S root 6288 6258 0 80 0 - 131759 futex_ 11:42 pts/1 00:00:00 /usr/lib/lxd/lxd import mail
4 S root 6346 1 0 80 0 - 150608 futex_ 11:44 ? 00:00:00 /usr/lib/lxd/lxd waitready --timeout=600
0 S root 8296 6373 0 80 0 - 3607 pipe_w 11:52 pts/0 00:00:00 grep --color=auto -i lxd

ps -elf |grep -i lxc

4 S root 4854 1 0 80 0 - 23885 futex_ 11:34 ? 00:00:00 /usr/bin/lxcfs /var/lib/lxcfs/
0 S root 8298 6373 0 80 0 - 3607 pipe_w 11:52 pts/0 00:00:00 grep --color=auto -i lxc

Not sure if it helps…

stgraber · August 13, 2020, 7:23pm

Ok, so you first need a working LXD daemon: lxc info needs to work and show you that the daemon is responding.

In your case, it isn’t because:

 t=2020-08-13T11:44:18-0500 lvl=eror msg="Failed to start the daemon: Both legacy and new local database files exists"

This suggests that you have a lxd.db or similar file on there causing LXD to get confused.
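
A rough way to look for that situation is sketched below. The file names are an assumption on top of what is said here: lxd.db as the legacy file and database/local.db as the new one, which matches my understanding of LXD 3.0 but is not confirmed in this thread. The function name has_both_local_dbs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if both the (assumed) legacy database file
# and the (assumed) new local database file exist under the LXD directory.
has_both_local_dbs() {
    lxd_dir=$1
    # Assumed names: lxd.db (legacy) and database/local.db (new).
    [ -e "$lxd_dir/lxd.db" ] && [ -e "$lxd_dir/database/local.db" ]
}

# Example call (apt-packaged layout from this thread):
#   has_both_local_dbs /var/lib/lxd && echo "both database files are present"
```

The sketch only reports the condition; deciding which file to move aside is best done with a backup of the whole directory in hand.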

harlanb2002 · August 14, 2020, 5:06am

I cleaned out and reset lxd. Using the mount command above, I was able to run lxd import for my containers; I have a few of them. When I started a container, I noticed it had no IP address. It dawned on me that I had forgotten to run lxd init, so I did that, got the bridge working, and connected the containers to the bridge using “lxc config device add mail eth0 nic nictype=bridged parent=lxdbr0 name=eth0”. I thought a reboot would be in order to make sure everything was working correctly. After the reboot, the lxd process is running, but not much else for lxd; there is no bridge either. Here is the error when I try to start it:

/etc/init.d/lxd start

[…] Starting lxd (via systemctl): lxd.serviceJob for lxd.service failed because the control process exited with error code.
See "systemctl status lxd.service" and "journalctl -xe" for details.
failed!
root@filesrv2:~# systemctl status lxd.service
● lxd.service - LXD - main daemon
Loaded: loaded (/lib/systemd/system/lxd.service; indirect; vendor preset: enabled)
Active: activating (start-post) since Thu 2020-08-13 23:46:50 CDT; 2min 23s ago
Docs: man:lxd(1)
Process: 6991 ExecStartPre=/usr/lib/x86_64-linux-gnu/lxc/lxc-apparmor-load (code=exited, status=0/SUCCESS)
Main PID: 7008 (lxd); Control PID: 7009 (lxd)
Tasks: 19
CGroup: /system.slice/lxd.service
├─7008 /usr/lib/lxd/lxd --group lxd --logfile=/var/log/lxd/lxd.log
└─7009 /usr/lib/lxd/lxd waitready --timeout=600

Aug 13 23:46:50 filesrv2 lxd[7008]: t=2020-08-13T23:46:50-0500 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
Aug 13 23:46:50 filesrv2 systemd[1]: lxd.service: Service hold-off time over, scheduling restart.
Aug 13 23:46:50 filesrv2 systemd[1]: lxd.service: Scheduled restart job, restart counter is at 2.
Aug 13 23:46:50 filesrv2 systemd[1]: Stopped LXD - main daemon.
Aug 13 23:46:50 filesrv2 systemd[1]: Starting LXD - main daemon…
Aug 13 23:48:17 filesrv2 lxd[7008]: t=2020-08-13T23:48:17-0500 lvl=warn msg="Failed connecting to global database (attempt 6): failed to create dqlite
Aug 13 23:48:30 filesrv2 lxd[7008]: t=2020-08-13T23:48:30-0500 lvl=warn msg="Failed connecting to global database (attempt 7): failed to create dqlite
Aug 13 23:48:43 filesrv2 lxd[7008]: t=2020-08-13T23:48:43-0500 lvl=warn msg="Failed connecting to global database (attempt 8): failed to create dqlite
Aug 13 23:48:56 filesrv2 lxd[7008]: t=2020-08-13T23:48:56-0500 lvl=warn msg="Failed connecting to global database (attempt 9): failed to create dqlite
Aug 13 23:49:08 filesrv2 lxd[7008]: t=2020-08-13T23:49:08-0500 lvl=warn msg="Failed connecting to global database (attempt 10): failed to create dqlit

What am I missing?

Thanks again for your help!

harlanb2002 · August 14, 2020, 5:50am

I thought I’d try something. I removed the lxd and lxd-client packages, then installed a newer version of lxd using snap. I ran lxd init, and the lxdbr0 bridge came up. I rebooted to make sure everything was fine; it took a while for the bridge to come up, but it finally did.

Would it be better to continue with version 4 under snap or go back to version 3 under apt? In either case, I still need to get my existing containers running reliably.

Thank You again for your help!!!

Harlan…

stgraber · August 14, 2020, 1:02pm

LXD 4.0 will be more stable in general and the snap makes it easier to switch to whatever version you want from that point on.

So with the snap version of LXD running, you’ll still need to mount your pool in the right place before lxd import will work.

In this case, it will look like:

mkdir /var/snap/lxd/common/lxd/storage-pools/data2dir
nsenter --mount=/run/snapd/ns/lxd.mnt mount -o bind /data/lxd_containers2 /var/snap/lxd/common/lxd/storage-pools/data2dir

At which point lxd import NAME should work.

harlanb2002 · August 14, 2020, 1:32pm

Hi,
I’m getting some errors. After the mount failed, I looked up the syntax just in case and came across the nsenter command that uses the pid; it failed too, even though the pid does actually exist and everything appears to be running, unless I’m missing something.

ls /data/lxd-containers2/

cache containers devices devlxd disks images logs lxd.db networks raft security server.crt server.key shmounts snapshots storage-pools

nsenter --mount=/run/snapd/ns/lxd.mnt mount -o bind /data/lxd-containers2 /var/snap/lxd/common/lxd/storage-pools/data2dir

mount: /var/snap/lxd/common/lxd/storage-pools/data2dir: special device /data/lxd-containers2 does not exist.

nsenter -t $(cat /var/snap/lxd/common/lxd.pid) -m

mesg: ttyname failed: No such device

lxc ls

+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

ls /var/snap/lxd/common/

config lxc lxcfs.pid lxd lxd.pid mntns ns shmounts state var

cat /var/snap/lxd/common/lxd.pid

5112

Again, I appreciate your time and effort in helping me!!!

Harlan…

stgraber · August 14, 2020, 1:50pm

Oh right, oops:

nsenter --mount=/run/snapd/ns/lxd.mnt mount -o bind /var/lib/snapd/hostfs/data/lxd-containers2 /var/snap/lxd/common/lxd/storage-pools/data2dir
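
The difference from the earlier command is the source path: inside the snap's mount namespace the host's files are reached under /var/lib/snapd/hostfs, so a host path has to be prefixed before it is passed to mount. A small sketch of that translation, with snap_host_path being a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the path under which a host directory is
# visible from inside the LXD snap's mount namespace.
snap_host_path() {
    case $1 in
        /*) printf '%s\n' "/var/lib/snapd/hostfs$1" ;;
        *)  echo "snap_host_path: need an absolute path" >&2
            return 1 ;;
    esac
}

# Example call:
#   snap_host_path /data/lxd-containers2
```

The helper only builds the string; the bind mount itself still has to be run through nsenter as in the command above.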

harlanb2002 · August 14, 2020, 2:19pm

nsenter --mount=/run/snapd/ns/lxd.mnt mount -o bind /var/lib/snapd/hostfs/data/lxd-containers2 /var/snap/lxd/common/lxd/storage-pools/data2dir

nsenter: reassociate to namespace 'ns/mnt' failed: Invalid argument

ls /var/lib/snapd/hostfs/data/

Learning backups lxd-containers2 websites
root@filesrv2:/# ls /var/snap/lxd/common/lxd/storage-pools
data2dir default

stgraber · August 14, 2020, 2:22pm

Hmm, that error sounds like you’re running it from a previous nsenter session somehow.
Maybe try from another terminal?
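
One way to check for a leftover nsenter session is to compare the mount namespace of the current shell with that of PID 1. This is only a sketch: same_mount_ns is a hypothetical helper name, and reading /proc/1/ns/mnt normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the two PIDs are in the same mount
# namespace. Returns 2 if either namespace link cannot be read.
same_mount_ns() {
    a=$(readlink "/proc/$1/ns/mnt") || return 2
    b=$(readlink "/proc/$2/ns/mnt") || return 2
    [ "$a" = "$b" ]
}

# Example call (as root): compare this shell with PID 1.
#   same_mount_ns $$ 1 || echo "not in the host mount namespace, or could not check"
```

If the two differ, typing exit or opening a fresh terminal gets back to the host namespace.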

harlanb2002 · August 14, 2020, 5:57pm

Hi Stephane,
I didn’t even know nsenter created a shell. But you were correct, I was unknowingly doing that. Thank You VERY much for your assistance. I am running again!!!

As soon as I can figure it out, I’ll mark this thread resolved.

Harlan…