Issue migrating backed up snap lxd server installation to new computer

Good Evening all,

The boot drive on my server is failing. I was able to successfully dd the entirety of the /var/snap/lxd/ directory. Since I wanted to upgrade to an NVMe drive anyway, I went ahead and did a mobo/CPU upgrade as well (not sure if this would affect anything).

I then did a fresh install of Ubuntu Server and ran through the following steps to attempt to restore the snap LXD install.

$ sudo apt purge lxd
$ sudo snap install lxd
$ sudo snap stop lxd
$ sudo rm -rf /var/snap/lxd/*
$ sudo cp -R /mnt/backup/extracted-lxd-dir/var/snap/lxd/* /var/snap/lxd/
$ sudo snap start lxd
$ lxc list
Error: Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: no such file or directory

I’m at a loss as to where to go from here. If I could recover just the containers from the backed-up snap install and start everything else fresh, I would be content. That said, I still have access to the failing drive and can likely boot from it and access the entirety of the original LXD install.

Thoughts?

Thanks,
Jonathon

EDITS:

Error: Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: no such file or directory
$ sudo lxd
WARN[09-11|00:03:16] CGroup memory swap accounting is disabled, swap limits will be ignored.
EROR[09-11|00:03:16] Failed to start the daemon: ZFS storage pool "lxd_zfs" could not be imported:
Error: ZFS storage pool "lxd_zfs" could not be imported:
$ sudo systemctl status lxd
Unit lxd.service could not be found.
$ sudo systemctl status snap.lxd.activate.service
● snap.lxd.activate.service - Service for snap application lxd.activate
   Loaded: loaded (/etc/systemd/system/snap.lxd.activate.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2019-09-10 23:52:23 MDT; 11min ago
  Process: 21393 ExecStart=/usr/bin/snap run lxd.activate (code=exited, status=0/SUCCESS)
 Main PID: 21393 (code=exited, status=0/SUCCESS)

Sep 10 23:52:23 homeserver systemd[1]: Starting Service for snap application lxd.activate...
Sep 10 23:52:23 homeserver lxd.activate[21393]: => Starting LXD activation
Sep 10 23:52:23 homeserver lxd.activate[21393]: ==> Loading snap configuration
Sep 10 23:52:23 homeserver lxd.activate[21393]: ==> Checking for socket activation support
Sep 10 23:52:23 homeserver lxd.activate[21393]: ==> Setting LXD socket ownership
Sep 10 23:52:23 homeserver lxd.activate[21393]: ==> LXD never started on this system, no need to start it now
Sep 10 23:52:23 homeserver systemd[1]: Started Service for snap application lxd.activate.
$ sudo systemctl status snap.lxd.
snap.lxd.activate.service    snap.lxd.daemon.service      snap.lxd.daemon.unix.socket
$ sudo systemctl status snap.lxd.daemon.service
● snap.lxd.daemon.service - Service for snap application lxd.daemon
   Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2019-09-10 23:52:42 MDT; 11min ago
  Process: 23219 ExecStart=/usr/bin/snap run lxd.daemon (code=exited, status=1/FAILURE)
 Main PID: 23219 (code=exited, status=1/FAILURE)

Sep 10 23:52:42 homeserver systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Sep 10 23:52:42 homeserver systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 10.
Sep 10 23:52:42 homeserver systemd[1]: Stopped Service for snap application lxd.daemon.
Sep 10 23:52:42 homeserver systemd[1]: snap.lxd.daemon.service: Start request repeated too quickly.
Sep 10 23:52:42 homeserver systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Sep 10 23:52:42 homeserver systemd[1]: Failed to start Service for snap application lxd.daemon.
$ sudo systemctl status snap.lxd.daemon.unix.socket
● snap.lxd.daemon.unix.socket - Socket unix for snap application lxd.daemon
   Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.unix.socket; enabled; vendor preset: enabled)
   Active: failed (Result: service-start-limit-hit) since Tue 2019-09-10 23:52:42 MDT; 11min ago
   Listen: /var/snap/lxd/common/lxd/unix.socket (Stream)

Sep 09 08:32:10 homeserver systemd[1]: Listening on Socket unix for snap application lxd.daemon.
Sep 10 23:52:42 homeserver systemd[1]: snap.lxd.daemon.unix.socket: Failed with result 'service-start-limit-hit'.
$
$
$ sudo cat /var/snap/lxd/common/lxd/logs/lxd.log
t=2019-09-10T23:52:41-0600 lvl=info msg="LXD 3.17 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2019-09-10T23:52:41-0600 lvl=info msg="Kernel uid/gid map:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - u 0 0 4294967295"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - g 0 0 4294967295"
t=2019-09-10T23:52:41-0600 lvl=info msg="Configured LXD uid/gid map:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - u 0 1000000 1000000000"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - g 0 1000000 1000000000"
t=2019-09-10T23:52:41-0600 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
t=2019-09-10T23:52:41-0600 lvl=info msg="Kernel features:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - netnsid-based network retrieval: no"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - uevent injection: no"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - seccomp listener: no"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - unprivileged file capabilities: yes"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - shiftfs support: no"
t=2019-09-10T23:52:41-0600 lvl=info msg="Initializing local database"
t=2019-09-10T23:52:41-0600 lvl=info msg="Starting /dev/lxd handler:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2019-09-10T23:52:41-0600 lvl=info msg="REST API daemon:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2019-09-10T23:52:41-0600 lvl=info msg="Initializing global database"
t=2019-09-10T23:52:41-0600 lvl=info msg="Initializing storage pools"
t=2019-09-10T23:52:41-0600 lvl=eror msg="Failed to start the daemon: ZFS storage pool \"lxd_zfs\" could not be imported: "
t=2019-09-10T23:52:41-0600 lvl=info msg="Starting shutdown sequence"
t=2019-09-10T23:52:41-0600 lvl=info msg="Stopping REST API handler:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/unix.socket
t=2019-09-10T23:52:41-0600 lvl=info msg="Stopping /dev/lxd handler:"
t=2019-09-10T23:52:41-0600 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2019-09-10T23:52:41-0600 lvl=info msg="Closing the database"
t=2019-09-10T23:52:41-0600 lvl=info msg="Unmounting temporary filesystems"
t=2019-09-10T23:52:41-0600 lvl=info msg="Done unmounting temporary filesystems"

I’ve since found this issue and had high hopes that it might work for me. https://github.com/lxc/lxd/issues/4222

As it stands I’m still getting: cannot import 'lxd_zfs': no such pool available
I’m not sure whether this is because my lxd_zfs dir is located within the /var/snap/lxd/common/lxd/storage-pools/ dir.

Here is the directory structure as well, in case it provides any more insight.

% sudo tree /var/snap/lxd/common/lxd -a -L 2
/var/snap/lxd/common/lxd
├── backups
├── cache
│   ├── instance_types.yaml
│   └── simplestreams.yaml
├── containers
├── database
│   ├── global
│   ├── global.bak
│   ├── local.db
│   └── local.db.bak
├── devices
├── devlxd
├── disks
├── images
│   ├── 2dd611e2689a8efc45807bd2a86933cf2da0ffc768f57814724a73b5db499eac
│   └── 2dd611e2689a8efc45807bd2a86933cf2da0ffc768f57814724a73b5db499eac.rootfs
├── logs
│   ├── delugeContainer
│   ├── embyContainer
│   ├── lxd.log
│   ├── lxd.log.1
│   ├── lxd.log.2.gz
│   ├── lxd.log.3.gz
│   ├── lxd.log.4.gz
│   ├── lxd.log.5.gz
│   ├── lxd.log.6.gz
│   ├── lxd.log.7.gz
│   ├── ownCloud
│   ├── ownCloud2
│   ├── piHole
│   ├── plexContainer
│   └── unifiController
├── networks
├── security
├── server.crt
├── server.key
├── shmounts -> /var/snap/lxd/common/shmounts/containers
├── .shutdown
├── snapshots
│   ├── delugeContainer
│   ├── embyContainer
│   ├── plexContainer
│   └── unifiController
└── storage-pools
    ├── default
    └── lxd_zfs

24 directories, 22 files

Hi!

See this post,

The deb package was not completely removed; you have accidentally been using the LXD client from the deb package with the LXD server from the snap package.
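
One way to spot this kind of mismatch is to look at the socket path in the error message. The following is only a rough sketch: the two socket paths are the ones that appear in this thread, and the helper names `lxd_flavour_from_socket` and `which_lxc_client` are made up for illustration.

```shell
# Map the socket path from an "lxc" error message to the packaging it implies.
# Both paths are taken from the error output and systemd units quoted in this thread.
lxd_flavour_from_socket() {
    case "$1" in
        /var/snap/lxd/common/lxd/unix.socket) echo "snap" ;;
        /var/lib/lxd/unix.socket)             echo "deb" ;;
        *)                                    echo "unknown" ;;
    esac
}

# Print the path of the lxc client the shell would actually run
# (hypothetical helper name, not part of LXD).
which_lxc_client() {
    command -v lxc || echo "no lxc client on PATH"
}
```

For the error in the first post, `lxd_flavour_from_socket /var/lib/lxd/unix.socket` prints `deb`, which suggests the client came from the deb package even though the server is the snap. Running `which_lxc_client`, `dpkg -l | grep lxd` and `snap list lxd` should help confirm which packages are still installed.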

Hey @simos! I really appreciate the reply.

You are correct, the lxd-client was not purged from the deb package. Per your linked post I went through the following steps:

sudo snap stop lxd
sudo snap remove lxd
sudo apt --purge --auto-remove lxd-client
sudo apt --purge --auto-remove lxd
sudo apt autoremove
sudo snap install lxd
sudo lxd init
lxc list

# Success, LXC is working again. 
# Now to try and replace the snap install with my backed up version

sudo snap stop lxd
sudo cp -R /mnt/extracted-lxd-dir/var/snap/lxd/common/lxd/ /var/snap/lxd/common/lxd/
sudo snap start lxd
lxc list

# LXC fails again
# Output appears to be the same but now referencing the proper install location
# Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory

Any further thoughts?

Thanks!

What does systemctl --failed and systemctl -a | grep lxd show?

You may also find some useful information in journalctl -u snap.lxd.daemon.unix.socket and journalctl -u snap.lxd.daemon.service.
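
If the journal is long, something like the sketch below may help pull out just the daemon's error lines. It assumes the log format seen earlier in this thread (`lvl=eror` entries and `Error:` lines); the function names are made up for illustration.

```shell
# Keep only lines that look like LXD errors, based on the log format
# shown in this thread ("lvl=eror" entries and "Error:" lines).
filter_lxd_errors() {
    grep -E 'lvl=eror|Error:'
}

# Show error lines from the last N journal entries of the snap daemon unit
# (default 50). Hypothetical helper; the unit name is the one from this thread.
lxd_recent_errors() {
    journalctl -u snap.lxd.daemon.service --no-pager -n "${1:-50}" | filter_lxd_errors
}
```

Calling `lxd_recent_errors 200` would scan the last 200 journal lines for the daemon unit; `filter_lxd_errors` can also be fed a saved log, for example `filter_lxd_errors < /var/snap/lxd/common/lxd/logs/lxd.log`.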

@stgraber here is the requested output from the above commands. One thing to note is that autocomplete does not offer a snap.lxd.daemon.unix.socket journal. This is what’s available for the snap.lxd. journals:

~ % journalctl -u snap.lxd.   
snap.lxd.activate.service  snap.lxd.daemon.service
~ % systemctl --failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● snap.lxd.daemon.service     loaded failed failed Service for snap application lxd.daemon
● snap.lxd.daemon.unix.socket loaded failed failed Socket unix for snap application lxd.daemon

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

2 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
~ % systemctl -a | grep lxd
  run-snapd-ns-lxd.mnt.mount                                                                                                                          loaded    active   mounted   /run/snapd/ns/lxd.mnt
  snap-lxd-11964.mount                                                                                                                                loaded    active   mounted   Mount unit for lxd, revision 11964
  snap-lxd-11985.mount                                                                                                                                loaded    active   mounted   Mount unit for lxd, revision 11985
  snap.lxd.activate.service                                                                                                                           loaded    inactive dead      Service for snap application lxd.activate
● snap.lxd.daemon.service                                                                                                                             loaded    failed   failed    Service for snap application lxd.daemon
● snap.lxd.daemon.unix.socket                                                                                                                         loaded    failed   failed    Socket unix for snap application lxd.daemon
~ % journalctl -u snap.lxd.daemon.service -n 20
-- Logs begin at Tue 2019-09-03 10:16:31 MDT, end at Wed 2019-09-25 13:25:30 MDT. --
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Setting up LVM configuration
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Rotating logs
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Setting up ZFS (0.7)
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Escaping the systemd cgroups
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ====> Detected cgroup V1
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Escaping the systemd process resource limits
Sep 19 15:16:54 homeserver lxd.daemon[16779]: ==> Disabling shiftfs on this kernel (auto)
Sep 19 15:16:54 homeserver lxd.daemon[16779]: => Re-using existing LXCFS
Sep 19 15:16:54 homeserver lxd.daemon[16779]: => Starting LXD
Sep 19 15:16:54 homeserver lxd.daemon[16779]: t=2019-09-19T15:16:54-0600 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will
Sep 19 15:16:54 homeserver lxd.daemon[16779]: t=2019-09-19T15:16:54-0600 lvl=eror msg="Failed to start the daemon: ZFS storage pool \"lxd_zfs\" cou
Sep 19 15:16:54 homeserver lxd.daemon[16779]: Error: ZFS storage pool "lxd_zfs" could not be imported:
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 5.
Sep 19 15:16:55 homeserver systemd[1]: Stopped Service for snap application lxd.daemon.
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Start request repeated too quickly.
Sep 19 15:16:55 homeserver systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Sep 19 15:16:55 homeserver systemd[1]: Failed to start Service for snap application lxd.daemon.

Ok, you should try systemctl restart snap.lxd.daemon.unix.socket. That should get the socket file back online. Then you can try lxc list and see if maybe the daemon feels like starting this time.

Your log indicates that the last startup attempt was about a week ago and the daemon wasn’t happy due to a missing ZFS pool, but maybe that has been corrected since.

That doesn’t appear to work. I think at this point I’m best off starting from scratch and trying to boot from the failing drive to make the appropriate images of all my containers. The lxd_zfs pool was created on my old install via lxd init, so I was expecting to be able to drop the old install onto the new drive, but apparently that’s not quite the case.

Depends on what was backing that pool. If it was a file backed pool and you have the file in /var/snap/lxd/common/lxd/disks then that should be fine.

If it was backed by some physical disk/partition, then just restoring the content of /var/snap/lxd isn’t enough, you also need to get that zpool back in place somehow.
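
As a rough sketch of how to check which case applies: the helper below just lists image files in the disks directory. It assumes a file-backed pool would show up there as a `.img` file (for example `lxd_zfs.img`) and that the `zpool` tool is available on the host; the function name is made up for illustration.

```shell
# List candidate file-backed pool images in the LXD disks directory
# (hypothetical helper name; the default path is the snap location from this thread).
find_pool_images() {
    dir="${1:-/var/snap/lxd/common/lxd/disks}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    find "$dir" -maxdepth 1 -type f -name '*.img'
}

# Manual follow-up steps, to be run as root once you know what backs the pool:
#   zpool import                                            # list importable pools on attached disks/partitions
#   zpool import -d /var/snap/lxd/common/lxd/disks          # list importable pools found in image files there
#   zpool import -d /var/snap/lxd/common/lxd/disks lxd_zfs  # import the file-backed pool by name
#   systemctl restart snap.lxd.daemon.unix.socket           # then let LXD try to start again
```

If `find_pool_images` prints nothing, the pool was probably not file-backed (or the image was not part of the backup), and plain `zpool import` with the old disk attached is the place to look.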