Not listing containers after restarting

root@cpu-5174:~# fuser -u /var/lib/lxd/unix.socket
/var/lib/lxd/unix.socket: 20744(root)

Here is the output.

Pardon my typing laziness; I am using voice on my phone to respond because I’m on my way to a client to explain why his server is not working. I have four other servers with the same problem, and in every case it happened on the reboot after an upgrade. Interestingly, I found two snap installations on the servers with the problem. Perhaps, since the snap installation is now standard, the upgrade is bringing it in anyway.
This dual LXD installation may be the cause of the problem. But no one installed it; these servers had been working for years until the apt upgrade broke them.

root@cpu-5174:~# ps aux | grep 20744
root 20744 0.0 0.0 1142996 36788 pts/1 Sl 12:46 0:00 /root/go/bin/lxd --logfile /var/log/lxd/lxd.log

It's running from the latest version of LXD.

One more issue: when I create a new container as a test, I get this:
root@cpu-blabla:~# lxc launch images:verygames-debian-9 test
Creating test
Error: Failed container creation: No storage pool found. Please create a new storage pool

But a storage pool already exists:

root@cpu-5174:~# zfs list default
default 171G 703G 19K none

I have a request here, and this request includes myself as a recipient.
@arunksasi has an issue with figuring out how to get back the containers.
Let’s focus on that. We can talk about Debian packaging in another thread, or after his case is resolved.

@arunksasi, the case is that you have LXD 3.8 (compiled from source) on Debian, and at some point recently you lost access to the containers. When you run lxc list, you get an empty list, as if this is a brand new installation of LXD.

When you run zfs list, you can see that the containers are showing up (their storage). This is good.

I wonder whether the containers are actually running. Can you run the following? It should show the host’s /sbin/init, and if there is a container that is running, there should be a few more /sbin/init.

ps aux | grep /sbin/init
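
A side note on the command above: grep also matches its own process, so one of the lines in the output will be the grep itself. If you prefer a plain count, here is a minimal sketch. The function name count_init_processes is made up for illustration, and it assumes each init shows up on the host with /sbin/init as the first word of its command line.

```shell
#!/bin/sh
# Hypothetical helper: reads one command line per input line and counts
# those whose first word is exactly /sbin/init. A line such as
# "grep /sbin/init" is not counted because its first word is "grep".
count_init_processes() {
    awk '$1 == "/sbin/init" { n++ } END { print n + 0 }'
}

# Usage on the host (prints only the command line of every process):
#   ps -eo args= | count_init_processes
```

A result of 1 would mean only the host's init is running; anything higher suggests that containers are up.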

Next, we want to figure out what LXD knows about the storage. Most likely, LXD forgot all about the storage, but we need to verify. The following will show whether there is a storage pool, and what its name is. Let’s say it is lxd.

$ lxc storage list

Then, get some info on this storage pool:

lxc storage info lxd

After you perform these, we will know whether LXD has access to the ZFS storage pool, or whether the ZFS storage pool is orphaned. And if it is orphaned, we will try to figure out how to reconnect it to LXD.
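
The check described here can be sketched as a small script, if that is easier to run on several servers. This is only a sketch: the function name pool_state and the LXC/ZFS override variables are made up, and it assumes that lxc storage info and zfs list exit with a non-zero status when the named pool does not exist.

```shell
#!/bin/sh
# Hypothetical helper. LXC and ZFS can be overridden, e.g. for a dry run.
LXC="${LXC:-lxc}"
ZFS="${ZFS:-zfs}"

# pool_state LXD_POOL ZFS_POOL
# Prints "attached" if LXD knows a storage pool named LXD_POOL,
# "orphaned" if LXD does not but the ZFS pool ZFS_POOL exists,
# and "missing" if neither is found.
# Note: "attached" only means LXD knows the name; it does not prove
# that the LXD pool is backed by ZFS_POOL.
pool_state() {
    lxd_pool="$1"
    zfs_pool="$2"
    if "$LXC" storage info "$lxd_pool" >/dev/null 2>&1; then
        echo "attached"
    elif "$ZFS" list "$zfs_pool" >/dev/null 2>&1; then
        echo "orphaned"
    else
        echo "missing"
    fi
}

# Usage, with the example LXD pool name "lxd" and the ZFS pool "default":
#   pool_state lxd default
```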

root@cpu-blabla:~# ps aux | grep /sbin/init 
root         1  0.0  0.0  57444  7064 ?        Ss   10:45   0:03 /sbin/init noquiet nosplash
root     29767  0.0  0.0  12780   944 pts/0    S+   13:45   0:00 grep /sbin/init
Error: not found

These are the outputs.

The storage pool has disappeared :disappointed:

Here is a discussion about reconnecting the lost storage pool back into LXD,

I would suggest first looking into /var/lib/lxd/ to check whether there is a backup of the LXD configuration.
The pathname could be /var/lib/lxd/database/. There should be some .bak files and directories in there.
One quick and dirty attempt to bring up the storage pool would be the following (untested by me):

  1. Take a backup of all the content in /var/lib/lxd/database/.
  2. Stop LXD.
  3. Restore the backup (i.e. cp /var/lib/lxd/database/local.db.bak /var/lib/lxd/database/local.db, and the same with the directory).
  4. Start LXD.
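
Steps 1 and 3 above could look roughly like this. Again untested against a real LXD database: the function names and the backup destination are made up for illustration, and how you stop and start LXD (steps 2 and 4) depends on how it was installed, so that part is left out.

```shell
#!/bin/sh
# Sketch only: function names and the backup destination are hypothetical.

# Step 1: copy the whole database directory aside, preserving attributes.
# Refuses to run if the destination already exists, so an earlier backup
# is never overwritten.
backup_database() {
    src="$1"
    dest="$2"
    if [ -e "$dest" ]; then
        echo "backup destination already exists: $dest" >&2
        return 1
    fi
    cp -a "$src" "$dest"
}

# Step 3: replace DIR/NAME with DIR/NAME.bak (works for a file or a directory).
# The current DIR/NAME is deleted first, so run backup_database beforehand.
restore_from_bak() {
    dir="$1"
    name="$2"
    if [ ! -e "$dir/$name.bak" ]; then
        echo "no backup found: $dir/$name.bak" >&2
        return 1
    fi
    rm -rf -- "$dir/$name"
    cp -a "$dir/$name.bak" "$dir/$name"
}

# Intended order, with LXD stopped before the restore:
#   backup_database /var/lib/lxd/database /root/lxd-database-backup
#   restore_from_bak /var/lib/lxd/database local.db
```

The restore removes the current file or directory before copying the .bak over it, because copying a directory onto an existing directory would nest it inside instead of replacing it.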
Starting lxd
root@cpu-blabla:/var/lib/lxd/database# WARN[05-16|14:15:36] AppArmor support has been disabled because of lack of kernel support 
EROR[05-16|14:15:36] Failed to start the daemon: Failed to migrate data to global database: failed to insert row 1 into config: UNIQUE constraint failed: config.key 
WARN[05-16|14:15:36] Failed to dump database to disk: failed to write database file: open /var/lib/lxd/database/global/db.bin: no such file or directory 
Error: Failed to migrate data to global database: failed to insert row 1 into config: UNIQUE constraint failed: config.key

I tried the steps but am getting the above error.

Now it is showing the containers, but they are not starting.

root@cpu-blabla:/var/lib/lxd/database# lxc list
| vm620353 | STOPPED | | | PERSISTENT | |
| vm629113 | STOPPED | | | PERSISTENT | |
| vm638724 | STOPPED | | | PERSISTENT | |

root@cpu-blabla:/var/lib/lxd/database# lxc start  vm620353
EROR[05-16|14:26:05] The stop hook failed                     container=vm620353 err="Container is already running a start operation"
EROR[05-16|14:26:05] Failed starting container                used=2019-05-16T14:25:14+0000 stateful=false project=default name=vm620353 action=start created=2019-05-15T13:01:52+0000 ephemeral=false
Error: Failed to run: /root/go/bin/lxd forkstart vm620353 /var/lib/lxd/containers /var/log/lxd/vm620353/lxc.conf:

Finally, it started :slight_smile:

The containers are showing now:

|   NAME   |  STATE  |         IPV4          |                     IPV6                      |    TYPE    | SNAPSHOTS |
| vm620353 | RUNNING |XXXXX (eth0)    | fd42:639a:4075:9569:216:3eff:fe9c:716b (eth0) | PERSISTENT |           |
| vm629113 | RUNNING | XXXXX (eth0) | fd42:639a:4075:9569:216:3eff:feac:d4bf (eth0) | PERSISTENT |           |
| vm638724 | RUNNING |XXXXXX (eth0) | fd42:639a:4075:9569:216:3eff:fea9:78e1 (eth0) | PERSISTENT |           |

Thank you for your support @simos