Failed to start the daemon: failed to open cluster database: force initial raft log replay: driver: bad connection

Hi,

I have a system where lxd won’t start. It looks like the global database is broken.
Any idea how to fix this?

# tail -f /var/snap/lxd/common/lxd/logs/lxd.log
t=2020-02-21T07:08:51+0100 lvl=info msg="LXD 3.21 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2020-02-21T07:08:51+0100 lvl=info msg="Kernel uid/gid map:" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - u 0 0 4294967295" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - g 0 0 4294967295" 
t=2020-02-21T07:08:51+0100 lvl=info msg="Configured LXD uid/gid map:" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - u 0 1000000 1000000000" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - g 0 1000000 1000000000" 
t=2020-02-21T07:08:51+0100 lvl=info msg="Kernel features:" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - netnsid-based network retrieval: no" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - uevent injection: no" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - seccomp listener: no" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - seccomp listener continue syscalls: no" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - unprivileged file capabilities: yes" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - cgroup layout: hybrid" 
t=2020-02-21T07:08:51+0100 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored" 
t=2020-02-21T07:08:51+0100 lvl=info msg=" - shiftfs support: disabled" 
t=2020-02-21T07:08:51+0100 lvl=info msg="Initializing local database" 
t=2020-02-21T07:08:52+0100 lvl=info msg="Starting /dev/lxd handler:" 
t=2020-02-21T07:08:52+0100 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-02-21T07:08:52+0100 lvl=info msg="REST API daemon:" 
t=2020-02-21T07:08:52+0100 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-02-21T07:08:52+0100 lvl=info msg=" - binding TCP socket" socket=[::]:8443
t=2020-02-21T07:08:52+0100 lvl=info msg="Initializing global database" 
t=2020-02-21T07:09:02+0100 lvl=eror msg="Failed to start the daemon: failed to open cluster database: force initial raft log replay: driver: bad connection" 
t=2020-02-21T07:09:02+0100 lvl=info msg="Starting shutdown sequence" 
t=2020-02-21T07:09:02+0100 lvl=info msg="Stopping REST API handler:" 
t=2020-02-21T07:09:02+0100 lvl=info msg=" - closing socket" socket=[::]:8443
t=2020-02-21T07:09:02+0100 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-02-21T07:09:02+0100 lvl=info msg="Stopping /dev/lxd handler:" 
t=2020-02-21T07:09:02+0100 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/devlxd/sock

Hello,

first time we see this I think, might be a regression. Is this a standalone node? Would it be possible for you to mail me a tarball of /var/snap/lxd/common/lxd/database? free.ekanayaka@canonical.com

It is a standalone node. Just sent the tarball.

Okay, thanks. I’m not able to reproduce the issue here. So there’s something odd going on. What architecture are you running?

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

With zfs

dpkg --print-architecture ? amd64?

Yep. amd64

Not sure it’s gonna help, but can you try to enable debug logging?

sudo snap set lxd daemon.debug=true

and paste here the last lines of /var/snap/lxd/common/lxd/logs/lxd.log.

t=2020-02-21T10:49:32+0100 lvl=dbug msg="Not unmounting temporary filesystems (containers are still running)" 
t=2020-02-21T10:49:33+0100 lvl=info msg="LXD 3.21 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2020-02-21T10:49:33+0100 lvl=info msg="Kernel uid/gid map:" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - u 0 0 4294967295" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - g 0 0 4294967295" 
t=2020-02-21T10:49:33+0100 lvl=info msg="Configured LXD uid/gid map:" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - u 0 1000000 1000000000" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - g 0 1000000 1000000000" 
t=2020-02-21T10:49:33+0100 lvl=info msg="Kernel features:" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - netnsid-based network retrieval: no" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - uevent injection: no" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - seccomp listener: no" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - seccomp listener continue syscalls: no" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - unprivileged file capabilities: yes" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - cgroup layout: hybrid" 
t=2020-02-21T10:49:33+0100 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored" 
t=2020-02-21T10:49:33+0100 lvl=info msg=" - shiftfs support: disabled" 
t=2020-02-21T10:49:33+0100 lvl=info msg="Initializing local database" 
t=2020-02-21T10:49:33+0100 lvl=dbug msg="Initializing database gateway" 
t=2020-02-21T10:49:33+0100 lvl=dbug msg="Start database node" address= id=1 role=voter
t=2020-02-21T10:49:34+0100 lvl=info msg="Starting /dev/lxd handler:" 
t=2020-02-21T10:49:34+0100 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-02-21T10:49:34+0100 lvl=info msg="REST API daemon:" 
t=2020-02-21T10:49:34+0100 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-02-21T10:49:34+0100 lvl=info msg=" - binding TCP socket" socket=[::]:8443
t=2020-02-21T10:49:34+0100 lvl=info msg="Initializing global database" 
t=2020-02-21T10:49:34+0100 lvl=dbug msg="Dqlite: connected address=1 id=1 attempt=0" 
t=2020-02-21T10:49:44+0100 lvl=eror msg="Failed to start the daemon: failed to open cluster database: force initial raft log replay: driver: bad connection" 
t=2020-02-21T10:49:44+0100 lvl=info msg="Starting shutdown sequence" 
t=2020-02-21T10:49:44+0100 lvl=info msg="Stopping REST API handler:" 
t=2020-02-21T10:49:44+0100 lvl=info msg=" - closing socket" socket=[::]:8443
t=2020-02-21T10:49:44+0100 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-02-21T10:49:44+0100 lvl=info msg="Stopping /dev/lxd handler:" 
t=2020-02-21T10:49:44+0100 lvl=info msg=" - closing socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-02-21T10:49:44+0100 lvl=dbug msg="Stop database gateway"

Somehow this problem disappeard at
t=2020-02-21T19:00:00+0100