I have a machine where I’m unable to start a couple of specific containers or list containers at all, and several internal LXD functions are broken (snapshot expiry, automated snapshots, etc.). I suspect database corruption, but I can’t find any evidence of it. The log shows:
t=2020-11-03T22:15:50+0000 lvl=eror msg="Problem loading instances list" err="Failed to fetch instances: id"
t=2020-11-03T22:17:50+0000 lvl=eror msg="Failed to load containers for scheduled snapshots" err="Failed to fetch instances: id"
t=2020-11-03T22:17:50+0000 lvl=eror msg="Failed to load instances for snapshot expiry" err="Failed to fetch instances: id"
t=2020-11-03T22:18:50+0000 lvl=eror msg="Failed to load instances for snapshot expiry" err="Failed to fetch instances: id"
When I list containers with --fast, I get Error: name as the only response. When I try to start one of the broken containers, I get:
Error: Common start logic: Failed to get snapshots: Failed to fetch instance_snapshots: id
I get a message about using --show-log, but there’s nothing useful in there (the output is the same as lxc info without --show-log).
I can dump the entire global database without issue. I have distilled the problem down to the following query, which fails when given one of the broken container names but works (at least, returns an empty set) for other containers:
lxd sql global "SELECT instances_snapshots.id, projects.name AS project, instances.name AS instance, instances_snapshots.name, instances_snapshots.creation_date, instances_snapshots.stateful, coalesce(instances_snapshots.description, ''), instances_snapshots.expiry_date FROM instances_snapshots JOIN projects ON instances.project_id = projects.id JOIN instances ON instances_snapshots.instance_id = instances.id WHERE instance = 'BROKEN_CONTAINER_HERE' ORDER BY projects.id, instances.id, instances_snapshots.name"
Error: Failed to execute query: id
Adding --debug to this command doesn’t do anything helpful except repeat the query back to me.
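One way to narrow this down further might be to select each instances_snapshots column on its own and see which one triggers the error. This is only a minimal sketch, assuming the instance name contains no single quotes; build_column_query, probe_columns and the LXD_SQL override are hypothetical names for illustration, not part of LXD. Whether the failure can actually be isolated to a single column is itself an assumption.

```shell
#!/bin/sh
# Sketch: bisect the failing snapshot query one column at a time.
# build_column_query and probe_columns are hypothetical helper names.

# Print a query selecting a single instances_snapshots column for one instance.
# Assumes the instance name contains no single quotes.
build_column_query() {
    column=$1
    instance=$2
    printf "SELECT instances_snapshots.%s FROM instances_snapshots JOIN instances ON instances_snapshots.instance_id = instances.id WHERE instances.name = '%s'" "$column" "$instance"
}

# Run the query once per column and print ok/FAIL for each.
# LXD_SQL is a hypothetical override (e.g. LXD_SQL=echo for a dry run);
# by default the real "lxd sql global" command is used.
probe_columns() {
    instance=$1
    for column in id name creation_date stateful description expiry_date; do
        query=$(build_column_query "$column" "$instance")
        if ${LXD_SQL:-lxd sql global} "$query" >/dev/null 2>&1; then
            echo "ok:   $column"
        else
            echo "FAIL: $column"
        fi
    done
}
```

After sourcing the file, probe_columns BROKEN_CONTAINER_HERE prints one ok or FAIL line per column, which should at least show whether the error follows a particular column or the row as a whole.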