Hi everyone,
Thanks for LXC/D! I’m having an issue where our containers are not listing properly and other functions, such as snapshots, are not working. We’re running Ubuntu 20.04 on both the host and the containers, with an encrypted, mirrored ZFS storage pool.
I seem to have narrowed the issue down to database corruption affecting a specific container by doing the following:
lxc list
+-------------+---------+--------------------+---------------------------+-----------+-----------+
| container_name | ERROR | | | CONTAINER | 0 |
+-------------+---------+--------------------+---------------------------+-----------+-----------+
Let’s look at the logs:
sudo tail -f /var/snap/lxd/common/lxd/logs/lxd.log
t=2021-04-29T10:43:58-0700 lvl=eror msg="Failed to list instance snapshots" err="Failed to fetch instance_snapshots: id" instance=container_name project=default
A look at the specific container:
lxd sql global "SELECT id, instance_id, name, creation_date, stateful, description, expiry_date FROM instances_snapshots WHERE instance_id = 14"
error: Got a row error: id
So I decided to open up sqlite to try to find the problematic row:
sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin
sqlite> SELECT id, instance_id, name, creation_date, stateful, description, expiry_date FROM instances_snapshots WHERE instance_id = 14 AND id > 2360 AND id < 2600;
2368|14|snapshot-65|2021-04-17 19:00:08.875615999+00:00|0||2021-05-17 12:00:08.874972385-07:00
2380|14|snapshot-66|2021-04-18 01:00:17.732227872+00:00|0||2021-05-17 18:00:17.731749164-07:00
Error: database disk image is malformed
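The manual row hunt above could also be scripted. Below is a rough Python sketch, not a tested recovery tool: it probes each id in a range with its own query and records the ids whose read raises an error. The function name `find_unreadable_ids` is hypothetical, the sketch assumes the table has an integer `id` column as in the query above, and it should only be pointed at a copy of db.bin, never the live file. A point lookup on a damaged b-tree may also quietly return nothing rather than fail, so an empty result is not proof that a range is healthy.

```python
import sqlite3


def find_unreadable_ids(db_path, table, lo, hi):
    """Probe ids lo..hi one at a time and collect those whose read fails.

    Hypothetical helper: assumes `table` has an integer `id` column and
    that `db_path` points at a COPY of the database file.
    """
    # Open read-only so the probe cannot modify the file being inspected.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    failures = []
    try:
        for row_id in range(lo, hi + 1):
            try:
                conn.execute(
                    f'SELECT * FROM "{table}" WHERE id = ?', (row_id,)
                ).fetchall()
            except sqlite3.DatabaseError as exc:
                # e.g. "database disk image is malformed"
                failures.append((row_id, str(exc)))
    finally:
        conn.close()
    return failures
```

Called as `find_unreadable_ids("/path/to/copy-of-db.bin", "instances_snapshots", 2360, 2600)`, it returns a list of `(id, error message)` pairs; the path shown is a placeholder.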
Doing an integrity check reveals:
sqlite> pragma integrity_check;
*** in database main ***
On tree page 126 cell 65: Rowid 2359 out of order
On tree page 43 cell 106: Rowid 71835 out of order
On tree page 43 cell 97: Rowid 71291 out of order
On tree page 43 cell 91: Rowid 70960 out of order
On tree page 43 cell 90: Rowid 70911 out of order
On tree page 43 cell 79: Rowid 70272 out of order
On tree page 43 cell 76: Rowid 70097 out of order
On tree page 43 cell 51: Rowid 63037 out of order
On tree page 43 cell 50: Rowid 62740 out of order
On tree page 354 cell 51: Rowid 68748 out of order
row 66 missing from index sqlite_autoindex_instances_snapshots_1
wrong # of entries in index sqlite_autoindex_instances_snapshots_1
row 66 missing from index sqlite_autoindex_storage_volumes_snapshots_2
row 66 missing from index sqlite_autoindex_storage_volumes_snapshots_1
row 93 missing from index sqlite_autoindex_storage_volumes_snapshots_2
--- continues ----
row 241 missing from index sqlite_autoindex_storage_volumes_snapshots_2
row 243 missing from index sqlite_autoindex_storage_volumes_snapshots_2
row 248 missing from index sqlite_autoindex_storage_volumes_snapshots_2
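Since the integrity check output is long, a small script can summarize it. The following Python sketch is only an illustration: it groups the two message shapes seen above ("Rowid ... out of order" and "row ... missing from index ...") and leaves anything else in a separate list. The function name `summarize_integrity_report` and the regular expressions are assumptions based solely on the lines pasted in this post, not on any documented output format.

```python
import re
from collections import Counter

# Patterns modeled on the integrity_check lines quoted in this post.
ROW_MISSING = re.compile(r"^row (\d+) missing from index (\S+)$")
OUT_OF_ORDER = re.compile(
    r"^On tree page (\d+) cell (\d+): Rowid (\d+) out of order$"
)


def summarize_integrity_report(lines):
    """Count missing-index rows per index and out-of-order cells per page."""
    missing_by_index = Counter()
    out_of_order_by_page = Counter()
    other = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ROW_MISSING.match(line)
        if match:
            missing_by_index[match.group(2)] += 1
            continue
        match = OUT_OF_ORDER.match(line)
        if match:
            out_of_order_by_page[int(match.group(1))] += 1
            continue
        # Anything unrecognized (e.g. "wrong # of entries ...") is kept as-is.
        other.append(line)
    return missing_by_index, out_of_order_by_page, other
```

Feeding it the saved output of `pragma integrity_check;`, one line per list element, gives a per-index and per-page count that makes it easier to see which tables are affected.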
Any advice on how to recover/repair/rebuild the database? There are more snapshots for that container, but the db seems corrupted and out of sync. Thanks!