Lxc snapshot and lxc start Error: Instance snapshot record count doesn't match instance snapshot volume record count

Yeah, OK, then maybe a --repair flag would be the thing?
If this happens in a production environment, there is maybe too much sweat involved to manually repair the db entries if you've never done this before.
At least the error message should give more details about where to look and what action to take to repair this.

BTW I have no clue how to repair this. I just reverted to 5.1, so if I update to 5.2 again I'll have the problem again. I also have no clue why the instance and volume counts drifted apart.

How old are the problem containers? It may be a fault that crept in a while back. This is all conjecture currently as I'm not at my PC.

This new consistency check, performed when generating the backup.yaml file at start time, is what triggers the error.

It's new in LXD 5.2.

But the actual record mismatch is likely to have existed for longer; I'll double-check the record cleanup logic on snapshot failure you described above.

Doing lxc delete <instance>/<snapshot> for the snapshots with missing volume db records should fix it and bring things back in line, if losing those snapshots is acceptable.

Don’t just delete the problem db records otherwise you’ll leave the actual snapshots orphaned on disk.

Alternatively we will have to craft a custom insert statement to restore the missing volume db record.
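
If you want to work out which snapshots are missing their volume records before deleting anything, something along these lines should show it. This is only a rough sketch: the table and column names below are assumptions based on recent LXD schemas (snapshot volume records may live in storage_volumes_snapshots rather than storage_volumes), so confirm them against your own database first and adjust.

# Confirm the table/column names on your LXD version first:
lxd sql global .schema | less

# Snapshot records on the instance side (instance name, snapshot name):
lxd sql global "SELECT i.name, s.name FROM instances_snapshots s JOIN instances i ON i.id = s.instance_id ORDER BY i.name, s.name"

# Snapshot records on the storage side (volume name, volume snapshot name):
lxd sql global "SELECT v.name, vs.name FROM storage_volumes_snapshots vs JOIN storage_volumes v ON v.id = vs.storage_volume_id ORDER BY v.name, vs.name"

Anything that shows up in the first list but not the second is a snapshot whose volume record is missing, i.e. a candidate for lxc delete <instance>/<snapshot>.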

(coming from Can't start containers - Error: Instance snapshot record count doesn't match instance snapshot volume record count)

Going by creation_date, in our case all recent (>= 2021-09-09) containers start up fine, but the old ones (<= 2021-08-04) all have issues.

Where the old containers are supposed to have 8 snapshots, on one container I just checked we have 23, going all the way back to 2021-11. Other containers go back to 2021-08, etc.
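
For anyone wanting to pull the snapshot count per container quickly, a rough one-liner like this works (it assumes jq is installed; lxc query just prints the raw API response, which for the snapshots endpoint is a list of URLs):

# Count the snapshots LXD reports for each container:
for c in $(lxc list -c n --format csv); do
  echo "$c: $(lxc query "/1.0/instances/$c/snapshots" | jq length)"
done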

Deleting snapshots with lxc delete does not work either; it fails with the same exact error:

# lxc delete cont/autosnapshot-20220225-100052                            
Error: Instance snapshot record count doesn't match instance snapshot volume record count

The actual original error contains a strange path for your instance:

/var/snap/lxd/common/shmounts/storage-pools/default/containers/container-name

The shmounts part is strange and looks out of place.

Can you show the output of ‘lxc storage show default’ please?

Hrm, I'll probably have to put an LXD startup db patch in to create db records for the missing snapshot volumes, or add something to that backup generator, as I really don't want to be dealing with an inconsistent database or backup file (it kind of defeats the purpose otherwise).

It suggests at some point the snapshot operation was not creating storage volume db records in certain scenarios.

I'm not following; what do you mean here?

# btrfs subvolume list /var/snap/lxd/common/lxd | grep /container/
This shows many more snapshots than lxc info does.
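
Something like this should line the two lists up for a single container (rough sketch: "cont" is a placeholder name, and the containers-snapshots path component is what I'd expect from the btrfs driver under a snap install, so adjust it to whatever the subvolume list actually prints):

# On-disk snapshot subvolume names for one container:
btrfs subvolume list /var/snap/lxd/common/lxd | awk '{print $NF}' | grep 'containers-snapshots/cont/' | sed 's|.*/||' | sort > /tmp/on-disk

# Snapshot names LXD itself reports via the API:
lxc query "/1.0/instances/cont/snapshots" | jq -r '.[]' | sed 's|.*/||' | sort > /tmp/in-lxd

# Present on disk but unknown to LXD:
comm -23 /tmp/on-disk /tmp/in-lxd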

So this issue isn't related to any orphaned on-disk snapshots; it's only concerned with the difference between the instances_snapshots and storage_volumes tables.
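
If you just want to see the counts that check is comparing, per instance, queries along these lines should do it (again only a sketch; the table names are assumptions from recent schemas, where snapshot volume records sit in storage_volumes_snapshots, so verify with lxd sql global .schema):

# Snapshot record count per instance:
lxd sql global "SELECT i.name, COUNT(*) FROM instances_snapshots s JOIN instances i ON i.id = s.instance_id GROUP BY i.name"

# Snapshot volume record count per volume:
lxd sql global "SELECT v.name, COUNT(*) FROM storage_volumes_snapshots vs JOIN storage_volumes v ON v.id = vs.storage_volume_id GROUP BY v.name"

Instances where the two counts differ are the ones that trip the new check.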

See Lxc snapshot and lxc start Error: Instance snapshot record count doesn't match instance snapshot volume record count - #2 by robe2

Indeed, got it working even on 4.2, thanks a lot everyone!


lxc storage show default

is quite long. FWIW, for the ones I did check that have this issue, the counts were off, and they are fairly old containers I've had for a while that are set to daily snapshots with 30-day retention.

Done - Error: Instance snapshot record count doesn't match instance snapshot volume record count after upgrade to lxd 5.2 · Issue #10501 · lxc/lxd · GitHub


Yes let’s see it please

Much obliged thank you

This worked, but it shut down all the containers and started them back up in the process. It also took like 10 minutes or more.

So definitely not something you want to do if you can’t have any downtime.

Sadly, I'm not sure it would be of any value now that I've downgraded to 5.1. Let me know if it's still useful.

I just want to see what the source property is, as it shouldn't be trying to use /var/snap/lxd/common/shmounts for your storage pool mount; that could indicate a problem with the snap package.

osgeo7 is the name of the zpool

zpool list

NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
osgeo7  14.5T  2.68T  11.8T        -         -    27%    18%  1.00x    ONLINE  -

 lxc storage show default
config:
  source: osgeo7
  volatile.initial_source: osgeo7
  zfs.pool_name: osgeo7
description: ""
name: default
driver: zfs
used_by: