Can't create a snapshot with the same name after deleting it

How to reproduce:
Create a snapshot with a new name:

$ lxc snapshot ns backup
$ lxc info ns
Name: ns
Location: my
Remote: unix://
Architecture: x86_64
Created: 2019/08/20 05:29 UTC
Status: Stopped
Type: container
Profiles: default
Snapshots:
  backup (taken at 2020/04/12 21:48 UTC) (stateless)

Delete it and try to create it again with the same name:

$ lxc delete ns/backup
$ lxc snapshot ns backup
Error: Create instance: Insert volume snapshot: UNIQUE constraint failed: storage_volumes_snapshots.storage_volume_id, storage_volumes_snapshots.name

P.S. I see that after the delete command, the row was not deleted from the “storage_volumes_snapshots” table:

# lxd sql global "SELECT * FROM storage_volumes_snapshots;"
+-----+-------------------+--------+-------------+---------------------------+
| id  | storage_volume_id |  name  | description |        expiry_date        |
+-----+-------------------+--------+-------------+---------------------------+
| 187 | 7                 | backup |             | 0001-01-01T02:02:04+02:02 |
| 199 | 21                | backup |             | 0001-01-01T02:02:04+02:02 |
+-----+-------------------+--------+-------------+---------------------------+

# lxd sql global "SELECT * FROM storage_volumes;" | grep 21
| 21  | ns                                                               | 1               | 3       | 0    |             | 1          |

LXD version:

# snap list
Name  Version    Rev    Tracking       Publisher   Notes
core  16-2.44.1  8935   latest/stable  canonical✓  core
lxd   4.0.0      14503  latest/stable  canonical✓  -
stgraber@castiana:~$ lxc snapshot b1 blah
stgraber@castiana:~$ lxc delete b1/blah
stgraber@castiana:~$ lxc snapshot b1 blah
stgraber@castiana:~$ 

What storage backend are you on and are you using projects?

btrfs

$ lxc storage list
+-------+-------------+--------+---------+---------+
| NAME  | DESCRIPTION | DRIVER |  STATE  | USED BY |
+-------+-------------+--------+---------+---------+
| local |             | btrfs  | CREATED | 12      |
+-------+-------------+--------+---------+---------+

This started happening after upgrading the cluster from 3.22 to 4.0.0.

Does that only happen with pre-upgrade snapshots or only with newly created ones?

It seems this happens after removing a pre-upgrade snapshot: the first re-create works without issues, but after removing it again and creating it once more, I see this error.

Steps to reproduce:
Check “storage_volumes_snapshots”

# lxd sql global "SELECT * FROM storage_volumes_snapshots;"
+-----+-------------------+--------+-------------+---------------------------+
| id  | storage_volume_id |  name  | description |        expiry_date        |
+-----+-------------------+--------+-------------+---------------------------+
| 187 | 7                 | backup |             | 0001-01-01T02:02:04+02:02 |
| 199 | 21                | backup |             | 0001-01-01T02:02:04+02:02 |
| 200 | 5                 | backup |             | 0001-01-01T02:02:04+02:02 |
| 201 | 33                | backup |             | 0001-01-01T02:02:04+02:02 |
+-----+-------------------+--------+-------------+---------------------------+

Note: it’s very strange that I don’t see the snapshots that were created under 3.22 here (about 10 containers).

Delete and re-create a snapshot for a container that was created under 3.22:

$ lxc delete ns01/backup
$ lxc snapshot ns01 backup

It’s added to “storage_volumes_snapshots” (storage_volume_id=6):

# lxd sql global "SELECT * FROM storage_volumes_snapshots;"
+-----+-------------------+--------+-------------+---------------------------+
| id  | storage_volume_id |  name  | description |        expiry_date        |
+-----+-------------------+--------+-------------+---------------------------+
| 187 | 7                 | backup |             | 0001-01-01T02:02:04+02:02 |
| 199 | 21                | backup |             | 0001-01-01T02:02:04+02:02 |
| 200 | 5                 | backup |             | 0001-01-01T02:02:04+02:02 |
| 201 | 33                | backup |             | 0001-01-01T02:02:04+02:02 |
| 202 | 6                 | backup |             | 0001-01-01T02:02:04+02:02 |
+-----+-------------------+--------+-------------+---------------------------+

Trying to delete it and create it again:

$ lxc delete ns01/backup
$ lxc snapshot ns01 backup
Error: Create instance: Insert volume snapshot: UNIQUE constraint failed: storage_volumes_snapshots.storage_volume_id, storage_volumes_snapshots.name

And the row is not deleted from the “storage_volumes_snapshots” table.
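A join of the two tables makes it easier to see which container each leftover row belongs to. Here is a sketch of such a query, exercised against an in-memory sqlite3 mock of the two tables (columns reduced to the ones used, data taken from the output above); against the live database you would run the same SELECT via lxd sql global:

```python
import sqlite3

# Minimal mock of the 4.0 schema shown in the thread (reduced columns).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE storage_volumes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE storage_volumes_snapshots (
  id INTEGER PRIMARY KEY, storage_volume_id INTEGER, name TEXT);
INSERT INTO storage_volumes VALUES (21, 'ns');
INSERT INTO storage_volumes_snapshots VALUES (199, 21, 'backup');
""")

# Show each snapshot row together with its parent volume's name.
rows = con.execute("""
SELECT s.id, v.name || '/' || s.name
FROM storage_volumes_snapshots s
JOIN storage_volumes v ON v.id = s.storage_volume_id
""").fetchall()
print(rows)  # [(199, 'ns/backup')]
```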

Is that using the snap or some other 4.0.0 package?

We fixed a few issues related to this since 4.0 so I want to make sure that you do have those fixes before we spend much more time on this.

Yes, this is snap rev=14503.
But I also checked a non-cluster installation (arm64, revision), no issue… hm… now I’m not sure that this started with the 3.22 upgrade, maybe it happened earlier…

Oh, on the 2nd cluster I see the same issue; it was also upgraded from 3.22.

Hmm, I don’t see why clusters would be more impacted than standalone.
Were the clusters upgraded before the standalone system?

@freeekanayaka

No, the clusters were deployed as clusters from the initial setup.

I see the same issue after updating from 3.0/stable via snap refresh lxd --channel=4.0/stable some days ago. I do not use a cluster setup.

It seems to me that the database table “storage_volumes_snapshots” was introduced in 4.0 and the migration of snapshots from the “storage_volumes” table did not happen properly.

There is a nightly script that removes an old container snapshot named “backup” and creates a new one with the same name; this script now fails with:

Error: Failed to fetch snapshot "backup" of instance "iobroker" in project "default": No such object
Error: Failed creating instance snapshot record "iobroker/backup": Failed initialising instance: Failed creating storage record for snapshot: Insert volume snapshot: UNIQUE constraint failed: storage_volumes_snapshots.storage_volume_id, storage_volumes_snapshots.name

In the following tables I see

  • snapshots that were not migrated to storage_volumes_snapshots, e.g. iobroker/after-zwave-usb
  • snapshots with the same name in both the old and the new persistence, e.g. id 570 (iobroker/backup) in storage_volumes and id 575 (storage_volume_id 7 = iobroker, name backup) in storage_volumes_snapshots

# lxd sql global "SELECT * FROM storage_volumes;"
+-----+---------------------------------+-----------------+---------+------+-------------+------------+
| id  |              name               | storage_pool_id | node_id | type | description | project_id |
+-----+---------------------------------+-----------------+---------+------+-------------+------------+
| 7   | iobroker                        | 1               | 1       | 0    |             | 1          |
| 11  | iobroker/pre-iobroker-upgrade   | 1               | 1       | 0    |             | 1          |
| 13  | iobroker/after-zwave-usb        | 1               | 1       | 0    |             | 1          |
| 570 | iobroker/backup                 | 1               | 1       | 0    |             | 1          |
...

# lxd sql global "SELECT * FROM storage_volumes_snapshots;"
+-----+-------------------+------------------------------+-------------+---------------------------+
| id  | storage_volume_id |             name             | description |        expiry_date        |
+-----+-------------------+------------------------------+-------------+---------------------------+
| 575 | 7                 | backup                       |             | 0001-01-01T00:53:28+00:53 |
...

What’s the right approach to manually migrate/fix this situation?

First you’d need to check what snapshots actually exist on disk to get an idea of just how wrong those tables are.

The NAME/SNAPSHOT entries in storage_volumes will need to go away, but you should also check that there’s nothing missing in the storage_volumes_snapshots table.

Thanks for your advice, I migrated the database records accordingly and snapshots can be used again now.

Just for the record:

With a zfs storage backend, use

# zfs list -t snapshot

to take a look at existing snapshots.

Use

# lxd sql global "SELECT * FROM storage_volumes;"
# lxd sql global "SELECT * FROM storage_volumes_snapshots;"

to see existing snapshots that may need to be migrated from storage_volumes to storage_volumes_snapshots.
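To pick out just the legacy snapshot rows still sitting in storage_volumes, you can filter on names containing a slash (container/snapshot). A sketch of that filter, sanity-checked against an in-memory sqlite3 mock populated with the ids from the tables above (reduced columns; on the live system run the SELECT via lxd sql global):

```python
import sqlite3

# Mock of storage_volumes with a plain volume and two legacy snapshot rows.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE storage_volumes (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO storage_volumes VALUES
  (7,   'iobroker'),
  (13,  'iobroker/after-zwave-usb'),
  (570, 'iobroker/backup');
""")

# Legacy snapshot rows are the ones whose name contains a '/'.
legacy = con.execute(
    "SELECT id, name FROM storage_volumes WHERE name LIKE '%/%' ORDER BY id"
).fetchall()
print(legacy)  # [(13, 'iobroker/after-zwave-usb'), (570, 'iobroker/backup')]
```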

Use e.g.

# lxd sql global "INSERT INTO storage_volumes_snapshots VALUES(596, 7, 'after-zwave-usb', '', '0001-01-01T00:53:28+00:53');"
# lxd sql global "DELETE FROM storage_volumes WHERE id IN (...);"
# lxd sql global "DELETE FROM storage_volumes_snapshots WHERE id IN (...);"

to clean up the tables.

Use

# lxd sql global "UPDATE SQLITE_SEQUENCE SET SEQ=599 WHERE NAME='storage_volumes';"

to set the auto increment key sequence to a new value.
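The new SEQ value should be at least the highest id currently in use (check with SELECT MAX(id) first); any value at or above that works, and the next auto-assigned id will be SEQ+1. A small sketch of that behavior on an in-memory sqlite3 AUTOINCREMENT table (hypothetical data, ids borrowed from the thread):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE storage_volumes (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
# Pretend the table already holds a row with a high, manually assigned id.
con.execute(
    "INSERT INTO storage_volumes (id, name) VALUES (596, 'iobroker/after-zwave-usb')"
)
# Bump the sequence so future inserts start above any manually inserted ids.
con.execute("UPDATE sqlite_sequence SET seq = 599 WHERE name = 'storage_volumes'")
# The next insert without an explicit id picks up seq+1.
cur = con.execute("INSERT INTO storage_volumes (name) VALUES ('demo')")
print(cur.lastrowid)  # 600
```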

The following command creates a list of SQL statements for the database migration of snapshots from LXD v3.x to v4.x:

lxd sql global "SELECT
  'DELETE FROM storage_volumes WHERE id IN (' || id || ')' AS delstat,
  'INSERT INTO storage_volumes_snapshots VALUES(' || id || ',' || (SELECT id FROM storage_volumes WHERE name = contname) || ',''' || contsnapshot || ''', '''', ''0001-01-01T00:53:28+00:53'')' AS insstat
FROM (SELECT id,
             substr(name, 0, instr(name, '/')) AS contname,
             substr(name, instr(name, '/') + 1) AS contsnapshot
      FROM storage_volumes)
WHERE contname != '';"
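Before running the generated statements against the real database, the generator can be sanity-checked against an in-memory sqlite3 copy of the schema. This sketch uses the same SELECT (columns reduced to id and name; the ids 65 and 836 are the ones from the sample below) and prints the statements it would produce:

```python
import sqlite3

# Mock of storage_volumes with one parent volume and one legacy snapshot row.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE storage_volumes (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO storage_volumes VALUES (65, 'iobroker'), (836, 'iobroker/backup');
""")

# Same generator query: split 'container/snapshot' names, look up the parent
# volume's id, and emit the DELETE/INSERT statement pair for each legacy row.
rows = con.execute("""
SELECT 'DELETE FROM storage_volumes WHERE id IN (' || id || ')' AS delstat,
       'INSERT INTO storage_volumes_snapshots VALUES(' || id || ',' ||
         (SELECT id FROM storage_volumes WHERE name = contname) || ',''' ||
         contsnapshot || ''', '''', ''0001-01-01T00:53:28+00:53'')' AS insstat
FROM (SELECT id,
             substr(name, 0, instr(name, '/')) AS contname,
             substr(name, instr(name, '/') + 1) AS contsnapshot
      FROM storage_volumes)
WHERE contname != ''
""").fetchall()
for delstat, insstat in rows:
    print(delstat)
    print(insstat)
```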

The printed statements need to be executed after the LXD update (e.g. via snap refresh lxd --channel=4.0/stable). Sample:

lxd sql global "DELETE FROM storage_volumes WHERE id IN (836); INSERT INTO storage_volumes_snapshots VALUES(836,65,'backup', '', '0001-01-01T00:53:28+00:53');"