"ZFS doesn't support restoring from snapshots other than the latest one"

Within the ZFS section of https://lxd.readthedocs.io/en/stable-4.0/storage/ we have the line:

“ZFS doesn’t support restoring from snapshots other than the latest one”

Is this a current LXD limitation? It's not usually a problem for ZFS: under normal circumstances you can restore any previous ZFS snapshot, but you will lose all of the snapshots in between your rollback target and the latest snapshot when you run the rollback command.

That feels like some out-of-date docs. By default LXD will prevent that as a safety feature against unexpected deletion, since other storage drivers don't have that behaviour. :slight_smile:

You can override it by setting this on the storage pool:

lxc storage set <zfs pool> volume.zfs.remove_snapshots=true
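The effect of that flag can be sketched with a small Python model (function name and error text are illustrative, not LXD's actual implementation):

```python
def lxd_restore(snapshots, target, remove_snapshots=False):
    """Model of LXD's restore gate: by default it refuses to restore a
    ZFS snapshot when newer snapshots exist (ZFS would have to destroy
    them); volume.zfs.remove_snapshots=true lets LXD remove them."""
    idx = snapshots.index(target)          # raises ValueError if missing
    newer = snapshots[idx + 1:]
    if newer and not remove_snapshots:
        raise RuntimeError(
            "cannot restore %r: newer snapshots exist (%s); "
            "set volume.zfs.remove_snapshots=true"
            % (target, ", ".join(newer)))
    # With the flag set, the newer snapshots are removed as part of the
    # restore; only the target and anything older survive.
    return snapshots[:idx + 1]
```

So restoring the latest snapshot always works, while restoring an earlier one either fails or discards everything newer, depending on the flag.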

See

volume.zfs.remove_snapshots (bool, zfs driver, default false, storage scope) - Remove snapshots as needed

From https://linuxcontainers.org/lxd/docs/stable-4.0/storage

BTW, the current URL for docs is:

https://linuxcontainers.org/lxd/docs/stable-4.0/

The one you're using still points at the same content, but may not be there (or may not work) in the future.

@stgraber should we remove that statement?

Well, the doc is correct, ZFS does not let you do it.

root@castiana:~# zfs create castiana/blah
root@castiana:~# zfs snapshot castiana/blah@snap1
root@castiana:~# zfs snapshot castiana/blah@snap2
root@castiana:~# zfs snapshot castiana/blah@snap3
root@castiana:~# zfs rollback castiana/blah@snap1
cannot rollback to 'castiana/blah@snap1': more recent snapshots or bookmarks exist
use '-r' to force deletion of the following snapshots and bookmarks:
castiana/blah@snap3
castiana/blah@snap2

So you can either manually delete the more recent snapshots or set the config flag so LXD does it for you. Things get very problematic if you have copied the container into a new one at some point, as that will also create a snapshot, but one which cannot be deleted without also destroying descendant containers.


I have to disagree; I think the current text is misleading. As your console output correctly shows, you can either use -r on the command line or change the LXD flag that tomp mentioned, and then it is possible to roll back to earlier snapshots.

This is an extremely important feature for me, and this would've put me off using LXD if I had not known better.

Perhaps we could add a clarification that LXD can remove the intermediate snapshots automatically, with a reference to that option.


Yes, that would be better.

Thanks

Yeah, I’m trying to add something for that, but we have to be careful, as I mentioned, if the container was ever copied, this isn’t an option anymore because we end up with snapshots that cannot be safely deleted anymore.


Yeah, equally we don't want to give the expectation that one can always restore a snapshot if that's not the case (whether because it requires deleting intermediates, or because it's prevented entirely by the copy scenario you mentioned).

Well, I am a little bit confused, as on my test machine I am running LXD with ZFS and I CAN restore snapshot 2 of 4 without any problems, and I did not set any custom settings.

Previously I reported that snapshots continue to increase in size long after the API reports the process as successful. Just now, to speed things up, I took a snapshot of an Ubuntu container running Apache while the container was stopped; the snapshot process only seemed to start (the initial 88 kB record was created) once I started the container back up.

Last night, after reading this thread, I made a copy of the ZFS container with snapshots, then tried to restore a snapshot on the original, and started getting errors because ZFS was reporting that stuff was being done on the COPY.

So I think LXD should report the process as not complete, rather than complete, to prevent these errors.

You'd need to provide full reproducer steps (on a fresh container, not an existing one) to show how you are able to restore an intermediate snapshot, as this is a ZFS limitation, not an LXD one. To see what is going on we'd need each command, step by step, for a freshly created container.

You didn't say which operation was failing with errors after the copy, but as @stgraber mentioned earlier, copying a container will create a snapshot of the original source volume for the copy, and so will prevent safe deletion:

Also, as I explained in the past, the snapshot process is never really complete: it continues to track differences between the state of the volume when the snapshot was taken and changes that occur after that (which is why the size of the snapshot changes). This is ZFS behaviour, not LXD behaviour.

You may be more comfortable using the ZFS storage pool option zfs.clone_copy=rebase. This causes instance copies to be based on the initial image snapshot rather than a snapshot of the source, so it should help decouple copies of instances from each other. You can also use zfs.clone_copy=false, which will perform a full copy of the dataset.

See https://linuxcontainers.org/lxd/docs/master/storage

Please can you provide examples of the errors you are seeing, with reproducer steps; otherwise it's difficult to understand what you're seeing.

Silly me, I just realised that last year I added custom code to my restore process which removes subsequent snapshots before restoring, to prevent this kind of issue when using ZFS, as at the time volume.zfs.remove_snapshots=true was not supported. It might be interesting to also add version limitations to the docs when this is written.
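That workaround (pre-deleting every snapshot newer than the restore target, for LXD versions without volume.zfs.remove_snapshots) can be sketched like this. The function and operation names are illustrative only, not an LXD API:

```python
def restore_plan(snapshots, target):
    """Build the list of operations needed to restore an intermediate
    snapshot: delete every snapshot newer than the target (newest first,
    matching ZFS's requirement), then restore the target itself."""
    idx = snapshots.index(target)
    plan = [("delete", snap) for snap in reversed(snapshots[idx + 1:])]
    plan.append(("restore", target))
    return plan
```

Each ("delete", ...) step would correspond to deleting one container snapshot via the API or CLI before issuing the actual restore.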

As for the error on restoring a snapshot on the ORIGINAL container after running a copy:

Snapshot "container-test-20210203-02" cannot be restored due to subsequent internal snapshot(s) (from a copy)

This is what I sent to the API; as you can see it does not copy snapshots, but now I can't restore snapshots on the original container due to the error above. This seems to be a problem; any way round this?

{
    "name": "container-clone",
    "architecture": "aarch64",
    "type": "container",
    "profiles": [
        "custom-default",
        "custom-nat"
    ],
    "config": {
        "image.architecture": "arm64",
        "image.description": "Ubuntu focal arm64 (20210120_07:42)",
        "image.os": "Ubuntu",
        "image.release": "focal",
        "image.serial": "20210120_07:42",
        "image.type": "squashfs",
        "limits.cpu": "1",
        "limits.memory": "1GB",
        "volatile.base_image": "766788f3eb910d209469ccb48109d3236d1bf60897bb2bf52e5d14e12a5a2a3d"
    },
    "source": {
        "type": "copy",
        "certificate": null,
        "base-image": "766788f3eb910d209469ccb48109d3236d1bf60897bb2bf52e5d14e12a5a2a3d",
        "source": "container-test",
        "live": false,
        "instance_only": true
    },
    "devices": {
        "root": {
            "path": "/",
            "pool": "default",
            "size": "5GB",
            "type": "disk"
        }
    },
    "ephemeral": false,
    "stateful": false
} 

I'd expect you're hitting a problem because your container copy (even without copying snapshots) is still dependent on the source snapshots, because they form a hierarchical tree of changes (i.e. source image snapshot -> oldest source snapshot -> newer source snapshot -> source instance volume -> copy of source instance volume).

Using zfs.clone_copy=rebase should help here (though a new copy will need to be made), as it will rebase the copied instance volumes on top of the original source image rather than the source instance (but this will use more storage space).

So it would become: source image snapshot -> copy of source instance volume.

But all of the differences between the source image snapshot and the source instance volume will have been copied into the copy of the source volume (taking up more space).
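A rough model of which datasets a copy stays dependent on under each zfs.clone_copy setting, as described above (a simplified sketch with illustrative dataset names; real ZFS clone trees have more nodes):

```python
def copy_dependencies(clone_copy, source_snapshots):
    """Return the datasets a new instance copy depends on.
    clone_copy: "default" (clone of a fresh source snapshot),
    "rebase" (clone of the original image snapshot), or
    "false" (full, independent copy of the dataset)."""
    if clone_copy == "false":
        return []                      # full copy: independent of everything
    if clone_copy == "rebase":
        return ["image@snapshot"]      # tied only to the image, not the source
    # default: a clone of a snapshot taken on the source instance, so the
    # copy keeps the source volume and its snapshot chain pinned
    return ["image@snapshot"] + source_snapshots + ["source@copy-snapshot"]
```

This is why, with the default mode, restoring old snapshots on the original container can fail after a copy: the copy pins them in place.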

I ran lxc storage set default zfs.clone_copy false, and I can confirm that if I copy a container, I can now restore snapshots on the original container. Thanks.

Question: if you create a snapshot of a container that is not running, is it correct that until you boot it up, there is no real snapshot on ZFS other than a record?

Great. You may also use zfs.clone_copy=rebase as a somewhat more storage efficient approach than zfs.clone_copy=false that should still allow you to restore the original snapshots, but without having to fully copy the source image volume on each instance copy.

Well, technically ZFS will report that a snapshot exists, but as there has been no opportunity for writes to the original volume, it should theoretically be zero in size. There may be some ZFS internal usage, though, that means a small amount of usage is reported.

Does the rebase mode somewhat rely upon the original container? I mean, if I want to use the clone as a backup, I presume false is better?

My understanding is that it will create a new volume from the source instance volume, but use the source image volume as the basis of the new volume (i.e. only the differences between the source instance volume and the source image volume will be copied to the new volume). After that, the new volume isn't related to the source instance volume.

So it is still kind of connected to the original container then, which sounds like it could be more of a headache down the line?

No not connected to the original container. Only the source image.

It's just that at copy time the changes made to the source instance, compared to the image, are duplicated into the new volume. After that they are not related.

Let's say I create an Ubuntu container and then install Apache.
Rebase will create a new container using the Ubuntu image, and then add the Apache difference?
If I set it to false, a new volume is created with Ubuntu+Apache combined, which is different to the actual process of manually creating it.

Is my understanding correct?
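If that understanding holds, the approximate extra space written at copy time under each mode can be sketched as follows (the mode names mirror the thread; the accounting is a deliberately crude assumption, ignoring compression and ZFS metadata):

```python
def copy_space_cost(clone_copy, image_size, instance_delta):
    """Approximate new space consumed when copying an instance.
    image_size: size of the base image dataset (e.g. the Ubuntu image).
    instance_delta: changes made since the image (e.g. the Apache install)."""
    if clone_copy == "default":
        return 0                            # clone: shares everything initially
    if clone_copy == "rebase":
        return instance_delta               # only the diff vs the image is written
    if clone_copy == "false":
        return image_size + instance_delta  # full, fully independent copy
    raise ValueError("unknown mode: %r" % clone_copy)
```

So in the Ubuntu+Apache example, rebase would write roughly the Apache-sized delta, while false would write the whole Ubuntu base plus the delta.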