ZFS Backup Restore Error: Cannot write: No space left on device

I am currently testing backup and restore with large amounts of data, and ran into an issue on ZFS; surprisingly, BTRFS did not give me the same problem.

Steps:

  1. Create an Ubuntu instance.
  2. Send a PATCH request so that a 25GB limit is set on the root disk.
  3. Create 20GB of fake data:
$ head -c 20GB /dev/zero > data.iso
  4. Create a snapshot; LXD reports this as 6.83 MB.
  5. Create a backup (non-optimised, container + snapshots).

When trying to restore the backup:

Create instance from backup: Error starting unpack: Failed to run: tar -zxf - --xattrs-include=* -C /var/snap/lxd/common/lxd/storage-pools/default/containers/backup-test --strip-components=2 backup/container: tar: rootfs/root/data.iso: Cannot write: No space left on device
tar: rootfs/root/data.iso: Cannot utime: No space left on device
...

So I can’t restore a backup even though the restored size is less than the quota limit.

Hi,

Please can you show reproducer commands? Thanks.

Also, have you tried increasing the volume.size limit on the storage pool you’re importing into? It defaults to 10GB if not set for block volumes.
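
For example, something along these lines would check the current default and raise it (a sketch, assuming the pool is called default):

lxc storage get default volume.size
lxc storage set default volume.size 30GB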

See Specify profile when restoring backup

I tried the same test on a new instance but with the quota twice as much as the fake data, and it worked. So it was 15GB of data against a 30GB quota.

I did it through the API; tomorrow I will work out the command line equivalent and post it here.

I'm not really understanding what you mean or what isn't working, so a full reproducer would be good. Thanks.

I tried 15GB of fake data and it was fine, provided I made the quota much larger.

Spoiler: if you don't create a snapshot in this test, then the problem does not occur. As I am not deleting the ISO after the backup or snapshot, the snapshot remains at 3.8MB (if I delete the ISO, the snapshot will grow to around 20GB).

I send a POST request to /instances (I have included the profile info at the bottom, but it's not really relevant):

{
    "profiles": [
        "custom-default",
        "custom-nat"
    ],
    "config": {
        "limits.memory": "1GB",
        "limits.cpu": "1"
    },
    "name": "ubuntu-test",
    "source": {
        "type": "image",
        "fingerprint": "cab177ff192c5fcf8342f1433a8a4f2baaf796085598d7650999c23d5846a33b"
    }
} 
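
For reference, a request like that can be sent over LXD's local unix socket with curl, roughly as follows (the socket path assumes the snap install, and create-ubuntu-test.json is just a placeholder for the JSON above):

curl --unix-socket /var/snap/lxd/common/lxd/unix.socket \
  -X POST -H "Content-Type: application/json" \
  -d @create-ubuntu-test.json \
  lxd/1.0/instances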

Then I send a PATCH request with the following changes to set the size limit on the root disk:

 "devices": {
        "root": {
            "path": "/",
            "pool": "default",
            "type": "disk",
            "size": "25GB"
        }
    },
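
The command line equivalent of this PATCH should be roughly:

lxc config device override ubuntu-test root size=25GB

(or lxc config device set if the root device is already defined locally on the instance rather than inherited from the profile).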

Inside the container I run head -c 20GB /dev/zero > data.iso

Now let's check ZFS:

$ sudo zfs list
NAME                                                                              USED  AVAIL     REFER  MOUNTPOINT
lxdpool                                                                          26.6G  24.7G       96K  none
lxdpool/containers/ubuntu-test                                                   19.1G  4.20G     19.1G  /var/snap/lxd/common/lxd/storage-pools/default/containers/ubuntu-test

To create a snapshot, I send a POST request to /instances/ubuntu-test/snapshots:

{
    "stateful": false,
    "name": "ubuntu-test-20210218-01"
} 
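
The CLI equivalent is simply:

lxc snapshot ubuntu-test ubuntu-test-20210218-01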

Give ZFS a minute or two from the initial snapshot creation: initially the snapshot shows as roughly 86K, and after about a minute it settles at about 3.8MB.
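
You can watch the snapshot's on-disk size directly with:

sudo zfs list -t snapshot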

To create a backup, I send a POST request to /instances/ubuntu-test/backups:

{
    "name": "ubuntu-test-20210218",
    "expiry": null,
    "instance_only": false,
    "optimized_storage": false
}
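
A curl sketch of that step (backup-request.json being a placeholder for the JSON above); note the response is a background operation that needs to finish before the backup can be exported:

curl --unix-socket /var/snap/lxd/common/lxd/unix.socket \
  -X POST -H "Content-Type: application/json" \
  -d @backup-request.json \
  lxd/1.0/instances/ubuntu-test/backups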

Restoring the backup

I rename the existing instance by sending a POST request to /instances/ubuntu-test:

{
    "name": "ubuntu-test-backup"
} 

I then export the backup by sending a GET request to /instances/ubuntu-test-backup/backups/ubuntu-test-20210218/export.

I then POST the tarball back to /instances and wait for the operation to complete, which fails:

{
    "type": "sync",
    "status": "Success",
    "status_code": 200,
    "operation": "",
    "error_code": 0,
    "error": "",
    "metadata": {
        "id": "0d8e46e5-8013-4919-bbc9-37a685caab6d",
        "class": "task",
        "description": "Restoring backup",
        "created_at": "2021-02-18T09:48:12.824031955Z",
        "updated_at": "2021-02-18T09:48:12.824031955Z",
        "status": "Failure",
        "status_code": 400,
        "resources": {
            "containers": [
                "/1.0/containers/ubuntu-test"
            ],
            "instances": [
                "/1.0/instances/ubuntu-test"
            ]
        },
        "metadata": null,
        "may_cancel": false,
        "err": "Create instance from backup: Error starting unpack: Failed to run: tar -zxf - --xattrs-include=* -C /var/snap/lxd/common/lxd/storage-pools/default/containers/ubuntu-test --strip-components=2 backup/container: tar: rootfs/root/data.iso: Cannot write: No space left on device\ntar: rootfs/root/data.iso: Cannot utime: No space left on device\n\ntar: Exiting with failure status due to previous errors",
        "location": "none"
    }
}
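
For reference, those last two steps over the raw API correspond roughly to the following curl calls (the local file name is a placeholder; the import POSTs the tarball itself rather than JSON):

# download the backup tarball
curl --unix-socket /var/snap/lxd/common/lxd/unix.socket \
  -o ubuntu-test-20210218.tar.gz \
  lxd/1.0/instances/ubuntu-test-backup/backups/ubuntu-test-20210218/export

# re-import it as a new instance
curl --unix-socket /var/snap/lxd/common/lxd/unix.socket \
  -X POST -H "Content-Type: application/octet-stream" \
  --data-binary @ubuntu-test-20210218.tar.gz \
  lxd/1.0/instances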

My profiles are:

$ lxc profile show custom-default
config: {}
description: ""
devices:
  root:
    path: /
    pool: default
    type: disk
name: custom-default
used_by:
- /1.0/instances/mysql
- /1.0/instances/redis
- /1.0/instances/postgres
- /1.0/instances/mariadb
- /1.0/instances/ubuntu-test
$ lxc profile show custom-nat
config: {}
description: Custom NAT Network Profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: custombr0
    type: nic
name: custom-nat
used_by:
- /1.0/instances/mysql
- /1.0/instances/redis
- /1.0/instances/postgres
- /1.0/instances/mariadb
- /1.0/instances/ubuntu-test

I had to truncate the err field because Discourse would not let me submit the post; if you need me to send it by email, I can.

I just tried a simplified version of this to see if I could get a baseline reproducer and have been unsuccessful in doing so. Can you repeat these steps and see if you get the same problem, please:

lxc storage create zfs zfs size=40GB
lxc launch images:ubuntu/focal c1 -s zfs
lxc config device set c1 root size=25GB
sudo zfs get quota zfs/containers/c1
NAME               PROPERTY  VALUE  SOURCE
zfs/containers/c1  quota     23.3G  local

lxc shell c1
 head -c 20GB /dev/zero > data.iso
 exit

lxc snapshot c1
lxc export c1 /home/user/c1.tar.gz
lxc stop c1
lxc rename c1 c1orig
lxc import /home/user/c1.tar.gz

sudo zfs get quota zfs/containers/c1
NAME               PROPERTY  VALUE  SOURCE
zfs/containers/c1  quota     23.3G  local

sudo zfs get quota zfs/containers/c1orig
NAME                   PROPERTY  VALUE  SOURCE
zfs/containers/c1orig  quota     23.3G  local

sudo zfs list
NAME                                                                          USED  AVAIL     REFER  MOUNTPOINT
zfs                                                                           630M  35.2G       24K  none
zfs/containers                                                                420M  35.2G       24K  none
zfs/containers/c1                                                             417M  22.9G      208M  /var/snap/lxd/common/lxd/storage-pools/zfs/containers/c1
zfs/containers/c1orig                                                        3.37M  23.3G      208M  /var/snap/lxd/common/lxd/storage-pools/zfs/containers/c1orig
zfs/custom                                                                     24K  35.2G       24K  none
zfs/deleted                                                                   120K  35.2G       24K  none
zfs/deleted/containers                                                         24K  35.2G       24K  none
zfs/deleted/custom                                                             24K  35.2G       24K  none
zfs/deleted/images                                                             24K  35.2G       24K  none
zfs/deleted/virtual-machines                                                   24K  35.2G       24K  none
zfs/images                                                                    208M  35.2G       24K  none
zfs/images/bf4990388b3470f956a4db521c01f3b928a68cf46eb220c8f02dcbc8d9d586ad   208M  35.2G      208M  /var/snap/lxd/common/lxd/storage-pools/zfs/images/bf4990388b3470f956a4db521c01f3b928a68cf46eb220c8f02dcbc8d9d586ad
zfs/virtual-machines                                                           24K  35.2G       24K  none

sudo zfs list -t snapshot
NAME                                                                                   USED  AVAIL     REFER  MOUNTPOINT
zfs/containers/c1@snapshot-snap0                                                       208M      -      208M  -
zfs/containers/c1orig@snapshot-snap0                                                   338K      -      208M  -
zfs/images/bf4990388b3470f956a4db521c01f3b928a68cf46eb220c8f02dcbc8d9d586ad@readonly     0B      -      208M  -

Also, what size is your ZFS storage pool?

53.5GB is the size.

You need to adjust your commands so that, after taking the snapshot, you create a backup of the container using the following settings, and then import that backup (not the snapshot).

{
    "name": "ubuntu-test-20210218",
    "expiry": null,
    "instance_only": false,
    "optimized_storage": false
}

Note: this problem occurs when you use a backup that has snapshots.

Ahh, I think maybe it is a disk space issue on my side, as I have a backup of the container:

NAME                 USED  AVAIL     REFER  MOUNTPOINT
lxdpool             26.6G  24.7G       96K  none
lxdpool/containers  23.5G  24.7G       96K  none

Going to test now with a smaller size.

The command lxc export c1 /home/user/c1.tar.gz creates a non-optimised backup including snapshots, exports it to a tarball, and then deletes the local backup, all in one shot. So it should be equivalent.

You can see that my backup has a snapshot in it: after it has been reimported, the snapshot shows up in the second zfs list command I posted.

You are correct, the problem was storage space on the ZFS pool. I can't believe I missed that; my brain was in testing mode. I was testing different servers and storage pools, and it just so happens that my ZFS server also has some instances on it which I use for development, so the storage usage was different.
