How to give LXD container access to (hundreds of) ZFS snapshots


(Ian! D Allen) #1

(I’m sure this question has been answered, but I can’t find the search terms to find the answer.)
I have a ZFS-backed LXD container (Ubuntu 16.04) and I’m creating rolling snapshots of that container.
How can I give the container users access to the rolling snapshots, to recover deleted/changed files?
Depending on how I set up the rolling, there could be hundreds or thousands of snapshots.


(Stéphane Graber) #2

Normally, you can’t. ZFS snapshots cannot be mounted; they can only be restored.

That said, there is a somewhat hidden feature of ZFS which makes snapshots available through a hidden dot directory. You could then pass this path into the container using a disk device.


(Ian! D Allen) #3

I’ve used the hidden .zfs directory in the host. It uses the automounter to mount the snapshots that I access, unmounting them automatically a few minutes later. Is this not “normal” ZFS behaviour? I want that automounting (or something like it) to happen when the snapshots are accessed from inside the container.

Below is what I get using a naive disk mount into the container and then accessing the snapshot directory from inside the container. The access does trigger an automount request on the host, but it fails.

host$ lxc config device add cls ian disk source=/var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot path=backup/snap

cls$ ls -lap /backup/snap
total 1
drwxrwxrwx 2 nobody nogroup 2 Nov 16 15:45 ./
drwxr-xr-x 3 root   root    3 Nov 17 00:21 ../
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T17:49:35/
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T19:07:11/
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-16T02:20:13/

cls$ ls -lap /backup/snap/*
/backup/snap/snapshot-2018-11-15T17\:49\:35:
ls: cannot access '/backup/snap/snapshot-2018-11-15T17:49:35/.': Object is remote
ls: cannot access '/backup/snap/snapshot-2018-11-15T17:49:35/..': Object is remote
total 0
d????????? ? ? ? ?            ? ./
d????????? ? ? ? ?            ? ../

/backup/snap/snapshot-2018-11-15T19\:07\:11:
ls: cannot access '/backup/snap/snapshot-2018-11-15T19:07:11/.': Object is remote
ls: cannot access '/backup/snap/snapshot-2018-11-15T19:07:11/..': Object is remote
total 0
d????????? ? ? ? ?            ? ./
d????????? ? ? ? ?            ? ../

/backup/snap/snapshot-2018-11-16T02\:20\:13:
ls: cannot access '/backup/snap/snapshot-2018-11-16T02:20:13/.': Object is remote
ls: cannot access '/backup/snap/snapshot-2018-11-16T02:20:13/..': Object is remote
total 0
d????????? ? ? ? ?            ? ./
d????????? ? ? ? ?            ? ../

host$ fgrep -a WARNING: /var/log/syslog  | tail -n 5
Nov 17 00:36:58 ianict kernel: [32063.114339] WARNING: Unable to automount /backup/snap/snapshot-2018-11-15T19:07:11/ianzfspool/containers/cls@snapshot-2018-11-15T19:07:11: 512
Nov 17 00:36:58 ianict kernel: [32063.119388] WARNING: Unable to automount /backup/snap/snapshot-2018-11-15T19:07:11/ianzfspool/containers/cls@snapshot-2018-11-15T19:07:11: 512
Nov 17 00:36:58 ianict kernel: [32063.124395] WARNING: Unable to automount /backup/snap/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512
Nov 17 00:36:58 ianict kernel: [32063.129498] WARNING: Unable to automount /backup/snap/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512
Nov 17 00:36:58 ianict kernel: [32063.134478] WARNING: Unable to automount /backup/snap/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512
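
The same warning is logged several times per snapshot, so it can be easier to read as a per-snapshot count. A minimal sketch, assuming the kernel message format shown above (ending in "@<snapshot>: <number>") and snapshot names without spaces; the function name summarise_automount_failures is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read syslog-style lines on stdin and print
# "<count> <snapshot-name>" for every "Unable to automount" warning.
# Assumes each warning ends in "@<snapshot>: <number>", as in the log above.
summarise_automount_failures() {
    # Keep only the text between the "@" and the trailing ": <number>".
    sed -n 's/.*WARNING: Unable to automount .*@\(.*\): [0-9][0-9]*$/\1/p' |
        sort |
        uniq -c |
        awk '{ print $1, $2 }'   # drop the padding that uniq -c adds
}

# Example use on the host (commented out so the sketch has no side effects):
#   summarise_automount_failures < /var/log/syslog
```

Anchoring on the trailing ": <number>" matters because the snapshot names themselves contain colons.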

If I access the snapshots on the host, they automount correctly:

host# ls -lpd .zfs/snapshot/*/rootfs
drwxr-xr-x 27 296608 296608 29 Nov  1 16:37 .zfs/snapshot/snapshot-2018-11-15T17:49:35/rootfs/
drwxr-xr-x 27 296608 296608 29 Nov  1 16:37 .zfs/snapshot/snapshot-2018-11-15T19:07:11/rootfs/
drwxr-xr-x 27 296608 296608 29 Nov  1 16:37 .zfs/snapshot/snapshot-2018-11-16T02:20:13/rootfs/

host# mount | grep zfs
ianzfspool/containers/cls on /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls type zfs (rw,xattr,noacl)
ianzfspool/containers/cls@snapshot-2018-11-15T19:07:11 on /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-15T19:07:11 type zfs (ro,relatime,xattr,noacl)
ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13 on /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-16T02:20:13 type zfs (ro,relatime,xattr,noacl)
ianzfspool/containers/cls@snapshot-2018-11-15T17:49:35 on /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-15T17:49:35 type zfs (ro,relatime,xattr,noacl)

and then this is what the mounted snapshots look like from the container:

cls$ ls -lap /backup/snap
total 1
drwxrwxrwx 2 nobody nogroup 2 Nov 16 15:45 ./
drwxr-xr-x 3 root   root    3 Nov 17 00:21 ../
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T17:49:35/
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T19:07:11/
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-16T02:20:13/

cls$ ls -lap /backup/snap/*
ls: cannot open directory '/backup/snap/snapshot-2018-11-15T17:49:35': Too many levels of symbolic links
ls: cannot open directory '/backup/snap/snapshot-2018-11-15T19:07:11': Too many levels of symbolic links
ls: cannot open directory '/backup/snap/snapshot-2018-11-16T02:20:13': Too many levels of symbolic links

cls$ ls -lapd /backup/snap/snapshot-2018-11-15T17:49:35
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 /backup/snap/snapshot-2018-11-15T17:49:35/

cls$ ls -lap /backup/snap/snapshot-2018-11-15T17:49:35
ls: cannot open directory '/backup/snap/snapshot-2018-11-15T17:49:35': Too many levels of symbolic links

If I remove that disk mount and mount the base ZFS file system, I get the hidden .zfs directory visible inside the container, but again any attempt to access the snapshots causes the host to try and fail to automount them:

host$ lxc config device add cls ian disk source=/var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls path=backup/snap

cls$ ls -l /backup/snap
total 8
-r--------  1 nobody nogroup 9875 Nov 17 01:05 backup.yaml
-rw-r--r--  1 nobody nogroup 1566 Oct 30 06:11 metadata.yaml
drwxr-xr-x 27 root   root      29 Nov 17 01:08 rootfs
drwxr-xr-x 22 root   root      22 Oct 30 04:29 rootfs.ORIG
drwxr-xr-x  2 nobody nogroup    8 Oct 30 06:11 templates

cls$ ls -l /backup/snap/.zfs
total 0
drwxrwxrwx 2 nobody nogroup 2 Nov 16 23:51 shares
drwxrwxrwx 2 nobody nogroup 2 Nov 16 15:45 snapshot

cls$ ls -l /backup/snap/.zfs/snapshot
total 0
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T17:49:35
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-15T19:07:11
drwxrwxrwx 1 nobody nogroup 0 Nov 16 23:51 snapshot-2018-11-16T02:20:13

cls$ ls -l /backup/snap/.zfs/snapshot/*
/backup/snap/.zfs/snapshot/snapshot-2018-11-15T17:49:35:
total 0
/backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11:
total 0
/backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13:
total 0

cls$ ls -la /backup/snap/.zfs/snapshot/*
/backup/snap/.zfs/snapshot/snapshot-2018-11-15T17:49:35:
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T17:49:35/.': Object is remote
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T17:49:35/..': Object is remote
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..

/backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11:
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11/.': Object is remote
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11/..': Object is remote
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..

/backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13:
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13/.': Object is remote
ls: cannot access '/backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13/..': Object is remote
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..

host$ fgrep -a WARNING: /var/log/syslog  | tail -n 5
Nov 17 01:22:30 ianict kernel: [34795.097976] WARNING: Unable to automount /backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11/ianzfspool/containers/cls@snapshot-2018-11-15T19:07:11: 512
Nov 17 01:22:30 ianict kernel: [34795.102790] WARNING: Unable to automount /backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11/ianzfspool/containers/cls@snapshot-2018-11-15T19:07:11: 512
Nov 17 01:22:30 ianict kernel: [34795.107690] WARNING: Unable to automount /backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512
Nov 17 01:22:30 ianict kernel: [34795.112426] WARNING: Unable to automount /backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512
Nov 17 01:22:30 ianict kernel: [34795.117067] WARNING: Unable to automount /backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13/ianzfspool/containers/cls@snapshot-2018-11-16T02:20:13: 512

If I go back to the host and get the host to automount all the snapshots, I then get this error inside the container again:

cls$ ls -la /backup/snap/.zfs/snapshot/*
ls: cannot open directory '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T17:49:35': Too many levels of symbolic links
ls: cannot open directory '/backup/snap/.zfs/snapshot/snapshot-2018-11-15T19:07:11': Too many levels of symbolic links
ls: cannot open directory '/backup/snap/.zfs/snapshot/snapshot-2018-11-16T02:20:13': Too many levels of symbolic links
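
To check which snapshots the host currently has automounted before retrying from the container, the mount output shown earlier can be filtered down to just the snapshot datasets. This is only a sketch, assuming the "<dataset> on <path> type zfs (...)" layout from that output and mount points without spaces; list_mounted_snapshots is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and print the ZFS
# snapshot datasets (device names containing "@") that are mounted.
# Assumes lines of the form "<dataset> on <path> type zfs (<options>)".
list_mounted_snapshots() {
    awk '$2 == "on" && $4 == "type" && $5 == "zfs" && index($1, "@") { print $1 }'
}

# Example use on the host (commented out so the sketch has no side effects):
#   mount | list_mounted_snapshots
```

An empty result would mean the host has already unmounted the snapshots again, which it does automatically a few minutes after the last access.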

If I remove that disk mount and try to mount the rootfs of one of the snapshots, it also fails:

host$ lxc config edit cls
Config parsing error: Add disk devices: Failed to setup device: Unable to mount /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-15T17:49:35/rootfs at /var/snap/lxd/common/lxd/devices/cls/disk.ian.backup-snap: too many levels of symbolic links
Press enter to start the editor again

And now something is very broken because I can’t even delete that disk from the config. I tried removing and changing it using lxc config edit, which refused with “invalid argument”. I broke out of the editor and tried the command line, and that also fails:

host$ lxc config device remove cls ian
Error: invalid argument

I had to stop the container, use lxc to remove the disk item, and restart it. I still can’t add the snapshot rootfs to the container:

host$ lxc config device add cls ian disk source=/var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-15T17:49:35/rootfs path=backup/snap
Error: Add disk devices: Failed to setup device: Unable to mount /var/snap/lxd/common/lxd/storage-pools/ianzfspool/containers/cls/.zfs/snapshot/snapshot-2018-11-15T17:49:35/rootfs at /var/snap/lxd/common/lxd/devices/cls/disk.ian.backup-snap: too many levels of symbolic links

(Ian! D Allen) #4

Maybe I need to talk to the ZFS people about how to do this?


(Stéphane Graber) #5

Yeah, you may need to chat with the ZFS people…
I know there are some people looking into making ZFS datasets available to containers, which will likely help with this quite a bit, but we’re likely still years away from a full solution (like the Solaris zones support) there…


#6

I am able to mount ZFS snapshots:

sudo zfs snapshot z/a@x
sudo mount -t zfs z/a@x /mnt/x

result: /mnt/x is mounted with the snapshot named “x”.
However, now I can’t access the snapshot from the hidden .zfs directory:
cd /z/a/.zfs/snapshot/x
bash: cd: /z/a/.zfs/snapshot/x: Too many levels of symbolic links


(Stéphane Graber) #7

Using sudo mount -t zfs bypasses many of the safety checks that the ZFS tools put in place. The .zfs directory is also a bit of a hack as far as ZFS is concerned and isn’t part of the guaranteed ZFS feature set.