TL;DR - zfs deduplication doesn't seem to work well with Incus 6.8, because Incus sees the disk as "full" when, according to zfs, it is not; it just has a high deduplication factor. Is that fixable in Incus (or, more likely, have I just not configured something in Incus correctly?), or is dedup simply not usable under Incus in this way?
==================== Details ========================
I have been experimenting with zfs deduplication on some of my backup pools, to get more convenient access to recent archived copies of large data blocks that don't change much from backup to backup. I don't use this on my daily driver, but I wanted to see for myself how well it works. I know I could use snapshots, but I wanted to try out dedup as a different strategy for this use-case.
I created a dataset on a separate 2TB disk, something like this:
andrew@lando:~$ sudo zfs create nvme2/dedup
andrew@lando:~$ sudo zfs set dedup=yes nvme2/dedup
andrew@lando:~$ sudo zfs get dedup nvme2/dedup
NAME PROPERTY VALUE SOURCE
nvme2/dedup dedup on local
I set this up as an Incus storage pool called dedup, backed by that zfs dataset. And it works as advertised - at least at the zfs level: I throw a ~300GiB archive (four containers per project) at it, and it shows ~300G used. I then throw another copy at it, in a conveniently archive-named project, and it shows only a little more storage used, since nearly all of the data deduplicates under zfs. The deduplication process is a little slower, but I have substantial server resources in several EPYC CPU systems, so they handle the crunching without breaking a sweat. Here's the pool after four "copies":
andrew@lando:~$ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
nvme 5.45T 1.64T 3.81T - - 14% 30% 1.00x ONLINE -
nvme2 1.81T 302G 1.52T - - 4% 16% 4.17x ONLINE -
Very happy with zfs. It works as I expected.
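For completeness, the Incus pool on top of that dataset was created with something like the following (approximate - I'm reconstructing the exact command from memory, but the pool and dataset names are the real ones):

# Create an Incus storage pool backed by the existing, dedup-enabled dataset
incus storage create dedup zfs source=nvme2/dedup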
Incus is happy in most ways too: it sees my four projects separately, and it correctly shows the containers inside each. Here's one of the projects, called 'Friday':
andrew@lando:~$ incus list -c nbDs4Sle
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
| NAME | STORAGE POOL | DISK USAGE | STATE | IPV4 | SNAPSHOTS | LAST USED AT | PROJECT |
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
| Fastcloud-Friday | dedup | 300.65GiB | STOPPED | | 3 | 1970/01/01 01:00 BST | Friday |
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
| OGSelfHosting-Friday | dedup | 5.37GiB | STOPPED | | 8 | 1970/01/01 01:00 BST | Friday |
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
| SysAdmin-22-04-Friday | dedup | 4.66GiB | STOPPED | | 10 | 1970/01/01 01:00 BST | Friday |
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
| c1-Friday | dedup | 17.00KiB | STOPPED | | 0 | 1970/01/01 01:00 BST | Friday |
+-----------------------+--------------+------------+---------+------+-----------+----------------------+---------+
The "last used" date looks a bit odd, but technically these containers have not been used yet, and it updates as soon as I run one - so that's fine and not my concern here. The containers can be started, stopped and accessed, so I can retrieve or inspect things. That part is all great, as usual.
The zpool itself still has only around 300G actually used according to zfs, so there is LOTS of free usable space on it - enough that even without deduplication the four sets of (nearly identical) data would fit. The dedup factor is high (4.17x, per the above), showing significant deduplication, as expected.
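For anyone wanting to compare the two views, these are the standard zfs/zpool commands I use to check the dataset-level accounting (which I suspect is what Incus is reporting) against the pool-level accounting (where the dedup savings actually show up):

# Dataset-level accounting - as far as I can tell, dedup savings are not reflected here;
# logicalused is the pre-dedup/pre-compression figure
zfs get used,available,logicalused nvme2/dedup

# Pool-level accounting - this is where the 4.17x dedup ratio shows up
zpool list -o name,size,allocated,free,dedupratio nvme2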
But from a storage perspective Incus sees this very differently, and this is where it breaks:
andrew@lando:~$ incus storage info dedup
info:
description: Dedup storage pool
driver: zfs
name: dedup
space used: 1.22TiB
total space: 2.68TiB
used by:
images:
- 1f684cd29012a832262b5ba5f6d72060f4c20975ba571ab78c60331e99daa9db (project "reserve")
- d57ccafc3f99e243aaadcbb7dbeea22af4ecd15e4a5df5957bff5af5837245bc (project "reserve")
instances:
- Fastcloud-Friday (project "Friday")
- Fastcloud-Saturday (project "Saturday")
- Fastcloud-Week-00 (project "Week-00")
- Fastcloud (project "reserve")
- OGSelfHosting-Friday (project "Friday")
- OGSelfHosting-Saturday (project "Saturday")
- OGSelfHosting-Week-00 (project "Week-00")
- OGSelfHosting (project "reserve")
- SysAdmin-22-04-Friday (project "Friday")
- SysAdmin-22-04-Saturday (project "Saturday")
- SysAdmin-22-04-Week-00 (project "Week-00")
- SysAdmin-22-04 (project "reserve")
- c1-Friday (project "Friday")
- c1-Week-00 (project "Week-00")
- c1 (project "reserve")
profiles:
- br0 (project "Saturday")
- br0 (project "Week-00")
- br0 (project "reserve")
- default (project "Friday")
- default (project "Saturday")
- default (project "Week-00")
- default (project "reserve")
Note the space used: Incus is picking up the cumulative size (1.22TiB), and it stops copying when it senses "disk full". Meanwhile zfs says "lots of space left".
Is there a config I am missing?
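In case it helps with diagnosis, I'm happy to post the pool definition and the dataset's quota/reservation settings - these are the (standard incus/zfs) commands I'd use, in case a limit there is what Incus is reacting to:

# Incus-side pool definition
incus storage show dedup

# Any zfs-side limits on the backing dataset
zfs get quota,refquota,reservation,refreservation nvme2/dedup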
I tried this on two different systems (I had time, and it was fun experimenting) - same result. I could maybe work around it by creating a separate zfs dataset for each deduplicated storage pool - basically tricking Incus into thinking it has a series of disks by creating pools dedup1, dedup2, … dedupN, all backed by the same actual disk. That might work (see the sketch below), but it's a little clumsy, so before I do that I wanted to see if there's a better fix - a missing config, or some other operator error as usual?
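Roughly what I mean by that workaround, with dedup1 and dedup2 as placeholder names (dedupN would follow the same pattern):

# One dedup-enabled dataset per Incus pool, all on the same physical disk
sudo zfs create nvme2/dedup1
sudo zfs set dedup=on nvme2/dedup1
incus storage create dedup1 zfs source=nvme2/dedup1

sudo zfs create nvme2/dedup2
sudo zfs set dedup=on nvme2/dedup2
incus storage create dedup2 zfs source=nvme2/dedup2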
Happy Saturday and even Happier New Year!
Andrew