ckruijntjens,
From what I can tell, and I am self-taught (so take this info with a grain of salt), you have a few things going on in ZFS that, unless you have a very specific use case for them, you probably don't want to be doing. All of these instructions pertain to the host.
TL;DR
Remove the layered ext4/ZFS by backing up/migrating your containers, then wipe the ZFS pool drives, re-establish the ZFS pool through LXD, then in ZFS set L2ARC, compression, and scrub frequency, and create a SLOG.
Discussion
Just like we can tune our containers through their config for their use case, we can 'tune' our zpool(s) for theirs. For instance, if you had two pools, one consisting of NVMes and the other of SSDs, you could tune the first for running containers/VMs and the second for storage. By tweaking the zpool settings you can optimize performance according to what is in the containers. You stated in your “PS” that you have allocated four SSDs to one zpool, which is good because ZFS prefers whole disks over partitions. IMO, OSs and storage should be on different pools, but I will try to address things generally.
SSDs Tip
Make sure that you are running trim regularly on your SSDs in order to keep them in tip-top shape. While an SSD is performing a trim, drive performance (and by consequence ZFS/LXD/etc.) will suffer until it completes, so schedule trim for when the drives are least likely to be in use.
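For example, on OpenZFS 0.8 and later you can trim a pool on demand or have it trim continuously; a minimal sketch, assuming a pool named lxdpool (substitute your own pool name):
# one-off trim of the whole pool; run it during off-hours
sudo zpool trim lxdpool
# or let ZFS issue trims continuously as blocks are freed
sudo zpool set autotrim=on lxdpool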
ZFS Housekeeping Tip
With SSDs and your usage you're likely not experiencing slowness from silent data errors, but it's good practice to implement scrubbing, like trim above, as soon as possible to keep your pool healthy. To check and repair your zpool you can safely run:
sudo zpool scrub <name of lxd zpool>
NOTE: A scrub reads every block in the pool, verifies its checksums, and repairs anything it can from redundancy; it doesn't delete anything. Also, you can set zpool scrub to run automatically; check out the zpool-scrub manpage for details.
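If you want scrubs on a schedule, a plain cron entry works; this sketch assumes a pool named lxdpool and runs every Sunday at 02:00 (the file name is hypothetical and the zpool path may differ on your distro):
# /etc/cron.d/zfs-scrub
0 2 * * 0 root /usr/sbin/zpool scrub lxdpool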
ZFS Compression
If you run: zfs get all <name of lxd zpool> | grep compress
you'll see output like the below, which means that you are not running any compression on your zpool, and that is most likely not what you want:
lxdpool compressratio 1.00x
lxdpool compression off default
lxdpool refcompressratio 1.00x -
ZFS is optimized through compression. While at first it might seem like backwards thinking, because compression costs CPU cycles, the reality is that in nearly every x86_64 use case not using compression will actually slow ZFS down: the CPU can compress and decompress data faster than the disks can move the extra uncompressed bytes. A use case for not using compression would be a single-core Atom processor running your network router: if you added some ZFS disks to it, you might skip compression there, because the CPU overhead matters more than disk throughput when the box's real job is routing traffic.
ZFS applies its compression algorithm to data as it moves between the ARC and storage. OpenZFS also offers zstd (e.g. zstd-1), but imho lz4 optimizes the system better. Here is an article discussing this topic better than I can, and the people at OpenZFS seem to support the same conclusion. You're running ZFS v2.0.3-9, so lz4 compression is an option for you. Assuming that you want to use lz4 compression, you can switch it on with:
sudo zfs set compression=lz4 <name of lxd zpool>
Compression only applies to new I/O on the zpool, which means that your older containers will need to be migrated off of the zpool and then back onto it in order to take advantage of lz4 compression. I have more to say on migration in the ZFS Storage section below.
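To confirm the change took effect, you can query the properties afterwards (the pool name lxdpool is just an example):
zfs get compression,compressratio lxdpool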
ZFS Cache
ARC status: HEALTHY
ARC size (current): 27.9 % 17.5 GiB
Target size (adaptive): 28.0 % 17.5 GiB
Min size (hard limit): 6.2 % 3.9 GiB
Max size (high water): 16:1 62.8 GiB
Cache hit ratio: 93.9 % 764.4M
Cache miss ratio: 6.1 % 49.8M
Actual hit ratio (MFU + MRU hits): 93.6 % 761.9M
Data demand efficiency: 99.0 % 509.9M
ZFS is awesome, but it requires a lot of RAM. A general rule, regardless of parity settings, is to make at least 1GB of RAM available for every TB of disk storage. So if you had four 10TB drives for ZFS, regardless of your RAID settings, you would want an additional 40GB of RAM just for ZFS to play with; if your system OS required 4GB of RAM, you would want at least 44GB of RAM in total. Ideally, you're using ECC RAM, which is an added layer of insurance.
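As a quick sanity check, the rule of thumb works out like this (the numbers are just the example above):
pool_tb=40   # four 10TB drives
os_gb=4      # what the OS itself needs
echo "suggested RAM: $((pool_tb + os_gb))GB"   # prints 44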
ZFS ARC
ZFS's cache is called the ARC, and it runs in RAM; when a read misses the ARC it has to come from disk (or from an L2ARC device, if you have one). You are going to want to compare your max ARC size to your available RAM and to the total TB of disk storage on your zpool.
Your cache hit ratio is 93.9%, which means the remaining ~6% of reads are going to disk; ideally you want a ratio as close to 100% as possible. If your RAM is maxed out and you're still running slow, then you may want to set up a SLOG device, though that only helps when you have synchronous writes going through the ZIL.
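You can keep an eye on the raw hit/miss counters yourself from OpenZFS's kernel stats file; a minimal sketch:
# print the ARC's total hit and miss counts
awk '/^(hits|misses) / {print $1, $3}' /proc/spl/kstat/zfs/arcstats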
By default, ZFS sets the ARC Min to 1/32 of system RAM; this means on a 128GB RAM system, about a 4GB minimum has been established. Your system is set to: 3.9GB.
By default, ZFS sets the ARC Max to 1/2 of system RAM; this means on a 128GB or so of RAM, about 64GB has been established. Your system is set to: 62.8GB.
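Before changing anything, you can read the live values (0 means ZFS is using its built-in default):
cat /sys/module/zfs/parameters/zfs_arc_min
cat /sys/module/zfs/parameters/zfs_arc_max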
I suspect that you have 128GB of RAM on your system; if that is the case, then unless your four SSDs are greater than 15TB each, which I doubt, that should be plenty. If you have less than 128GB of RAM, then your ARC is configured incorrectly. The ARC settings are in bytes, so some math is involved. You can safely and temporarily reconfigure your ARC settings using the following, where X is the number of GBs that you want (note the pipe through sudo tee; a plain sudo echo > would be redirected by your unprivileged shell and fail):
ARC Min
sudo echo "$[X * 1024*1024*1024]" >/sys/module/zfs/parameters/zfs_arc_min
ARC Max
sudo echo "$[X * 1024*1024*1024]" >/sys/module/zfs/parameters/zfs_arc_max
Once you establish your optimal ARC settings, you can make the temporary values permanent by running the following (again piping through sudo tee -a, and note that modprobe options take an = sign):
arcmin=$(cat /sys/module/zfs/parameters/zfs_arc_min)
arcmax=$(cat /sys/module/zfs/parameters/zfs_arc_max)
echo "options zfs zfs_arc_min=${arcmin}" | sudo tee -a /etc/modprobe.d/zfs.conf
echo "options zfs zfs_arc_max=${arcmax}" | sudo tee -a /etc/modprobe.d/zfs.conf
If you are running ZFS on your root filesystem you will also have to run: sudo update-initramfs -u -k all
and then reboot your system.
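For reference, the finished /etc/modprobe.d/zfs.conf should look something like this (the byte values here are examples for a 4GB min and 64GB max):
options zfs zfs_arc_min=4294967296
options zfs zfs_arc_max=68719476736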
NOTE: the difference between zfs_arc_max and zfs_arc_min has to be greater than 10% for L2ARC to work correctly.
ZFS L2ARC
You're not running an L2ARC; adding one is recommended from a performance perspective. You can use a more performant SSD or NVMe partition for L2ARC:
sudo zpool add <name of lxd pool> cache <l2arc device>
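As a concrete sketch (pool and device names are hypothetical; substitute your own):
sudo zpool add lxdpool cache /dev/nvme0n1p4
zpool status lxdpool   # the device should now appear under a "cache" section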
ZFS Storage
The ZIL and SLOG sit between the ARC/L2ARC and the zpool(s) on the write path; the ZIL logs synchronous writes before they are committed to the pool.
ZFS SLOG
A SLOG is a dedicated, faster device that holds the ZIL instead of leaving it on the pool disks. Running a SLOG drive increases synchronous write performance through the ZIL. If you have a drive to spare then (note that a SLOG is added as log, not cache):
sudo zpool add <name of lxd pool> log <slog device>
NOTE: you can use a whole drive or just a partition for the SLOG, and you can mirror it across two devices for redundancy.
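For example (pool and device names are hypothetical):
sudo zpool add lxdpool log /dev/nvme0n1p3
or, mirrored for safety:
sudo zpool add lxdpool log mirror /dev/nvme0n1p3 /dev/nvme1n1p3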
Layering Filesystems
The OP posted that he is layering ZFS on top of ext4. You can technically do that, but IMO it is usually unnecessary and it's going to hurt performance tremendously, because, well, it's two separate filesystems handling every file; everything gets all wonky. I cannot think of a use case where running one filesystem on top of another is better from an optimization perspective than just running ext4 or ZFS directly.
If your containers matter to you, your very best bet is to migrate your containers to a whole new temporary device, perhaps the drive that you intend on using for the SLOG. Once you're confident that you have safely backed up your containers, destroy the ZFS pool and wipe the drives completely. Then re-implement ZFS storage through LXD. You can do the LXD storage setup via lxd init
or you can use this tutorial (make sure to click the “ZFS” directions). I only mention that because I was pretty slow on that, lol. Also, I was aided by this info. You can add existing ZFS storage to LXD, but if you are starting over it really is easier to implement it through LXD.
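For reference, the non-interactive version looks something like this (the pool name and device are hypothetical; point source at one of your wiped disks):
lxc storage create lxdpool zfs source=/dev/sdb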
Once you're done rebuilding your zpool(s) through LXD, go back and set your compression to your ideal algorithm, set your ARC values, tell your system to scrub your lxd zpool regularly @weekly (or whatever you want), and set up your SLOG through ZFS. Your zpool ought to be optimized for better performance after these tweaks.
Good Luck!